Halite II post-mortem

Introduction

It was some hard work, but after 3 months I won the Halite 2 AI competition with a decisive lead, and I am very happy with that! It's definitely one of the toughest competitions I have participated in, but also a very fun one. Since Halite is a game with such a deep focus on strategy, for this post-mortem I will describe in details the inner workings of my bot, all the strategies it takes as well as the intent behind them. If there are other topics you would like to know more about, feel free to contact me, I might write more here.

If you are unfamiliar with the game rules, it will probably be easier to understand the rest of this article by reading the quick summary on their website first.


High level overview

To get a better idea of all the parts involved, here is what happens at a high level on every game turn:


Survival strategy

Because of the ranking implementation in 4p games, a player with no ships at the end of the game will always place after a player with ships remaining. So it is absolutely vital to abandon at the right time to maximize survivability and therefore placement, while still maximizing use of every ship before that.

All conditions must be met to trigger survival:

When in this mode, each ship iterates through surrounding points, picks the one that gives the largest closest enemy distance and navigates towards it. That generally gives good enough behavior. Survival mode is a terminal state, it does not bother checking if it could colonize again and make a comeback. I am not sure if it would make that much of a difference.


Early game strategy

Because the early game is so different and unforgiving, I isolated this part of the game in its own code path. The big two differences over the rest are choosing the initial planet(s) to settle on, and take into account potential rushes from the enemy, who attack straight away with their initial ships hoping for a quick win by catching bots off guard.

Colony planning

The choice of initial planet(s) is done once at game start, and is different between 2p games and 4p. The one for 2p is very old but I could not improve or find a replacement in time that seemed to perform as well. The idea is to control as much space as possible early on by choosing a planet with other planets nearby, to boost early production and be the first player to dominate the center, therefore breaking the symmetry, and typically winning the game.

4p games require another strategy, as the goal is quite different. A good spot is still desirable, but definitely not at the cost of being nearby enemy players. Otherwise, the risk of being rushed is greatly increased, and even with no rush, the risk of becoming the enemy's target of choice is also higher. Having two enemies nearby early on is almost guaranteed to turn into a bad game.

Early game execution

The early game plan aims to establish the initial colonies as fast as possible while also keeping early invaders at bay. It also tries to break a stalemate in case no one colonizes by engaging combat after a while, and rushes if it owns no planet but the enemy does.

The bot will stick to early game mode until one of these conditions is met:

One special note: ships not trying to colonize are always forced to stick together to hope for better combat outcome, as early game is too dangerous to go alone. However this later ended up as a shortcoming against bots splitting up to try to colonize a planet far away. I did not fix it in time as other issues were more pressing.


Game state masking

Another 4p only improvement, only enabled once the early game is over, this mask aims to force the bot to focus on one player instead of multiple at once. If a player can be eliminated more quickly, then its planets can be taken over faster and therefore create a bigger snowball effect.

When no target player is assigned, the enemy having a ship closest to an allied planet is considered as the target. The vertical enemy has a -50 bonus in order to pick it more often. The reasoning for that is most often other bots will engage vertically, so it is important to stick to vertical to stay 1v1 unless horizontal is much closer. Once a target player is assigned, it will stick until that player no longer controls any planet.

Because attack is almost entirely defined in function of docked ships (more on this later), all docked ships not owned by the bot or the target are removed from the map, unless an allied ship is less than 20 units away from it, or an allied planet less than 30 units away from it. This way, the offense is almost always focused on a single player.

On top of that, any unowned planet that is less than 40 units away from an enemy that is not the target is considered as an invalid target for colonization. This way it seeks to delay getting attention from another opponent while it conquers the target.

Despite the potential blindness and poor targeting I have seen sometimes, every time I tried removing this masking, my bot performed worse overall.


Strategic pass

This pass is looking to do high level decision making by assigning one of the following roles to every allied ship:

Either looping through targets to find a ship to assign to, or looping through ships to find a target, have their flaws as they will give poor solutions in some scenarios. It is important to minimize the sum of all distances between ships and their targets in order to respond faster overall. I thought of using an evolutionary search for that, but decided to start with a simpler algorithm at first and see if it needs replacement later. It survived in almost intact shape since.

The way the targeting, scoring and execution of a role is done is specific to each, so more details follow.

Colonize role

For the most part, the target computation of that role is straightforward. For a given ship, it just tries to find the closest planet with docking spots available. However, the twist is it will exclude planets that are considered unsafe to colonize. At some point I noticed my bot was wasting a lot of ships by docking them only to get destroyed a few turns later, so I tried to figure out how to prevent that.

The main idea to determine planet safety is making sure all nearby enemy ships can be dealt with in the future, by checking if for each one of them, an allied ship in proximity can reach that point in time.

The score given to that role is the distance to the planet's surface. A -40 bonus is added if the ship can already dock, to not get distracted when it's already almost colonized anyway.

The assignment of this role fails if the available docking spots are already reserved by other ships, or if the planet becomes unsafe due to assignment of other roles.

Attack/defense role

For good or bad reasons, I decided a ship cannot be a candidate for both attack and defense at once. I thought it would be easier to balance priorities between colonizing vs fighting and attacking vs defending rather than everything at once, but cannot say whether it was better or not in the end.

For defense, only enemy ships near an allied docked ship are considered. For attack, for a long while all enemy ships were considered. At some point I experimented with being exclusively focused on doing economic damage and found a remarkable improvement in behavior and ranking. Ever since, only enemy docked ships are considered for attack.

The target selection for offense or defense is:

The score given to that role is the raw distance to the target, with a -30 bonus given for defense.

The assignment of this role fails if more than one allied ship is assigned for defending against the same enemy ship. There used to be a limit for attacking the same enemy ship, but together with only considering enemy docked ships, removing it made the attack a lot more focused and therefore successful.

Defending a ship consists of positioning between the enemy ship and its closest allied docked ship. The preferred distance along this vector is 15 units ahead of the enemy ship, or 5 units ahead of the allied ship if it's closer.

Spawn prediction

An important detail of the strategic pass is it does a small alteration of the game state while computing roles. Given the current state, it looks at ships that will be produced by allied planets in the next 10 turns and adds them to the state using the position closest to map center regardless of ship proximity. Those ships will have a distance penalty assigned corresponding to 8 * numTurns. This means that every distance calculation in which these ships are involved get that penalty added as a distance.

The implications of this alteration are considerable. It frees up a number of current ships to perform better roles, leaving some for future ships instead. For instance, it might choose not to bother defending and attack instead, or not send a ship to colonize a planet that will produce a ship to colonize with faster.

No prediction was done for the enemy as no behavior or ranking improvement could be found, sometimes even being detrimental.


Tactical pass

Once every ship has been assigned a role, the tactical pass aims to use better moves than the one provided by the strategic pass, to still accomplish those roles but with better positioning.

Proper ship micromanagement makes a huge difference in Halite. Superior ship numbers in combat snowballs hard, avoiding useless fights lets ships focus on more important tasks and evading defenses can let ships harass better, etc. Given the combat mechanics it seemed really difficult to come up with a decision making structure similar to the other parts. Maybe some sort of tree search could help figure out the best movements, but the very large number of possible moves even with pruning, especially given the quantities of ships involved, did not seem feasible in any straightforward fashion.

So I started looking at a simpler version of the problem: if enemy movements are known in advance, how to position ships to take the most advantage of it? This was more or less my first iteration:

This gives a baseline for the current plan given by the strategic pass, which can then be iterated on to find moves giving a better score. I used hill climbing to refine the current plan:

With this iterative search the bot's behavior was already massively improved. Ships would avoid losing battles by moving out of the way, or commit together to ensure the battle result would get even better. Emergent behavior, such as low health ships attempting to ram into an enemy, would happen as long the net health result is better. Afterwards it was all about adding various improvements to this basic idea.

Enemy prediction

So far it's been assumed the enemy ship always move straight towards the closest target. It's obviously not something that happens very often, so moves can be made that are very weak to other enemy responses. I experimented quite a lot with various probability models and ended up settling with this one. A set of 19 global enemy responses is precomputed, and the evaluation of each resulting simulation is summed up to make a single score. In other words, each tested move results in 19 different simulations and making sure the score is better on average.

The composition of these 19 responses is:

For enemy ship movement, I also disabled ship-ship collisions if the two ships involved belong to the enemy, in order to spend less time solving it and overestimate the enemy's ability to group up together to get a more conservative prediction. Ship-planet collisions however are always maintained.

Grouping

Unlike many other top bots, I did not have any specific code forcing ships to move closer together, as it created weird or undesirable behaviors. However I still used a clustering method to help the hill climbing get out of some traps. One problem I noticed is when a smaller mass of ships is engaged in combat with a bigger one, the hill climbing fails to move them individually out of danger, because the evaluation function only sees the short term impact. It thinks keeping them in the action will minimize the losses, failing to see the overall battle is lost.

So in addition to trying random actions on individual ships, it also tries applying a random action to all ships of a group and examines the result in the same way. It works rather well even if the grouping method is very naive:

State evaluation

There are many issues with the basic evaluation to address to make the most out of the tactical search.

One big problem with just summing up allied health and comparing against enemy health is it will not mind trading ships for no reason, as long as there is no net loss. Since numeric superiority is so effective, it is important to prioritize ship preservation and only trade when it makes sense. Health multipliers are used to express these intents:

Another problem is ships entirely stop caring about the strategic plan if combat happens. The original target position must be taken into account to be as close to it as possible, but not at the expense of bad combat. So 1 unit of distance to the original target position is worth -1 point. One of the most noticeable impacts is ships attacking enemy docked ships now often move and dodge enemies, like rain on an umbrella. If the ship is moving to colonize a planet, each distance unit is worth -5 points instead, so it does not get too distracted by combat.

One last improvement to the evaluation is to use a non-linear function for health values, separating it into different tiers.

Health curve

static const int healthTiers[4] = { 160, 160 + 128, 160 + 128 + 96, 160 + 128 + 96 + 64 };
float health = (float)(ship.health + healthTiers[ship.health / 64]);
health /= (255 + healthTiers[3]);
health *= 255;

Overall, the goal is to give more importance to the number of single hits a ship can take before destruction, as its health greatly affect how it can be used. For instance, it's very good to trade a full hit between a max hp ship and an enemy ship almost destroyed. With the same reasoning, it's also good to distribute more evenly the health between different ships, as each will survive longer and therefore output more damage in a fight.

Also, a different evaluation function is used when the survival mode is engaged, where only allied ships health is considered, to maximize survivability.

Optimization and other improvements

Because of the sheer volume of simulations necessary to obtain a good tactical pass, I spent some time profiling and optimizing the code to give good results in most situations. The biggest improvement by far was to limit the O(n^2) collision detection, which easily eat up the majority of time spent, by precomputing all pairs of entities that can potentially interact within the next turn, and loop through these pairs instead of (all ships x all ships).

Another easy improvement was to remove any ships not within a distance of 25 units from an enemy, as they cannot possibly have an impact over scoring.

Execution speed can still be an issue towards the mid-game of 4p games when the number of ships gets in the hundreds. Before trying random actions, moves from 0-360 degrees at full thrust for each ship are tried first: one loop for 45 degrees increments starting at 0, and another for 45 degrees increments starting at 22. That way bailing out due to timeout still gives good enough results.

Instead of selecting a random ship on each iteration, it is better to iterate through all ships and groups evenly to avoid the RNG being sometimes whimsical.

Instead of selecting a random thrust, it is also better to weigh it towards the circle area it corresponds to, that way less time is spent on thrust 1 and more on thrust 7. Thrust 0 can be tested once at the beginning. In other words:

int thrust = rnd.getInt(49);
if (thrust < 1) thrust = 1;
else if (thrust < 4) thrust = 2;
else if (thrust < 9) thrust = 3;
else if (thrust < 16) thrust = 4;
else if (thrust < 25) thrust = 5;
else if (thrust < 36) thrust = 6;
else thrust = 7;
int angle = rnd.getInt(360);
return move(thrust, angle);

Other thoughts

Navigation was a hot topic in chat but a good solution is rather trivial. The given method in starter kits is good enough, my only improvements were to force 1 degree increments, use the direction vector out of integer angle and thrust to have the real movement, and check alternately between angles x and -x to favor the shortest deviation. Collision detection is also trivial to solve by storing the direction vector every time a move is issued to a ship, and doing a segment-segment intersection check between the desired direction vector with the one of every other ship. If it fails, then keep iterating on the angles. There was no point complexifying this further as the tactical pass would mess up most of it anyway.

I have tried many times to come up with some kind of search for the strategic pass, to no avail. The main difficulty I encountered was to find a cheap but accurate model for predicting ship combat outcomes. Analyzing most of the resulting long term plans would come up with a lot of problems and over/underestimating combat situations. I am still a little annoyed about not making anything work in that area.

Local testing against previous versions was helpful in the beginning, but the exercise became increasingly inaccurate and pointless over time. I often had versions performing much better online while still performing poorly against the previous version. I am still hoping to eventually find a way to easily maintain enough diversity in the local pool to give more accurate results, as waiting for online results is tedious and slow.


Workflow and tools

There is tremendous value to be had in having a good development workflow and tools, I think this is often way too underestimated in competitions like these. Even with 3 months available, maximizing the value of the time spent pays off.

First, if you are not already using an IDE and a source control, please take the time to look into one, it's a ridiculously good payoff for the time invested. For Windows users I personally recommend using the free version of Visual Studio for C++ and C#, along with Git for source control.

Despite the excellent Chlorine third party viewer generously provided by fohristiwhirl, I chose early on to write my own replay viewer in order to have better control and integration with my development flow. Some of its features include:

Halite Battlestation

I also wrote a command-line tool to help with other tasks, such as:

Example of a progression chart:

Bot version chart

Example of a performance breakdown:

2p stats: 204-23 (227 games) 90-10%
mellendo v57     26-1        (1-26)
mlomb v36        23-2        (2-23)
Gadziferoth v67  22-2        (2-22)
FakePsyho v152   19-4        (4-19)
shummie v568     11-2        (2-11)
shummie v566     11-1        (1-11)
shummie v574     6-4         (4-6)
ipost v16        10-0        (0-10)
zxqfl v73        7-1         (1-7)
ewirkerman v78   6-0         (0-6)
4p stats: 179-47-39-21 (286 games) 63-16-14-7%
FakePsyho v152   62-26-14-10 (37-34-19-22)
Gadziferoth v67  61-18-12-3  (10-33-36-15)
mlomb v36        51-17-12-10 (3-22-29-36)
mellendo v57     41-9-12-9   (11-23-15-22)
zxqfl v73        34-12-7-5   (4-13-19-22)
shummie v568     34-6-5-4    (6-14-15-14)
shummie v574     16-7-3-4    (4-9-9-8)
ipost v16        16-8-4-1    (2-5-9-13)
ewirkerman v78   21-2-3-1    (1-8-10-8)
shummie v567     13-3-7-1    (5-3-7-9)

Halite feedback

The game overall was very well designed, striking a fine balance between complexity and depth. Part of me wishes maybe one or two more viable mechanics were available, to give a bit more variety in the possible strategies, but it's a tough call. Less is more and all that, but still.

There were still some significant issues in the design however. Rushing prevents too many games from ever starting, which brings a lot of volatility and annoying edge cases to deal with, not to mention it's a lot easier to write a good rush bot than defending well against one. I think the best way to solve this issue would be to make sure player spawns are always far enough. At the very least, the platform should take ties as a valid game result instead of randomly assigning a winner. This would have allowed a viable 2p rush defense strategy by stalling out and not engaging in order to tie and minimize the ranking loss for both players, not to mention eliminate more volatility in rankings.

I also have no idea why 2p maps are symmetric but not 4p ones, as it creates a lot of frustrating imbalance to deal with. Too often, one bot would start with one planet nearby and the other one with three, and given equivalent skill, the game was over before it even began. For any competitive game designer reading this, please always make sure every player starts on even footing no matter what.

The Halite platform is just fantastic overall, and having it open sourced to everyone is incredibly satisfying. I must underline how invaluable it was to have the backend API and replay format available, making so many tools and analysis possible. My improvement suggestion would be to add an API to play out a single match against chosen opponents on a specific map seed, in order to have a much easier time ironing out edge cases against specific bots and strategies (namely rush defense).


Conclusion

I cannot thank the Halite staff and Two Sigma enough, along with volunteer janzert, for the amazing work they have put into this competition. They have been incredibly active and responsive to answer everyone's questions and solve the occasional problems, and the overall quality of the competition just blows me away. The Halite community has also been very cool and friendly, it was a lot of fun interacting through the chat, so thank you all for being a part of that.

I am still on the fence about publishing the source code of my bot. I believe I have already explained most of all the interesting stuff in this article, and the rest is mainly a personal mess of a codebase that should definitely not be used as an example, even for prototyping. If I ever change my mind though I will update with a link here.

Thanks for reading this article through to the end. If you have any feedback, questions or comments, do not hesitate to reach me in the Halite's Discord or via PM on Discord (reCurse#5226), I will answer as best as I can!

– reCurse