Halite II post-mortem

Introduction

It was hard work, but after 3 months I won the Halite 2 AI competition with a decisive lead, and I am very happy with that! It's definitely one of the toughest competitions I have participated in, but also a very fun one. Since Halite is a game with such a deep focus on strategy, for this post-mortem I will describe in detail the inner workings of my bot, all the strategies it uses as well as the intent behind them. If there are other topics you would like to know more about, feel free to contact me; I might write more here.

If you are unfamiliar with the game rules, it will probably be easier to understand the rest of this article by reading the quick summary on their website first.

High level overview

To get a better idea of all the parts involved, here is what happens at a high level on every game turn:

(4p only) Survival strategy override: when the game is considered lost, try to achieve the highest rank by surviving until the end.
Early game strategy override, from the start until a planet is colonized and safe from invaders.
(4p only) Alter the game state to focus more on a single target.
If no strategy override is present, perform a generic strategic pass, assigning a high level goal and desired target position for every allied ship.
Perform a tactical pass, possibly modifying the action of every allied ship that could encounter an enemy, while trying to keep it as close to the original target as possible.

Survival strategy

Because of the ranking implementation in 4p games, a player with no ships at the end of the game will always place after a player with ships remaining. So it is absolutely vital to abandon at the right time to maximize survivability and therefore placement, while still maximizing use of every ship before that.

All conditions must be met to trigger survival:

More than 2 players still having ships.
More than 60 turns have elapsed, to avoid early aggression being seen as apocalyptic.
For all owned planets, the number of enemy undocked ships exceeds the number of allied undocked ships by more than 3 in a radius of 50.

When in this mode, each ship iterates through surrounding points, picks the one with the largest distance to its closest enemy and navigates towards it. That generally gives good enough behavior. Survival mode is a terminal state: it does not bother checking if it could colonize again and make a comeback. I am not sure if it would make that much of a difference.
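As a rough illustration of that point search (not the bot's actual code), here is a minimal sketch. The sampling radius, the number of sampled points and all names are assumptions made for the example, and map bounds and planets are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Vec2 { double x, y; };

// Distance from p to the nearest enemy (infinity when there are none).
double closestEnemyDistance(const Vec2& p, const std::vector<Vec2>& enemies) {
    double best = std::numeric_limits<double>::infinity();
    for (const Vec2& e : enemies)
        best = std::min(best, std::hypot(p.x - e.x, p.y - e.y));
    return best;
}

// Samples points on a circle around the ship and returns the one whose
// nearest enemy is farthest away. Staying in place is the fallback.
// `radius` and `samples` are illustrative values, not taken from the article.
Vec2 pickSurvivalTarget(const Vec2& ship, const std::vector<Vec2>& enemies,
                        double radius = 7.0, int samples = 36) {
    assert(samples > 0 && radius > 0.0);
    const double kPi = std::acos(-1.0);
    Vec2 best = ship;
    double bestDist = closestEnemyDistance(ship, enemies);
    for (int i = 0; i < samples; ++i) {
        const double a = 2.0 * kPi * i / samples;
        const Vec2 candidate{ship.x + radius * std::cos(a),
                             ship.y + radius * std::sin(a)};
        const double d = closestEnemyDistance(candidate, enemies);
        if (d > bestDist) {
            bestDist = d;
            best = candidate;
        }
    }
    return best;
}
```

With a single enemy to the right of the ship, the selected point is the one directly to the left, which matches the fleeing behavior described above.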

Early game strategy

Because the early game is so different and unforgiving, I isolated this part of the game in its own code path. The two big differences from the rest of the game are choosing the initial planet(s) to settle on, and taking into account potential rushes from the enemy, who attacks straight away with their initial ships hoping for a quick win by catching bots off guard.

Colony planning

The choice of initial planet(s) is done once at game start, and is different between 2p games and 4p. The one for 2p is very old but I could not improve or find a replacement in time that seemed to perform as well. The idea is to control as much space as possible early on by choosing a planet with other planets nearby, to boost early production and be the first player to dominate the center, therefore breaking the symmetry, and typically winning the game.

Use the ship closest to the map center as reference point.
Score each planet as follows and pick the one with the largest score:
- 6 points per docking spot.
- -20 points if planet has 2 docking spots.
- -1 point per unit of distance between the reference point and the planet center.
- For every planet nearby with center distance < 75: 3 points per docking spot, linearly scaled over distance to favor proximity.
Send all ships to that planet, even if only 2 spots are available, to better deal with rushes.
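The scoring above could be sketched as follows. This is an illustration rather than the bot's code: the `Planet` struct and function names are made up, and the "linearly scaled over distance" term is assumed to be a factor of (1 - d / 75), which is only one possible reading.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Planet { double x, y; int dockingSpots; };

// Score of planets[candidate] given the reference point (refX, refY),
// i.e. the position of the allied ship closest to the map center.
double colonyScore2p(const std::vector<Planet>& planets, std::size_t candidate,
                     double refX, double refY) {
    assert(candidate < planets.size());
    const Planet& p = planets[candidate];
    double score = 6.0 * p.dockingSpots;            // 6 points per docking spot
    if (p.dockingSpots == 2) score -= 20.0;         // penalty for tiny planets
    score -= std::hypot(p.x - refX, p.y - refY);    // -1 per unit of distance
    for (std::size_t i = 0; i < planets.size(); ++i) {
        if (i == candidate) continue;
        const double d = std::hypot(p.x - planets[i].x, p.y - planets[i].y);
        if (d < 75.0)                               // nearby planets add value
            score += 3.0 * planets[i].dockingSpots * (1.0 - d / 75.0);  // assumed falloff
    }
    return score;
}

// Index of the planet with the largest score.
std::size_t bestColony2p(const std::vector<Planet>& planets, double refX, double refY) {
    assert(!planets.empty());
    std::size_t best = 0;
    double bestScore = colonyScore2p(planets, 0, refX, refY);
    for (std::size_t i = 1; i < planets.size(); ++i) {
        const double s = colonyScore2p(planets, i, refX, refY);
        if (s > bestScore) { bestScore = s; best = i; }
    }
    return best;
}
```

For example, a 3-spot planet 10 units away with a 4-spot neighbor 30 units from it scores 18 - 10 + 7.2 = 15.2 under these assumptions.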

4p games require another strategy, as the goal is quite different. A good spot is still desirable, but definitely not at the cost of being near enemy players. Otherwise, the risk of being rushed is greatly increased, and even with no rush, the risk of becoming the enemy's target of choice is also higher. Having two enemies nearby early on is almost guaranteed to turn into a bad game.

Iterate over the 5 closest planets for every allied ship to create 125 combinations:
- Discard combinations where every ship goes to a different planet.
- Discard combinations where all ships go to a planet with 2 docking spots.
- For each ship of the combination:
  - -5 points per turn needed (ceilf(dist / 7)) for that ship to reach the target planet.
  - Find the closest distance between an enemy and the target planet center. 1 point per turn needed to reach that distance.
Keep the combination with the largest score and send ships to those planets.
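Here is a hedged sketch of that enumeration for 3 starting ships. The types and names are hypothetical, distances are measured to planet centers for simplicity, and the second discard rule is read as "all three ships go to the same planet and it has 2 docking spots", which is an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

struct Vec2 { double x, y; };
struct Planet { Vec2 center; int dockingSpots; };

double dist(Vec2 a, Vec2 b) { return std::hypot(a.x - b.x, a.y - b.y); }
double turnsToCover(double d) { return std::ceil(d / 7.0); }  // 7 units per turn

// Indices of the k planets closest to `from`.
std::vector<int> closestPlanets(const std::vector<Planet>& planets, Vec2 from, std::size_t k) {
    std::vector<int> idx(planets.size());
    std::iota(idx.begin(), idx.end(), 0);
    std::sort(idx.begin(), idx.end(), [&](int a, int b) {
        return dist(planets[a].center, from) < dist(planets[b].center, from);
    });
    if (idx.size() > k) idx.resize(k);
    return idx;
}

// Returns the target planet index for each ship, or -1 entries if every
// combination was discarded.
std::array<int, 3> planColonies4p(const std::vector<Planet>& planets,
                                  const std::array<Vec2, 3>& ships,
                                  const std::vector<Vec2>& enemies) {
    assert(!planets.empty() && !enemies.empty());
    std::array<std::vector<int>, 3> options;
    for (int s = 0; s < 3; ++s) options[s] = closestPlanets(planets, ships[s], 5);

    std::array<int, 3> best{{-1, -1, -1}};
    double bestScore = -std::numeric_limits<double>::infinity();
    for (int a : options[0]) for (int b : options[1]) for (int c : options[2]) {
        const std::array<int, 3> combo{{a, b, c}};
        const bool allDifferent = a != b && b != c && a != c;
        const bool allOnTwoSpot = a == b && b == c && planets[a].dockingSpots == 2;
        if (allDifferent || allOnTwoSpot) continue;
        double score = 0.0;
        for (int s = 0; s < 3; ++s) {
            const Vec2 target = planets[combo[s]].center;
            score -= 5.0 * turnsToCover(dist(ships[s], target));   // travel cost
            double closestEnemy = std::numeric_limits<double>::infinity();
            for (const Vec2& e : enemies) closestEnemy = std::min(closestEnemy, dist(e, target));
            score += turnsToCover(closestEnemy);                   // safety margin
        }
        if (score > bestScore) { bestScore = score; best = combo; }
    }
    return best;
}
```

The 5:1 ratio between the two terms means travel time dominates, and enemy distance mostly acts as a tie-breaker between similarly close planets.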

Early game execution

The early game plan aims to establish the initial colonies as fast as possible while also keeping early invaders at bay. It also tries to break a stalemate in case no one colonizes by engaging combat after a while, and rushes if it owns no planet but the enemy does.

All ships are forced to navigate to their planet assigned by the colony planning until destination is reached (canDock() == true).
Once at destination, count undocked allied ships in a radius of 25 (allyCount), and undocked enemy ships in a radius of 85 (enemyCount). Only consider the vertical enemy in 4p games.
If allyCount > enemyCount, docking is authorized.
Otherwise, hold position if the closest enemy has a distance over min(80, 20 + round) and does not have a docked ship. Holding gives the ship some patience while docking is forbidden, so it can dock immediately if the danger turns out to be gone a bit later.
The remaining ships are then directed towards the first match of the following:
- The closest allied docked ship for defense, if present.
- The closest enemy docked ship for rush, if present.
- The closest enemy ship for combat.
Docked ships will undock if there are more enemy undocked ships than allied undocked ships in a surrounding distance of 65 units.

The bot will stick to early game mode until one of these conditions is met:

At least 5 allied ships exist on the map.
No undocked enemy ship is found within 65 units of an allied docked ship.
No orders were given to any ship.

One special note: ships not trying to colonize are always forced to stick together to hope for better combat outcome, as early game is too dangerous to go alone. However this later ended up as a shortcoming against bots splitting up to try to colonize a planet far away. I did not fix it in time as other issues were more pressing.

Game state masking

Another 4p only improvement, only enabled once the early game is over, this mask aims to force the bot to focus on one player instead of multiple at once. If a player can be eliminated more quickly, then its planets can be taken over faster and therefore create a bigger snowball effect.

When no target player is assigned, the enemy having a ship closest to an allied planet is considered as the target. The vertical enemy has a -50 bonus in order to pick it more often. The reasoning is that other bots will most often engage vertically, so it is important to stick to vertical to stay 1v1 unless horizontal is much closer. Once a target player is assigned, it sticks until that player no longer controls any planet.

Because attack is almost entirely defined in terms of docked ships (more on this later), all docked ships not owned by the bot or the target are removed from the map, unless an allied ship is less than 20 units away from it, or an allied planet less than 30 units away from it. This way, the offense is almost always focused on a single player.

On top of that, any unowned planet that is less than 40 units away from an enemy that is not the target is considered as an invalid target for colonization. This way it seeks to delay getting attention from another opponent while it conquers the target.

Despite the potential blindness and poor targeting I have seen sometimes, every time I tried removing this masking, my bot performed worse overall.

Strategic pass

This pass is looking to do high level decision making by assigning one of the following roles to every allied ship:

Colonize a planet.
Defend an allied docked ship.
Attack an enemy docked ship.

Looping through targets to find a ship to assign to, or looping through ships to find a target, both have their flaws, as they give poor solutions in some scenarios. It is important to minimize the sum of all distances between ships and their targets in order to respond faster overall. I thought of using an evolutionary search for that, but decided to start with a simpler algorithm and see if it needed replacement later. It has survived almost intact since.

For all allied ships that are undocked:
- For each possible role, compute the best target for that role and give it a score.
- Push that role-score pair in a list.
Sort the list of all role-score pairs by their score (lower score = better).
Loop through the pairs of the sorted list:
- If the ship of that pair already has a role, ignore and proceed to the next pair.
- Try to assign this role to the ship. If it fails (more on this later), compute a different target for that role and use its new score to insert it at the appropriate slot in the list.
- If assignment is successful, keep the desired target position for the tactical pass.
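The loop above maps naturally onto a priority queue, where a failed assignment pushes a recomputed candidate back in. The sketch below (C++17) is an assumption about how this could look, not the actual implementation: `Candidate`, `tryAssign` and `retarget` are hypothetical names, and the two callbacks stand in for the role-specific logic described in the following sections.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <queue>
#include <vector>

enum class Role { Colonize, Defend, Attack };

struct Candidate {
    double score;  // lower is better
    int ship;
    Role role;
    int target;
};

struct WorseScore {
    bool operator()(const Candidate& a, const Candidate& b) const { return a.score > b.score; }
};

// tryAssign: returns true if the role can be taken (e.g. a docking spot is free).
// retarget:  returns the next-best candidate for the same ship and role, or
//            nothing when no other target exists. It must eventually return
//            nothing, otherwise the loop does not terminate.
std::vector<std::optional<Candidate>> assignRoles(
        int shipCount,
        const std::vector<Candidate>& initial,
        const std::function<bool(const Candidate&)>& tryAssign,
        const std::function<std::optional<Candidate>(const Candidate&)>& retarget) {
    std::priority_queue<Candidate, std::vector<Candidate>, WorseScore> queue;
    for (const Candidate& c : initial) queue.push(c);

    std::vector<std::optional<Candidate>> assigned(shipCount);
    while (!queue.empty()) {
        const Candidate c = queue.top();
        queue.pop();
        assert(c.ship >= 0 && c.ship < shipCount);
        if (assigned[c.ship]) continue;          // ship already has a role
        if (tryAssign(c)) {
            assigned[c.ship] = c;                // keep target for the tactical pass
        } else if (auto next = retarget(c)) {
            queue.push(*next);                   // re-inserted at its new score
        }
    }
    return assigned;
}
```

Because the best pair overall is always processed first, a ship that is far from everything cannot steal a target from a ship standing next to it, which is the main weakness of looping over ships or targets in a fixed order.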

The way the targeting, scoring and execution of a role is done is specific to each, so more details follow.

Colonize role

For the most part, the target computation of that role is straightforward. For a given ship, it just tries to find the closest planet with docking spots available. However, the twist is it will exclude planets that are considered unsafe to colonize. At some point I noticed my bot was wasting a lot of ships by docking them only to get destroyed a few turns later, so I tried to figure out how to prevent that.

The main idea to determine planet safety is making sure all nearby enemy ships can be dealt with in the future, by checking if for each one of them, an allied ship in proximity can reach that point in time.

Loop through all enemy ships that are less than 45 units away from the docking spot:
- Compute the enemy distance to the docking spot.
- Find an allied undocked ship with a distance to the docking spot that is closest to the enemy's distance, but no more than 7 units above.
  - If no such ship is found, the planet is unsafe. Bail out.
- This allied ship is now excluded from further iterations of the loop.
If the loop was completed without bailing out, the planet is safe.
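A minimal sketch of this safety check, assuming positions are plain 2D points and using hypothetical names. The 45 and 7 unit values come from the steps above and are exposed as parameters only for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Vec2 { double x, y; };

double dist(Vec2 a, Vec2 b) { return std::hypot(a.x - b.x, a.y - b.y); }

// True when every enemy within `threatRadius` of the docking spot can be
// matched with a distinct allied undocked ship that is at most `maxLag`
// units farther from the spot than that enemy.
bool isPlanetSafe(Vec2 dockSpot,
                  const std::vector<Vec2>& enemies,
                  const std::vector<Vec2>& alliesUndocked,
                  double threatRadius = 45.0, double maxLag = 7.0) {
    assert(threatRadius > 0.0 && maxLag >= 0.0);
    std::vector<bool> used(alliesUndocked.size(), false);
    for (const Vec2& enemy : enemies) {
        const double enemyDist = dist(enemy, dockSpot);
        if (enemyDist >= threatRadius) continue;         // too far to matter
        int best = -1;
        double bestGap = std::numeric_limits<double>::infinity();
        for (std::size_t i = 0; i < alliesUndocked.size(); ++i) {
            if (used[i]) continue;                       // already matched
            const double allyDist = dist(alliesUndocked[i], dockSpot);
            if (allyDist > enemyDist + maxLag) continue; // would arrive too late
            const double gap = std::fabs(allyDist - enemyDist);
            if (gap < bestGap) { bestGap = gap; best = static_cast<int>(i); }
        }
        if (best < 0) return false;                      // nobody can cover this enemy
        used[best] = true;
    }
    return true;
}
```

Picking the ally whose distance is closest to the enemy's, rather than simply the nearest ally, keeps the nearest ships available for the nearest threats.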

The score given to that role is the distance to the planet's surface. A -40 bonus is added if the ship can already dock, to not get distracted when it's already almost colonized anyway.

The assignment of this role fails if the available docking spots are already reserved by other ships, or if the planet becomes unsafe due to assignment of other roles.

Attack/defense role

For good or bad reasons, I decided a ship cannot be a candidate for both attack and defense at once. I thought it would be easier to balance priorities between colonizing vs fighting and attacking vs defending rather than everything at once, but cannot say whether it was better or not in the end.

For defense, only enemy ships near an allied docked ship are considered. For attack, for a long while all enemy ships were considered. At some point I experimented with being exclusively focused on doing economic damage and found a remarkable improvement in behavior and ranking. Ever since, only enemy docked ships are considered for attack.

The target selection for offense or defense is:

Looping through all enemy ships, find the one with closest distance according to:
- If it is undocked and less than 25 units away from its closest allied docked ship, this is for a defense role:
  - Use the distance to the allied ship for comparison.
  - Subtract (20 - min(20, distToClosestEnemy)) * 2 from it. This is done to favor allied docked units which are nearest to danger.
- If it is docked, this is for an attack role:
  - Use the distance to the enemy docked ship for comparison.
  - Count the number of undocked enemy ships in a distance of 30 around it. If the game is 2p, clamp that number to 4 max. In 4p, 2 max.
  - Subtract 30 * (2 - numDefenders) from the distance. This is done to harass docked ships with less defense. The effect is more limited in 4p because faster elimination is better for winning the whole game.
- Otherwise, skip this ship.
The target selected determines whether the role will be attack or defense.
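The two adjustments can be written as small helper functions. This is a sketch with made-up names; in particular, `distToClosestEnemy` is assumed here to mean the distance between the allied docked ship and the enemy threatening it.

```cpp
#include <algorithm>
#include <cassert>

// Comparison distance for a defense target: distance from the candidate ship
// to the threatened allied docked ship, reduced by up to 40 when the enemy is
// right next to that docked ship.
double defenseCost(double distToAlliedDocked, double distToClosestEnemy) {
    assert(distToAlliedDocked >= 0.0 && distToClosestEnemy >= 0.0);
    return distToAlliedDocked - (20.0 - std::min(20.0, distToClosestEnemy)) * 2.0;
}

// Comparison distance for an attack target: distance to the enemy docked ship,
// reduced when it has fewer than 2 undocked defenders within 30 units and
// increased when it has more. The defender count is clamped to 4 in 2p games
// and to 2 in 4p games.
double attackCost(double distToEnemyDocked, int undockedDefendersNearby, bool fourPlayers) {
    assert(distToEnemyDocked >= 0.0 && undockedDefendersNearby >= 0);
    const int defenders = std::min(undockedDefendersNearby, fourPlayers ? 2 : 4);
    return distToEnemyDocked - 30.0 * (2 - defenders);
}
```

For example, an undefended docked ship 50 units away compares as -10, while the same ship with 6 defenders compares as 110 in 2p but only 50 in 4p, since the clamp at 2 removes any penalty there.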

The score given to that role is the raw distance to the target, with a -30 bonus given for defense.

The assignment of this role fails if more than one allied ship is assigned for defending against the same enemy ship. There used to be a limit for attacking the same enemy ship, but together with only considering enemy docked ships, removing it made the attack a lot more focused and therefore successful.

Defending a ship consists of positioning between the enemy ship and its closest allied docked ship. The preferred distance along this vector is 15 units ahead of the enemy ship, or 5 units ahead of the allied ship if it's closer.
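One way to compute that position is sketched below. The reading assumed here is that the defender stands 15 units from the enemy along the line towards the docked ship, but never closer than 5 units to the docked ship; the function name and parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec2 { double x, y; };

// Point on the segment from the enemy to the allied docked ship where a
// defender should stand.
Vec2 defensePosition(Vec2 enemy, Vec2 alliedDocked,
                     double leadFromEnemy = 15.0, double gapToAlly = 5.0) {
    assert(leadFromEnemy >= 0.0 && gapToAlly >= 0.0);
    const double dx = alliedDocked.x - enemy.x;
    const double dy = alliedDocked.y - enemy.y;
    const double d = std::hypot(dx, dy);
    if (d < 1e-9) return alliedDocked;  // degenerate case: same position
    // Whichever of the two preferred points is closer to the enemy (assumed reading).
    const double fromEnemy = std::min(leadFromEnemy, std::max(0.0, d - gapToAlly));
    return Vec2{enemy.x + dx / d * fromEnemy, enemy.y + dy / d * fromEnemy};
}
```

With the enemy at (0, 0) and the docked ship at (40, 0) this gives (15, 0); if the docked ship is only 12 units away, it gives (7, 0).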

Spawn prediction

An important detail of the strategic pass is it does a small alteration of the game state while computing roles. Given the current state, it looks at ships that will be produced by allied planets in the next 10 turns and adds them to the state using the position closest to map center regardless of ship proximity. Those ships will have a distance penalty assigned corresponding to 8 * numTurns. This means that every distance calculation in which these ships are involved gets that penalty added as a distance.

The implications of this alteration are considerable. It frees up a number of current ships to perform better roles, leaving some for future ships instead. For instance, it might choose not to bother defending and attack instead, or not send a ship to colonize a planet that will produce a ship to colonize with faster.

No prediction was done for the enemy as no behavior or ranking improvement could be found, sometimes even being detrimental.

Tactical pass

Once every ship has been assigned a role, the tactical pass aims to use better moves than the one provided by the strategic pass, to still accomplish those roles but with better positioning.

Proper ship micromanagement makes a huge difference in Halite. Superior ship numbers in combat snowball hard, avoiding useless fights lets ships focus on more important tasks, evading defenses lets ships harass better, etc. Given the combat mechanics it seemed really difficult to come up with a decision making structure similar to the other parts. Maybe some sort of tree search could help figure out the best movements, but the very large number of possible moves even with pruning, especially given the quantities of ships involved, did not seem feasible in any straightforward fashion.

So I started looking at a simpler version of the problem: if enemy movements are known in advance, how to position ships to take the most advantage of it? This was more or less my first iteration:

Assume all enemy ships move towards their closest enemy ship at full thrust.
Simulate all ship movements and attacks, including the instantaneous attacks of the next turn.
Evaluate the score of this simulation by summing all allied ships health, and subtracting the sum of all enemy ships health.

This gives a baseline for the current plan given by the strategic pass, which can then be iterated on to find moves giving a better score. I used hill climbing to refine the current plan:

Take an allied undocked ship at random.
Replace its action with a move of random angle and thrust.
Perform the simulation again and evaluate the new score. If it's better than the old one, replace that ship's action with the random one.
Repeat until timeout.
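The refinement loop can be sketched generically as below. This is not the bot's code: the simulation is abstracted behind a hypothetical `evaluate` callback, and a fixed iteration count with a seeded generator replaces the timeout so the example stays deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

struct Move { int thrust; int angleDeg; };

// Hill climbing over per-ship moves. `evaluate` is assumed to run the
// simulation for a full plan and return its score (higher is better).
std::vector<Move> hillClimb(std::vector<Move> plan,
                            const std::function<double(const std::vector<Move>&)>& evaluate,
                            int iterations, unsigned seed) {
    assert(!plan.empty() && iterations >= 0);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pickShip(0, plan.size() - 1);
    std::uniform_int_distribution<int> pickThrust(0, 7);
    std::uniform_int_distribution<int> pickAngle(0, 359);

    double bestScore = evaluate(plan);           // baseline from the strategic pass
    for (int i = 0; i < iterations; ++i) {
        const std::size_t ship = pickShip(rng);
        const Move previous = plan[ship];
        const int thrust = pickThrust(rng);
        const int angle = pickAngle(rng);
        plan[ship] = Move{thrust, angle};        // try a random replacement
        const double score = evaluate(plan);
        if (score > bestScore) {
            bestScore = score;                   // keep the improvement
        } else {
            plan[ship] = previous;               // otherwise revert
        }
    }
    return plan;
}
```

Since a change is only kept when it strictly improves the score, the returned plan never evaluates worse than the plan it started from, whenever the loop is interrupted.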

With this iterative search the bot's behavior was already massively improved. Ships would avoid losing battles by moving out of the way, or commit together to ensure the battle result would get even better. Emergent behavior, such as low health ships attempting to ram into an enemy, would happen as long as the net health result is better. Afterwards it was all about adding various improvements to this basic idea.

Enemy prediction

So far it's been assumed that enemy ships always move straight towards the closest target. That obviously does not happen very often, so moves can be chosen that are very weak to other enemy responses. I experimented quite a lot with various probability models and ended up settling on this one. A set of 19 global enemy responses is precomputed, and the evaluation of each resulting simulation is summed up to make a single score. In other words, each tested move results in 19 different simulations, to make sure the score is better on average.

The composition of these 19 responses is:

1 for staying still.
2 for moving towards the closest enemy, at thrusts 2 and 5.
16 for 22 degrees iterative rotations, starting from the initial direction of the closest enemy, at thrust 7.
The direction to closest enemy with thrust 7 has a weight of 2, the others 1.
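The response set and the combined score could be represented like this. The struct and function names are assumptions, angles are stored as offsets from the direction to the closest enemy, and the 22 degree step is taken literally from the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct EnemyResponse {
    int thrust;
    int angleOffsetDeg;  // rotation relative to the direction of the closest enemy
    int weight;
};

// Builds the 19 global enemy responses described above.
std::vector<EnemyResponse> buildEnemyResponses() {
    std::vector<EnemyResponse> r;
    r.push_back(EnemyResponse{0, 0, 1});   // staying still
    r.push_back(EnemyResponse{2, 0, 1});   // towards the closest enemy, thrust 2
    r.push_back(EnemyResponse{5, 0, 1});   // towards the closest enemy, thrust 5
    for (int i = 0; i < 16; ++i)           // full-thrust fan, 22 degrees apart
        r.push_back(EnemyResponse{7, i * 22, i == 0 ? 2 : 1});
    assert(r.size() == 19);
    return r;
}

// Combines one simulation score per response into a single weighted sum.
double combinedScore(const std::vector<EnemyResponse>& responses,
                     const std::vector<double>& scores) {
    assert(responses.size() == scores.size());
    double total = 0.0;
    for (std::size_t i = 0; i < responses.size(); ++i)
        total += responses[i].weight * scores[i];
    return total;
}
```

The total weight is 20, so a candidate move that only does well against a straight charge contributes at most a tenth of the combined score.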

For enemy ship movement, I also disabled ship-ship collisions if the two ships involved belong to the enemy, in order to spend less time solving it and overestimate the enemy's ability to group up together to get a more conservative prediction. Ship-planet collisions however are always maintained.

Grouping

Unlike many other top bots, I did not have any specific code forcing ships to move closer together, as it created weird or undesirable behaviors. However I still used a clustering method to help the hill climbing get out of some traps. One problem I noticed is when a smaller mass of ships is engaged in combat with a bigger one, the hill climbing fails to move them individually out of danger, because the evaluation function only sees the short term impact. It thinks keeping them in the action will minimize the losses, failing to see the overall battle is lost.

So in addition to trying random actions on individual ships, it also tries applying a random action to all ships of a group and examines the result in the same way. It works rather well even if the grouping method is very naive:

For each allied ship not filtered out:
- Iterate through all groups. If it is less than 3 units away from one ship of that group, add it to that group, up to a maximum of 8.
- If no group is found, create a new one with this ship.
Eliminate all groups with only 1 ship.
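A short sketch of this naive clustering, with hypothetical names. The maximum of 8 is assumed to be a cap on ships per group, and the filtering of ships is left to the caller.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Vec2 { double x, y; };

// Greedy single-pass grouping: a ship joins the first group that has a member
// closer than `linkDist` and still has room. Returns groups of ship indices,
// without single-ship groups.
std::vector<std::vector<std::size_t>> groupShips(const std::vector<Vec2>& ships,
                                                 double linkDist = 3.0,
                                                 std::size_t maxSize = 8) {
    assert(linkDist > 0.0 && maxSize >= 2);
    std::vector<std::vector<std::size_t>> groups;
    for (std::size_t s = 0; s < ships.size(); ++s) {
        bool placed = false;
        for (std::vector<std::size_t>& group : groups) {
            if (group.size() >= maxSize) continue;       // group is full
            for (std::size_t member : group) {
                const double d = std::hypot(ships[s].x - ships[member].x,
                                            ships[s].y - ships[member].y);
                if (d < linkDist) { placed = true; break; }
            }
            if (placed) { group.push_back(s); break; }
        }
        if (!placed) groups.push_back(std::vector<std::size_t>{s});
    }
    std::vector<std::vector<std::size_t>> result;
    for (std::vector<std::size_t>& group : groups)
        if (group.size() > 1) result.push_back(std::move(group));
    return result;
}
```

The result depends on iteration order and two groups are never merged, which is part of what makes the method naive, but it is cheap enough to run every turn.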

State evaluation

There are many issues with the basic evaluation to address to make the most out of the tactical search.

One big problem with just summing up allied health and comparing against enemy health is it will not mind trading ships for no reason, as long as there is no net loss. Since numeric superiority is so effective, it is important to prioritize ship preservation and only trade when it makes sense. Health multipliers are used to express these intents:

In general, try looking for at least 2:1 trades, which is achieved by multiplying ally health by 1.9. Ships then become a lot more careful, which also allows building up a massive army to attack with afterwards.
Allied docked ships get a 2.0 multiplier and a raw 64 bonus, to make ships on defense stick to docked ships, as distributing enemy damage will make them survive longer, and possibly produce more ships.
Allied ships on attack role get a 3.9 multiplier, because they cause a lot more indirect damage (e.g. locking enemy ships on defense) just by being there, so survival is key.
When on defense against nearby enemy ships, enemies get a 1.5 multiplier to look for more aggressive trades and eliminate threats quicker.

Another problem is ships entirely stop caring about the strategic plan if combat happens. The original target position must be taken into account to be as close to it as possible, but not at the expense of bad combat. So 1 unit of distance to the original target position is worth -1 point. One of the most noticeable impacts is ships attacking enemy docked ships now often move and dodge enemies, like rain on an umbrella. If the ship is moving to colonize a planet, each distance unit is worth -5 points instead, so it does not get too distracted by combat.

One last improvement to the evaluation is to use a non-linear function for health values, separating it into different tiers.

Health curve

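// Cumulative bonus for each tier of 64 health points the ship is in.
static const int healthTiers[4] = { 160, 160 + 128, 160 + 128 + 96, 160 + 128 + 96 + 64 };
// Raw health plus the tier bonus (ship.health is assumed to be in 0..255).
float health = (float)(ship.health + healthTiers[ship.health / 64]);
// Rescale so that a full-health ship still evaluates to 255.
health /= (255 + healthTiers[3]);
health *= 255;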

Overall, the goal is to give more importance to the number of single hits a ship can take before destruction, as its health greatly affects how it can be used. For instance, it's very good to trade a full hit between a max hp ship and an enemy ship almost destroyed. With the same reasoning, it's also good to distribute more evenly the health between different ships, as each will survive longer and therefore output more damage in a fight.

Also, a different evaluation function is used when the survival mode is engaged, where only allied ships health is considered, to maximize survivability.

Optimization and other improvements

Because of the sheer volume of simulations necessary to obtain a good tactical pass, I spent some time profiling and optimizing the code to give good results in most situations. The biggest improvement by far was to limit the O(n^2) collision detection, which easily eats up the majority of time spent, by precomputing all pairs of entities that can potentially interact within the next turn, and looping through these pairs instead of (all ships x all ships).
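The pair precomputation might look like the following sketch. The `Entity` struct and the `maxSpeed` and `extraRange` parameters are assumptions for illustration; the idea is simply that two entities can only interact this turn if the distance between them is within what both can travel plus their radii plus the interaction range.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

struct Entity {
    double x, y, radius;
    bool mobile;  // ships move, planets do not
};

// Computed once per turn: all index pairs (i, j), i < j, that could touch or
// shoot each other within one turn. Every simulation of that turn then only
// iterates over this list.
std::vector<std::pair<std::size_t, std::size_t>> candidatePairs(
        const std::vector<Entity>& entities, double maxSpeed, double extraRange) {
    assert(maxSpeed >= 0.0 && extraRange >= 0.0);
    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < entities.size(); ++i) {
        for (std::size_t j = i + 1; j < entities.size(); ++j) {
            const Entity& a = entities[i];
            const Entity& b = entities[j];
            const double travel = (a.mobile ? maxSpeed : 0.0) + (b.mobile ? maxSpeed : 0.0);
            const double limit = travel + a.radius + b.radius + extraRange;
            if (std::hypot(a.x - b.x, a.y - b.y) <= limit)
                pairs.emplace_back(i, j);
        }
    }
    return pairs;
}
```

This pass is still quadratic, but it runs once per turn, while the thousands of simulations that follow only touch the short list of surviving pairs.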

Another easy improvement was to remove any ships not within a distance of 25 units from an enemy, as they cannot possibly have an impact over scoring.

Execution speed can still be an issue towards the mid-game of 4p games when the number of ships gets in the hundreds. Before trying random actions, moves from 0-360 degrees at full thrust for each ship are tried first: one loop for 45 degrees increments starting at 0, and another for 45 degrees increments starting at 22. That way bailing out due to timeout still gives good enough results.

Instead of selecting a random ship on each iteration, it is better to iterate through all ships and groups evenly to avoid the RNG being sometimes whimsical.

Instead of selecting a random thrust, it is also better to weigh it towards the circle area it corresponds to, that way less time is spent on thrust 1 and more on thrust 7. Thrust 0 can be tested once at the beginning. In other words:

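// rnd.getInt(49) is assumed to return a uniform integer in 0..48 (49 = 7 * 7).
// Thrust t gets t*t - (t-1)*(t-1) = 2t - 1 of those values, proportional to
// the area of the ring between radius t-1 and radius t.
int thrust = rnd.getInt(49);
if (thrust < 1) thrust = 1;
else if (thrust < 4) thrust = 2;
else if (thrust < 9) thrust = 3;
else if (thrust < 16) thrust = 4;
else if (thrust < 25) thrust = 5;
else if (thrust < 36) thrust = 6;
else thrust = 7;
int angle = rnd.getInt(360);
return move(thrust, angle);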

Other thoughts

Navigation was a hot topic in chat but a good solution is rather trivial. The method given in the starter kits is good enough; my only improvements were to force 1 degree increments, compute the direction vector from the integer angle and thrust to get the real movement, and alternate between angles x and -x to favor the smallest deviation. Collision detection is also trivial to solve by storing the direction vector every time a move is issued to a ship, and doing a segment-segment intersection check between the desired direction vector and the one of every other ship. If it fails, then keep iterating on the angles. There was no point complicating this further as the tactical pass would mess up most of it anyway.
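The alternating angle search can be sketched as follows (C++17). The collision test is abstracted behind a hypothetical `isFree` callback, which in the approach described above would perform the segment-segment checks; names and the `maxDeviationDeg` parameter are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Returns the heading closest to `desiredAngleDeg` for which `isFree` reports
// no collision, trying +1, -1, +2, -2, ... degrees of deviation in turn.
// Returns nothing if every tested heading is blocked.
std::optional<int> findHeading(int desiredAngleDeg,
                               const std::function<bool(int)>& isFree,
                               int maxDeviationDeg = 180) {
    assert(maxDeviationDeg >= 0 && maxDeviationDeg <= 180);
    auto wrap = [](int a) { return ((a % 360) + 360) % 360; };  // into 0..359
    const int desired = wrap(desiredAngleDeg);
    if (isFree(desired)) return desired;
    for (int dev = 1; dev <= maxDeviationDeg; ++dev) {
        const int plus = wrap(desired + dev);
        if (isFree(plus)) return plus;
        const int minus = wrap(desired - dev);
        if (isFree(minus)) return minus;
    }
    return std::nullopt;
}
```

For instance, if every heading within 10 degrees of 0 is blocked, a ship that wants to go at 0 degrees ends up at 11 degrees, the first free heading on either side.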

I have tried many times to come up with some kind of search for the strategic pass, to no avail. The main difficulty I encountered was to find a cheap but accurate model for predicting ship combat outcomes. Analyzing most of the resulting long term plans would come up with a lot of problems and over/underestimating combat situations. I am still a little annoyed about not making anything work in that area.

Local testing against previous versions was helpful in the beginning, but the exercise became increasingly inaccurate and pointless over time. I often had versions performing much better online while still performing poorly against the previous version. I am still hoping to eventually find a way to easily maintain enough diversity in the local pool to give more accurate results, as waiting for online results is tedious and slow.

Workflow and tools

There is tremendous value in having a good development workflow and tools; I think this is far too often underestimated in competitions like these. Even with 3 months available, maximizing the value of the time spent pays off.

First, if you are not already using an IDE and source control, please take the time to look into them; it's a ridiculously good payoff for the time invested. For Windows users I personally recommend using the free version of Visual Studio for C++ and C#, along with Git for source control.

Despite the excellent Chlorine third party viewer generously provided by fohristiwhirl, I chose early on to write my own replay viewer in order to have better control and integration with my development flow. Some of its features include:

Automatic and frame-by-frame playback with hotkeys. No interpolation was used to stay true to the game state.
Intuitive zoom and panning with detailed display of planets and ships, showing id, health and attack events.
Breakdown of ships for each player, including current production and total ships produced.
Launch my bot with the current game state. This feature is most useful to step through and debug the bot in an IDE to figure out what went wrong. It can also be used to display which move would have been played in the current frame, to compare with another version. Tactical mode can also be disabled to display the initial intent instead.
Live automated display of recent games and leaderboard.
Open a replay in the viewer by double-clicking on a file or recent game.
Launch a local game between my current bot and any of its previous versions.
Contains a customized build of the Halite executable allowing live display of the game while it is being played.

I also wrote a command-line tool to help with other tasks, such as:

Perform a benchmark between current bot and a previous bot version for comparison in 2p and 4p games. (only used early in the competition)
Download all replays of a given bot's version.
Decompress a replay to json format for debugging purposes.
Convert a replay to a format compatible with bot input parsing, in order to validate the accuracy of the game simulation.
Build a string with the mu progression of a submission, ready to be pasted in Google Sheets, in order to better visualize the performance of a bot.
Statistics breakdown of the performance of a bot version online. By far the #1 way to evaluate the impact of changes!
Analyze the final ranking of 4p games vs the tiebreaker. I used this as a metric to determine the performance of the survival strategy.
Analyze the number of docked ships destroyed shortly after docking. I used this as a metric to determine the performance of the planet safety algorithm.

Example of a progression chart:

Bot version chart

Example of a performance breakdown:

2p stats: 204-23 (227 games) 90-10%
mellendo v57     26-1        (1-26)
mlomb v36        23-2        (2-23)
Gadziferoth v67  22-2        (2-22)
FakePsyho v152   19-4        (4-19)
shummie v568     11-2        (2-11)
shummie v566     11-1        (1-11)
shummie v574     6-4         (4-6)
ipost v16        10-0        (0-10)
zxqfl v73        7-1         (1-7)
ewirkerman v78   6-0         (0-6)
4p stats: 179-47-39-21 (286 games) 63-16-14-7%
FakePsyho v152   62-26-14-10 (37-34-19-22)
Gadziferoth v67  61-18-12-3  (10-33-36-15)
mlomb v36        51-17-12-10 (3-22-29-36)
mellendo v57     41-9-12-9   (11-23-15-22)
zxqfl v73        34-12-7-5   (4-13-19-22)
shummie v568     34-6-5-4    (6-14-15-14)
shummie v574     16-7-3-4    (4-9-9-8)
ipost v16        16-8-4-1    (2-5-9-13)
ewirkerman v78   21-2-3-1    (1-8-10-8)
shummie v567     13-3-7-1    (5-3-7-9)

Halite feedback

The game overall was very well designed, striking a fine balance between complexity and depth. Part of me wishes maybe one or two more viable mechanics were available, to give a bit more variety in the possible strategies, but it's a tough call. Less is more and all that, but still.

There were still some significant issues in the design however. Rushing prevents too many games from ever starting, which brings a lot of volatility and annoying edge cases to deal with, not to mention it's a lot easier to write a good rush bot than defending well against one. I think the best way to solve this issue would be to make sure player spawns are always far enough. At the very least, the platform should take ties as a valid game result instead of randomly assigning a winner. This would have allowed a viable 2p rush defense strategy by stalling out and not engaging in order to tie and minimize the ranking loss for both players, not to mention eliminate more volatility in rankings.

I also have no idea why 2p maps are symmetric but not 4p ones, as it creates a lot of frustrating imbalance to deal with. Too often, one bot would start with one planet nearby and the other one with three, and given equivalent skill, the game was over before it even began. For any competitive game designer reading this, please always make sure every player starts on even footing no matter what.

The Halite platform is just fantastic overall, and having it open sourced to everyone is incredibly satisfying. I must underline how invaluable it was to have the backend API and replay format available, making so many tools and analysis possible. My improvement suggestion would be to add an API to play out a single match against chosen opponents on a specific map seed, in order to have a much easier time ironing out edge cases against specific bots and strategies (namely rush defense).

Conclusion

I cannot thank the Halite staff and Two Sigma enough, along with volunteer janzert, for the amazing work they have put into this competition. They have been incredibly active and responsive to answer everyone's questions and solve the occasional problems, and the overall quality of the competition just blows me away. The Halite community has also been very cool and friendly, it was a lot of fun interacting through the chat, so thank you all for being a part of that.

I am still on the fence about publishing the source code of my bot. I believe I have already explained most of all the interesting stuff in this article, and the rest is mainly a personal mess of a codebase that should definitely not be used as an example, even for prototyping. If I ever change my mind though I will update with a link here.

Thanks for reading this article through to the end. If you have any feedback, questions or comments, do not hesitate to reach me on the Halite Discord or via PM on Discord (reCurse#5226); I will answer as best as I can!

– reCurse