Advent of Code Benchmarking


For a few years now I’ve been enjoying Eric Wastl’s Advent of Code. For those unaware, each year since 2015 Advent of Code provides a 2-part coding challenge every day from December 1st to December 25th.

In previous years, Chris and I have fairly informally been trying to see who could produce the fastest code (me in PHP, Chris in Python). To assist with this, in the final week of last year we both made our repos run in Docker and produce time output for each day.

This allowed us to run each other’s code locally to compare fairly without needing to install the other’s dev environment, and it made the testing a bit fairer as it was no longer dependent on who had the faster CPU when running their own solution. For the rest of the year this was fine and we carried on as normal. As we got to the end I remarked that it would be fun to have a web interface that automatically dealt with all of this and showed us the scores, but there was obviously no point in doing that once the year was over. Maybe in a future year…

Fast forward to this year. Chris (and ChrisN) and I coded up our Day 1 solutions as normal, and then some other friends started doing it for the first time. I remembered my plans from the previous year and suggested everyone should also docker-ify their repos… and so they agreed.

Now, I’m not one who is lacking in side-projects, but with everyone making their code able to run with a reasonably similar Docker interface, and the first couple of days not yet fully scratching the coding itch, I set about writing what I now call AoCBench.

The idea was simple:

  • Check out (or update) code
  • Build docker container
  • Run each day multiple times and store time output
  • Show fastest time for each person/day in a table.

And the initial version did exactly that. So I fired up an LXC container on one of my servers, set it off running benchmarks, and things were good.
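At its core the whole thing is not much more than a loop around git and docker. Below is a minimal, hypothetical sketch of that loop, not the actual AoCBench code (the real repos produce their own time output rather than being timed from outside, and the paths and run command here are assumptions):

```php
<?php
// Hypothetical sketch of the core benchmark loop, not the actual AoCBench code.
foreach ($participants as $name => $repo) {
    shell_exec('git -C ' . escapeshellarg($repo) . ' pull');
    shell_exec('docker build -t ' . escapeshellarg("aocbench-$name") . ' ' . escapeshellarg($repo));

    for ($day = 1; $day <= 25; $day++) {
        $times = [];
        for ($run = 1; $run <= 10; $run++) {
            $start = microtime(true);
            shell_exec('docker run --rm ' . escapeshellarg("aocbench-$name") . " ./run.sh $day");
            $times[] = microtime(true) - $start; // wall-clock time for this run
        }
        // Store $times for this person/day so the results table can be rendered later.
    }
}
```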

AoCBench Main Page

Pretty quickly the first problem became obvious: it was running everything every time, which really slowed things down as I added more people. So the next stage was to make it only re-run a solution when the code had actually changed.
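One simple way to do that (a hedged sketch, not necessarily how AoCBench actually does it) is to remember the last benchmarked commit for each repo and skip it if HEAD hasn’t moved; `$repoPath` and `$hashFile` here are purely illustrative:

```php
<?php
// Hypothetical sketch: only benchmark a repo when its HEAD commit has changed.
$head = trim(shell_exec('git -C ' . escapeshellarg($repoPath) . ' rev-parse HEAD'));
$last = trim(@file_get_contents($hashFile) ?: '');

if ($head !== $last) {
    // ... run the benchmarks for this repo ...
    file_put_contents($hashFile, $head); // remember what was benchmarked
}
```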

In the initial version, the fastest time from 10 runs was used for the benchmark. But some solutions had wildly varying times and sometimes “got lucky” with a fast run, which unfairly skewed the results. We tried using mean times. Then we tried running the benchmarks more often to see if this resulted in more consistent times. I even tried making it ignore the five slowest times and taking the mean of the rest. None of these really produced a fair result as there was still a lot of variance. Eventually we all agreed that the median time was probably the fairest given the variance in some of the solutions.
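For reference, a minimal sketch of that scoring choice in PHP (illustrative, not the exact AoCBench code): the median simply takes the middle of the sorted run times, so a single lucky or unlucky run no longer decides the result.

```php
<?php
// Hypothetical sketch: median of a set of run times.
function medianTime(array $times): float {
    sort($times);
    $count = count($times);
    $mid = intdiv($count, 2);
    // With an even number of runs, average the two middle values.
    return ($count % 2 === 0) ? ($times[$mid - 1] + $times[$mid]) / 2 : $times[$mid];
}

// One lucky 0.4s run no longer wins the day:
// medianTime([1.2, 0.4, 1.3, 1.1, 1.2]) === 1.2
```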

But this irked me somewhat; there was no obvious reason why some of the solutions should vary so much.

It seemed like it was mostly the PHP solutions that had the variance. Even after switching my container to Alpine (which did result in quite a speed improvement over the non-Alpine one), I was still seeing variance.

I was beginning to wonder if the host node was too busy. It didn’t look too busy, but it seemed like the only explanation. Moving the benchmarking container to a different host node (that was otherwise empty) seemed to confirm this somewhat. After doing that (and moving it back) I looked some more at the host node. I found an errant fail2ban process using 200% CPU, and killing this did make some improvement (though the node has 24 cores, so it shouldn’t really have mattered too much; if it wasn’t for AoCBench I wouldn’t even have noticed it!). But the variance remained, so I just let it be. Somewhat irked, but oh well.

AoCBench Matrix Page

We spent the next few evenings all optimising our solutions some more, vying for the fastest code. To level the playing field some more, I even started feeding everyone the same input, to counter the fact that some inputs were just fundamentally quicker than others. After ensuring that everyone was using the same input, the next step was to ensure that everyone gave the right answer, removing them from the table if they didn’t (this caught out a few “optimisations” that optimised away the right answer by mistake!). I also added support for running each solution against everyone else’s input files and displaying this in a grid, to ensure that everyone’s code worked for all inputs, not just their own (or the normalised input that was being fed to them all).

After all this, the variance problem was still nagging away. One day in particular resulted in huge variances in some solutions (from less than 1s up to more than 15s at times). Something wasn’t right.

I’d already ruled out CPU usage as the culprit because the CPU just wasn’t being taxed. I’d added a sleep delay between each run of the code in case the host node scheduler was penalising us for using a lot of CPU in quick succession. I’d even tried running all the containers from a tmpfs RAM disk in case the delay was caused by reading in the input data, but nothing seemed to help.

With my own solution, I was able to reproduce the variance on my own local machine, so it wasn’t just the chosen host node at fault. But why did it work so much better with no variance on the idle host node? And what made the code for this day so much worse than the surrounding days?

I began to wonder if it was memory related. Neither the host node nor my local machine was particularly starved for memory, but I’d ruled out CPU and disk I/O at this stage. I changed my code for Day 3 to use SplFixedArray and pre-allocated the whole array at startup before interacting with it. And suddenly the variance was all but gone. The new solution was slow as heck comparatively, but there was no more variance!
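In rough terms the change looked something like the sketch below (this isn’t my actual Day 3 code, and the sizes are made up for illustration): instead of letting a normal PHP array grow element by element, the whole grid is allocated up front with SplFixedArray.

```php
<?php
// Hypothetical sketch: pre-allocate the whole grid before doing any real work.
$width  = 1000; // illustrative dimensions, not the real puzzle sizes
$height = 1000;

$grid = new SplFixedArray($height);
for ($y = 0; $y < $height; $y++) {
    // Every row is allocated at a fixed size up front rather than grown later.
    $grid[$y] = new SplFixedArray($width);
}

// ... solve the puzzle against $grid ...
```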

So now that I knew what the problem was (presumably the memory on the busy host node and my local machine was quite fragmented), I wondered how to fix it. Pre-allocating memory in code wasn’t a real option with PHP (as the Day 3 experiment showed, it cost far too much speed), and I also couldn’t pre-reserve a block of memory within each Docker container before running the solutions. But I could change the benchmarking container from running as an LXC container to a full KVM VM. That would give me a reserved block of memory that wasn’t being fragmented by the host node and the other containers. Would this solve the problem?

Yes. It did. The extreme variance went away entirely, without needing any changes to any code. I re-ran all the benchmarks for every person on every day and the levels of variance were within an acceptable range for each one.

AoCBench Podium Mode

The next major change came about after Chris got annoyed by Python (even under PyPy) being unable to compete with the speed improvements that PHP 7 has made, and switched to using Nim. Suddenly most of the competition was gone. The compiled code wins every time. Every. Time. (Obviously.) So Podium Mode was added to allow for competing for the top 3 spaces on each day.

Finally, after a lot of confusion around implementations for Day 7, and how some inputs behaved differently than others in different code, the input matrix code was extended to allow feeding custom inputs to solutions, to weed out false assumptions and see how they respond to input that isn’t quite so carefully crafted.

If anyone wants to follow along, I have AoCBench running here, and I have also documented here the requirements for making a repo AoCBench compatible. The code for AoCBench is fully open source under the MIT License and available on GitHub.

Happy Advent of Code all!

DNS Hosting - Part 3: Putting it all together


In my previous posts I discussed the history leading up to, and the eventual rewrite of, my DNS hosting solution. So this post will (finally) talk briefly about how it all runs in production on MyDNSHost.

Shortly before the whole rewrite I’d found myself playing around a bit with Docker for another project, so I decided early on that I was going to make use of Docker for the main bulk of the setup. This meant I didn’t need to worry about incompatibilities between different parts of the stack that needed different versions of things, and I could update different bits at different times.

The system is split up into a number of containers (and could probably be split up into more).

To start with, I had the following containers:

  • API Container - Deals with all the backend interactions
  • WEB Container - Runs the main frontend that people see. Interacts with the API Container to actually do anything.
  • DB Container - Holds all the data used by the API
  • BIND Container - Runs an instance of bind to handle DNSSEC signing and distributing DNS Zones to the public-facing servers.
  • CRON Container - This container runs a bunch of maintenance scripts to keep things tidy and initiate DNSSEC signing etc.

The tasks in the CRON container could probably be split up more, but for now I’m ok with having them in 1 container.

This worked well; however, I found some annoyances when redeploying the API or WEB containers, as doing so logged me out of the frontend. So another container was soon added:

  • MEMCACHED Container - Stores session data from the API and FRONTEND containers to allow for horizontal scaling and restarting of containers (a rough sketch of the PHP side is shown below).
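Pointing PHP’s sessions at that container is about all there is to it; here is a minimal sketch, assuming the php-memcached extension and a container reachable as `memcached` (the names are illustrative, not necessarily what MyDNSHost uses):

```php
<?php
// Hypothetical sketch: store sessions in the shared MEMCACHED container so
// WEB/API instances can be restarted or scaled without logging users out.
ini_set('session.save_handler', 'memcached');
ini_set('session.save_path', 'memcached:11211');

session_start();
$_SESSION['user'] = $_SESSION['user'] ?? 'example'; // session now survives container restarts
```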

In the first instance, the API Container was also responsible for interactions with the BIND container. It would generate zone files on-demand when users made changes, and then poke BIND to load them. However this was eventually split out further, and another 3 containers were added:

  • GEARMAN Container - Runs an instance of Gearman for the API container to push jobs to.
  • REDIS Container - Holds the job data for GEARMAN.
  • WORKER Container - Runs a bunch of worker scripts to do the tasks the API Container previously did for generating/updating zone files and pushing to BIND.

Splitting these tasks out into the WORKER container made the frontend feel faster as it no longer needed to wait for things to happen and could just fire the jobs off into GEARMAN and let it worry about them. I also get some extra logging from this as the scripts can be a lot more verbose. In addition, if a worker can’t handle a job it can be rescheduled to try again and the workers can (in theory) be scaled out horizontally a bit more if needed.

There were some initial challenges with this - the main one being around how the database interaction worked, as the workers would fail after periods of inactivity, get automatically restarted, and then work immediately. This turned out to be mainly due to how I’d pulled the code out of the API into the workers. Whereas the scripts in the API run using the traditional method where the script gets called, does its thing (including setup), then dies, the WORKER scripts were long-term processes, so the DB connections were eventually timing out and the code was not designed to handle this.
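The shape of the fix is roughly the sketch below, a hedged illustration using PHP’s Gearman and mysqli extensions rather than the real MyDNSHost worker code (the hostnames, credentials and job name are all made up): because the worker never exits between jobs, the DB connection has to be checked, and reopened if necessary, before each job.

```php
<?php
// Hypothetical sketch of a long-running worker that survives idle DB timeouts.
$worker = new GearmanWorker();
$worker->addServer('gearman', 4730); // assumed container name/port

$db = null;
function getDB(): mysqli {
    global $db;
    // Reconnect if the connection was dropped during a quiet period.
    if ($db === null || !@$db->ping()) {
        $db = new mysqli('db', 'user', 'password', 'mydnshost'); // illustrative credentials
    }
    return $db;
}

$worker->addFunction('rebuild_zone', function (GearmanJob $job) {
    $domain = $job->workload();
    $db = getDB();
    // ... regenerate the zone file for $domain and poke BIND to reload it ...
});

while ($worker->work());
```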

Finally, more recently I added statistical information about domains and servers, which required another 2 containers:

  • INFLUXDB Container - Runs InfluxDB to store time-series data and provide a nice way to query it for graphing.
  • CHRONOGRAF Container - Runs Chronograf to allow me to easily pull out data from INFLUXDB for testing.

That’s quite a few containers to manage. To actually manage running them, I primarily make use of docker-compose (to set up the various networks, volumes, containers, etc.). This works well for the most part, but there are a few limitations around how it deals with restarting containers, which cause fairly substantial downtime when upgrading WEB or API. To get around this I wrote a small bit of orchestration scripting that uses docker-compose to scale the WEB and API containers up to 2 (letting docker-compose do the actual creation of the new container), then manually kills off the older container and scales back down to 1. This seems to behave well.
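The rolling restart boils down to something like the sketch below. This is a hypothetical illustration of the approach rather than the actual orchestration script; the service names are assumptions and the exact docker-compose flags depend on the compose version in use.

```php
<?php
// Hypothetical sketch: replace a service's container without taking the site down.
function rollService(string $service): void {
    // Remember the container currently serving traffic.
    $old = trim(shell_exec('docker-compose ps -q ' . escapeshellarg($service)));

    // Let docker-compose create a second instance alongside the old one...
    passthru('docker-compose up -d --no-recreate --scale ' . escapeshellarg($service) . '=2');

    // ...then retire the old container and settle back to a single instance.
    passthru('docker rm -f ' . escapeshellarg($old));
    passthru('docker-compose up -d --scale ' . escapeshellarg($service) . '=1');
}

rollService('web');
rollService('api');
```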

So with all these containers hanging around, I needed a way to expose them to the web and automate the process of ensuring they had SSL certificates (using Let’s Encrypt). Fortunately Chris Smith has already solved this problem for the most part, in a way that worked for what I needed. In a blog post he describes a set of docker containers he created that automatically run nginx to proxy requests to other internal containers and obtain appropriate SSL certificates using DNS challenges. For the most part, all that was required was running this and adding some labels to my existing containers, and that was that…

Except this didn’t quite work initially, as I couldn’t do the required DNS challenges unless I hosted my DNS somewhere else. So I ended up adding support for HTTP challenges, and then I was able to use this without needing to host DNS elsewhere. (And in return Chris has added support for using MyDNSHost for the DNS challenges, so it’s a win-win.) My orchestration script also handles setting up and running the automatic nginx proxy containers.

This brings me to the public-facing DNS servers. These are currently the only bit not running in Docker (though they could be). They run on some standard Ubuntu 16.04 VMs with a small setup script that installs BIND, plus an extra service that automatically adds/removes zones based on a special “catalog zone”, as the versions of BIND currently in use don’t yet support catalog zones natively. The transferring of zones between the frontend and the public servers is done using standard DNS NOTIFY and AXFR. DNSSEC is handled by the backend server pre-signing the zones before sending them to the public servers, which never see the signing keys.

By splitting jobs up this way, it should in theory be possible in future (if needed) to move away from BIND to alternatives (such as PowerDNS).

As well as the public service that I’m running, all of the code involved (all the containers and all the orchestration) is available on GitHub under the MIT License. Documentation is a little light (read: pretty non-existent), but it’s all there for anyone else to use/improve/etc.