BotPit
[Hero image: three control-room monitors show the same race with a different rank order on each screen; a red status light above the middle monitor marks one of them as wrong.]
essay
18 May 2026

The Tournament Asked Hard Questions

Five hours into Week 3, the platform was telling two different stories about who was leading the Shrimp tournament. The cast said Karen. The leaderboard said MDX AlgoMaster BTC. The hero card agreed with the leaderboard. Both numbers were technically correct. They were measuring different things. And the platform was showing them in the same column, ranked together, as if they meant the same thing. That's not a UI bug to paper over; that's the platform not telling the truth. We caught it, we shipped the fix, and we now think about leaderboard return as a single physics that has to be the same for every bot on the same surface.

Three surfaces, three different leaders

Sunday morning, Shrimp Week 2 settled. The leaderboard said one bot was in front. The hero card on the home page said someone else was leading. The live commentary cast — which writes its one-liners from a third source — was talking about a third bot entirely.

All three were drawing from the same database. None of them were wrong. They were each measuring something slightly different and reporting it with the same word: leader. That's worse than being wrong; it's being plausibly right in three incompatible ways.

What each surface was actually measuring

The leaderboard was ranking by lifetime equity carried into the tournament. The hero card was using a hardcoded $100K basis for everyone, regardless of what they actually started the week with. The cast was reading a third value off the snapshot cache, one that hadn't been re-zeroed at tournament open.

In a tier where equity rolls over between tournaments — which is the model we shipped two weeks ago — every one of those is wrong in a different direction. A bot that ended Week 1 at $112K and dropped to $108K in Week 2 was "up 8%" against the $100K hardcode and "down 3.6%" against its actual opening equity. Both numbers got displayed simultaneously on different pages and the engine had no opinion on which one was authoritative.
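The arithmetic behind those two figures is small enough to check by hand. A minimal sketch, assuming a plain percentage-change formula; returnPct is a hypothetical helper name used for illustration, not a function from the codebase:

```typescript
// Percentage return of `current` equity measured against a chosen basis.
// (Hypothetical helper for illustration only.)
function returnPct(current: number, basis: number): number {
  return ((current - basis) / basis) * 100;
}

const weekOneClose = 112000;   // the bot's actual opening equity for Week 2
const currentEquity = 108000;  // where it stands during Week 2
const hardcodedBasis = 100000; // the constant one surface was using

const vsHardcode = returnPct(currentEquity, hardcodedBasis);
const vsOpening = returnPct(currentEquity, weekOneClose);

console.log(vsHardcode.toFixed(1), vsOpening.toFixed(1)); // prints "8.0 -3.6"
```

Same bot, same equity, same moment. The only thing that changed between the two calls is the basis, and the sign of the result flipped with it.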

The fix is one principle, applied everywhere

Every surface that talks about tournament performance reads from the same place: the first equity snapshot recorded for each bot in the current tournament. Not the lifetime equity. Not a hardcoded constant. Not the most recent rollover. The actual opening number that bot started the actual current week with.

Mechanically, that meant resolveStartingEquity() became a fallback chain: first snapshot in this tournament → last-end in paper tiers → the tier's notional starting equity. The leaderboard reads from it. The cast reads from it. The hero card reads the leaderboard's output. One source, three surfaces, the same answer.
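The shape of that chain can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped resolver: the post names resolveStartingEquity() but not its signature, so the BasisInputs fields, the BasisSource labels, and the reading of "last-end" as the previous tournament's closing equity are all assumptions.

```typescript
// Hypothetical input shape; the real resolver's parameters are not given in the post.
interface BasisInputs {
  firstSnapshotEquity: number | null; // first snapshot recorded in the current tournament
  lastEndEquity: number | null;       // assumed: closing equity of the previous tournament
  isPaperTier: boolean;               // assumed flag for tiers where "last-end" applies
  tierNotionalEquity: number;         // the tier's notional starting equity
}

type BasisSource = "first_snapshot" | "last_end" | "tier_notional";

interface ResolvedBasis {
  basis: number;
  source: BasisSource;
}

// Walk the chain in order and stop at the first value that exists.
function resolveStartingEquity(inputs: BasisInputs): ResolvedBasis {
  if (inputs.firstSnapshotEquity !== null) {
    return { basis: inputs.firstSnapshotEquity, source: "first_snapshot" };
  }
  if (inputs.isPaperTier && inputs.lastEndEquity !== null) {
    return { basis: inputs.lastEndEquity, source: "last_end" };
  }
  return { basis: inputs.tierNotionalEquity, source: "tier_notional" };
}
```

Returning the source alongside the number is a design choice in this sketch, not something the post describes. It makes it possible to see which rung of the chain answered when two surfaces disagree again.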

The bug wasn't in any one of those three places. The bug was that each of them had quietly invented its own answer to the question "what was this bot's opening equity this week" because nobody had said out loud that there's only one correct way to answer it.

Why this gets the pitlog treatment

The temptation with a bug like this is to ship the fix and move on — the disagreement is gone, the surfaces line up, the user complaint dries up. Done. Three-line commit message.

But the disagreement itself was the interesting thing. The fact that three independent code paths each landed on a different defensible answer is information about how this codebase grew: equity rollover got bolted onto a model that had been "everyone starts at $100K" for the first eight weeks of the project, and the basis question got answered three times in three places by three different authors of three different surfaces, none of whom were wrong about their corner.

That's the kind of thing worth saying out loud. The pitlog isn't a changelog — the commit history is the changelog. The pitlog is the record of what the engine learned about itself by being asked a hard question and having to answer it consistently.

The principle, going forward

Any new surface that quotes a tournament percentage — push notifications, email subject lines, share-card OG renders, future cast voices, the things we haven't built yet — reads its basis from the same resolver. If you find yourself typing "100000" in a tournament context, you're wrong. If you find yourself summing equity_snapshots without filtering to the current tournament window, you're wrong. There is one answer to the basis question, it lives in one function, and every surface routes through it.
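The window filter is the part that is easiest to forget, so here is a rough sketch of it. The EquitySnapshot and TournamentWindow shapes and the firstSnapshotInWindow name are hypothetical; the post mentions equity_snapshots but not its columns, so the fields below are assumptions made for illustration.

```typescript
// Assumed row shape for equity_snapshots; field names are illustrative.
interface EquitySnapshot {
  botId: string;
  takenAt: number; // epoch milliseconds
  equity: number;
}

// Assumed half-open window: opensAt inclusive, closesAt exclusive.
interface TournamentWindow {
  opensAt: number;
  closesAt: number;
}

// Return the earliest snapshot for one bot inside the tournament window,
// or null if that bot has no snapshot in the window yet.
function firstSnapshotInWindow(
  snapshots: EquitySnapshot[],
  botId: string,
  tournament: TournamentWindow
): EquitySnapshot | null {
  let first: EquitySnapshot | null = null;
  for (const s of snapshots) {
    if (s.botId !== botId) continue;
    if (s.takenAt < tournament.opensAt || s.takenAt >= tournament.closesAt) continue;
    if (first === null || s.takenAt < first.takenAt) first = s;
  }
  return first;
}
```

A null result here is what would push a resolver down to its next fallback, rather than reaching for a snapshot from an earlier week.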

The tournament asked a hard question this week. The answer is now written down in one place, and the rule is: nobody answers it anywhere else.
