The Field — Darts Cricket Research

Reproduce before claiming

Darts cricket is played on seven targets — the numbers 15 through 20 and the bullseye. Three marks close a target; once you have closed a number your opponent hasn’t, further hits on it score points at face value. You win by closing all seven targets while holding at least as many points as your opponent. Every strategic question in the game reduces to one recurring decision, made dart by dart: aim at a target to close it, aim at a closed target to score, or aim at a target the opponent wants so they can’t use it.
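The rules above are compact enough to state as code. What follows is a minimal sketch, not the project's simulator: the names `TARGETS`, `new_board`, `apply_hit`, and `has_won` are illustrative, and it assumes the usual cricket conventions that a multi-mark hit first completes the close and then scores any leftover marks, and that the bullseye is worth 25.

```python
TARGETS = (20, 19, 18, 17, 16, 15, 25)  # 25 stands for the bullseye (assumed value)
MARKS_TO_CLOSE = 3


def new_board():
    """Fresh two-player state: marks per target and points, indexed by player 0/1."""
    return {"marks": [{t: 0 for t in TARGETS} for _ in range(2)], "points": [0, 0]}


def apply_hit(board, player, target, marks):
    """Record a hit worth `marks` marks (1 single, 2 double, 3 triple) on `target`."""
    opponent = 1 - player
    own = board["marks"][player]
    needed = MARKS_TO_CLOSE - own[target]
    closing = min(marks, needed)
    own[target] += closing
    surplus = marks - closing
    # Surplus marks score only while the opponent still has the target open.
    if surplus and board["marks"][opponent][target] < MARKS_TO_CLOSE:
        board["points"][player] += surplus * target


def has_won(board, player):
    """All seven targets closed while holding at least as many points as the opponent."""
    opponent = 1 - player
    all_closed = all(m >= MARKS_TO_CLOSE for m in board["marks"][player].values())
    return all_closed and board["points"][player] >= board["points"][opponent]
```

Under these assumptions, a triple on a 20 that already holds two marks closes it with one mark and, if the opponent's 20 is still open, scores the other two for 40 points.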

There is a published playbook for this decision. Frongello’s catalog of seventeen strategies (S1–S17) is the closest thing computer cricket has to a classical repertoire, and it comes with verdicts attached: which threshold to use, which mechanics help, which habits to avoid. A research project that starts by inventing strategy number eighteen is building on ground it has never tested. So this project started somewhere less glamorous: re-implement the full catalog, put every strategy in the same simulator under the same conditions, count the games, and see whether the published verdicts survive.

That choice sets the epistemics for everything that follows on this site. Every claim comes with a number; every number comes with a sample size; every champion claim is scoped to the field it was measured in and stamped with the date the record was frozen. Where a published verdict failed to reproduce, this site says so — the first casualty appears at the end of this chapter.

Seventeen strategies, three switches

The catalog is more compact than it looks. All seventeen strategies are combinations of three parameters, applied in a fixed priority order on every dart.

Lead threshold — keep scoring until your point lead exceeds a multiple (0×, 3×, 6×, or 9×) of the highest open number’s value, then switch to covering: closing the targets your opponent could score on. S1 is the degenerate case with no threshold at all: it simply closes from 20 down to bull and scores only after everything is shut.
Extra darts — if the target you are closing can’t be finished with the darts left in this turn, redirect the leftovers to scoring instead of banking partial progress.
Chase — before anything else, close the highest-value target the opponent has closed and you haven’t, cutting off their scoring lane at the cost of following their agenda instead of yours.
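The lead-threshold switch reduces to a single comparison. The sketch below is an illustration under assumptions rather than the catalog's or the project's code: `should_cover` and its parameter names are hypothetical, "exceeds" is read as a strict inequality, and the caller is assumed to supply the value of the highest open number.

```python
def should_cover(my_points, opp_points, highest_open_value, multiplier):
    """Return True once the point lead strictly exceeds multiplier x highest open value."""
    lead = my_points - opp_points
    return lead > multiplier * highest_open_value
```

With a 0× multiplier the test passes on any positive lead. With 3× and the 20 still open, a lead of exactly 60 keeps the player scoring and a lead of 61 switches to covering.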

The 17 Frongello strategies defined by their three parameters: lead threshold multiplier, extra-darts redirect, and chase priority. Definitional, from the project’s implementation of the catalog.
Family	Lead threshold	Extra darts	Chase
S1	none — always close, 20 → bull	no	no
S2–S5	0× / 3× / 6× / 9×	no	no
S6–S9	0× / 3× / 6× / 9×	yes	no
S10–S13	0× / 3× / 6× / 9×	no	yes
S14–S17	0× / 3× / 6× / 9×	yes	yes
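Because the table is a plain cross product, the whole catalog can be generated from the three parameters. This is a hedged sketch: the `Strategy` dataclass and `build_catalog` are illustrative names, and it assumes the multipliers within each family are numbered in the order the table lists them (0×, 3×, 6×, 9×).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Strategy:
    name: str
    lead_multiplier: Optional[int]  # None means no threshold (S1: always close)
    extra_darts: bool
    chase: bool


def build_catalog():
    """Enumerate S1 through S17 from the three switches, in table order."""
    catalog = [Strategy("S1", None, False, False)]
    index = 2
    # Family order follows the table: plain, extra darts, chase, both.
    for extra_darts, chase in [(False, False), (True, False), (False, True), (True, True)]:
        for multiplier in (0, 3, 6, 9):
            catalog.append(Strategy(f"S{index}", multiplier, extra_darts, chase))
            index += 1
    return catalog
```

Looking up a name in the result, for example `{s.name: s for s in build_catalog()}["S14"]`, returns the 0× strategy with both extra darts and chase switched on.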

The catalog’s own headline verdicts, the ones this project set out to test: S2 — cover the moment you hold any lead — is the strongest of the seventeen, and chasing is generally suboptimal, because the chaser closes targets in the opponent’s preferred order rather than their own. Hold on to that second verdict. It reproduces cleanly in this chapter’s tournament — and comes apart six chapters from now.

Alongside the seventeen, the project added twelve experimental variants (E1–E12) that extend the framework with mechanics the catalog never explored: modified closing orders, hit-type selection (when one mark is enough, aim the reliable single, not the triple), phase-dependent thresholds, and combinations of these. They are numbered, not named; most of them matter here only as field depth. One of them turns out to matter a great deal.

The measurement machine: same conditions, counted games

A tournament result is only as good as its controls, so the method is worth a paragraph before any standings are. Skill is modeled by MPR (marks per round, the standard cricket skill metric): each simulated player draws dart outcomes — triple, double, single, miss — from a calibrated distribution, and the benchmark spans 11 skill levels from MPR 0.8 (a novice who misses most throws) to 5.6 (a professional for whom triples are the expectation). Every strategy plays every other in a full round robin at 20,000 games per matchup, with equal skill on both sides, at every level.
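To make the skill model concrete, here is a minimal sketch of how a per-dart outcome distribution maps to an MPR figure. It is not the project's calibrated model: the probabilities in the usage note are invented for illustration, `expected_mpr` and `throw_round` are hypothetical names, and the sketch simplifies by treating every non-miss as a counting mark.

```python
import random

MARKS = {"triple": 3, "double": 2, "single": 1, "miss": 0}


def expected_mpr(probs):
    """Expected marks per three-dart round for a per-dart outcome distribution."""
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return 3 * sum(MARKS[outcome] * p for outcome, p in probs.items())


def throw_round(probs, rng):
    """Sample three darts from the distribution and return the marks they score."""
    outcomes = rng.choices(list(probs), weights=list(probs.values()), k=3)
    return sum(MARKS[o] for o in outcomes)
```

For a made-up player with `{"triple": 0.10, "double": 0.05, "single": 0.50, "miss": 0.35}`, `expected_mpr` gives 2.7, and averaging `throw_round` over many rounds with a seeded `random.Random` should settle near the same value.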

Sample sizes are chosen so the error bars are known, not hoped for. At these game counts, the differences this chapter reports — whole percentage points — are far outside noise; the project’s later instruments carry the same discipline explicitly, quoting ±0.8pp 95% Monte-Carlo intervals on 4,000-game cells, ±1.3pp on 1,500-game cells, and ±0.44pp on 50,000-game matches. Comparisons are paired wherever possible: a candidate and its baseline are benched on identical random seeds, so a difference in win rate reflects a difference in strategy, not a difference in luck — and, after this project’s one great embarrassment (next chapter), no champion was kept without reproducing under a fresh seed.
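A rough way to sanity-check interval widths like these is the textbook normal approximation for a binomial proportion. The sketch below is only that approximation; `ci_half_width_pp` is a hypothetical helper, and the project's own intervals may be computed differently (for example on paired differences, or at win rates far from 50%), so it is not expected to reproduce every quoted figure.

```python
import math


def ci_half_width_pp(n_games, win_rate=0.5, z=1.96):
    """Half-width, in percentage points, of a normal-approximation interval.

    win_rate=0.5 is the worst case; z=1.96 corresponds to a 95% interval.
    """
    return 100 * z * math.sqrt(win_rate * (1 - win_rate) / n_games)
```

At the worst case of a 50% win rate, `ci_half_width_pp(50_000)` comes to about 0.44pp, in line with the figure quoted for 50,000-game matches, and `ci_half_width_pp(20_000)` comes to about 0.69pp for a single matchup in the round robin.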

The site’s standing rule

Every claim carries its number, every number carries its sample size, and every ranking names the field it was measured in. A win rate without an n is a rumor; a champion without a field is an advertisement.

The classical champion: E12

Run the machine over the classical field — the 17 catalog strategies, the 12 experimental variants, and one strategy of the project’s own that the next chapter will deal with — and a clear order emerges. As of July 2, 2026, the best classic bot in the project’s record is E12, ranked #1 among the 30 pre-lineage strategies. Its rule is almost embarrassingly simple: play S2, but when you are behind with nothing to score on, check whether any target the opponent has closed sits exactly one mark from being closed on your side — and if so, finish it with a reliable single before opening anything new. One low-risk dart permanently shuts an income lane the opponent is actively using; opening a fresh target costs three marks while that lane keeps bleeding.
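E12's extra check can be sketched as a small function layered on top of S2. This is an illustrative reading of the rule as described here, not the project's implementation: `e12_finishing_target` is a hypothetical name, marks are assumed to be stored as a dict from target value to a count of 0 to 3, "behind" is taken to mean fewer points, and breaking ties toward the highest-value target is an assumption the text does not specify.

```python
def e12_finishing_target(my_marks, opp_marks, my_points, opp_points):
    """Return a target to finish with a single, or None to fall back to plain S2."""
    behind = my_points < opp_points
    # "Something to score on": a target I have closed that the opponent has not.
    can_score = any(my_marks[t] >= 3 and opp_marks[t] < 3 for t in my_marks)
    if not behind or can_score:
        return None
    # Opponent-closed targets sitting exactly one mark from closed on my side.
    candidates = [t for t in my_marks if opp_marks[t] >= 3 and my_marks[t] == 2]
    return max(candidates) if candidates else None  # tie-break is an assumption
```

When the function returns `None`, the caller would proceed with the ordinary S2 decision.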

In the frozen 31-strategy skill sweep, E12 holds #2 at all 11 skill levels, with an average win rate against the field between 57.8% and 59.5% (20,000 games/matchup). The catalog’s first verdict reproduces alongside it: S2 is the strongest classic S-bot at every one of the 11 levels (57.4–59.2%). The #1 slot at every level belongs to a strategy this project invented later — The Shape Reader (X188), whose average win rate rises from 59.0% at MPR 0.8 to 68.0% at MPR 5.6 and whose story is chapter three.

Standings of the classical field in the frozen 31-strategy skill sweep: 20,000 games per matchup, equal skill both sides, 11 MPR levels from 0.8 to 5.6. Win-rate cells show the range across the 11 levels.
Strategy	Standing in the sweep	Avg WR vs the field (range over 11 levels)
The Shape Reader (X188) — ch. 3	#1 at all 11 levels	59.0% → 68.0%
E12 (finish opponent-closed)	#2 at all 11 levels	57.8–59.5%
S2 (cover on any lead)	best classic S-bot at every level	57.4–59.2%
PS (Phase Switch) — ch. 2	#8–#10 at every level	—

Heat shading encodes average win rate on the same scale used throughout the story wing (59% pale → 68% deep); The Shape Reader’s cell spans the whole ramp and is left unshaded. Full per-level data ships with the legacy tournament files.

Key finding

As of July 2, 2026, E12 is the classical champion: #1 of the 30 pre-lineage strategies, #2 at all 11 skill levels of the 31-strategy sweep at 57.8–59.5% average win rate (20,000 games/matchup). That ranking holds within this field — the sweep predates the project’s clean-slate and reinforcement-learning artifacts, which beat everything in it.

The scope qualifier in that box is not boilerplate. An earlier version of this site claimed its sweep champion “never loses a head-to-head” — a claim that was true inside the 31-strategy field and false the moment stronger artifacts existed to test it against. The corrected habit — name the field, or don’t make the claim — is applied to every ranking on this site, starting with E12’s.

A verdict to hold loosely: “never chase”

One classical verdict deserves a flag before the story moves on. Within this tournament, the catalog’s advice against chasing looks vindicated: the strongest classic S-bot at every level is S2, which never chases, and no chase variant displaces it anywhere in the sweep. A reasonable reader would file “never chase” as reproduced and settled.

It isn’t. When the project later put all of its artifacts — classics, invented strategies, and trained policies — onto a single Elo ladder with a far more varied field, the S-family order inverted: the chase variants S14 and S10 rate at the top of the family (Elo 1037 and 1032), above S2 at 1014. “Never chase” was an artifact of the catalog field — a rule that held among seventeen close cousins and failed against a wider world. How that inversion happens, and what it says about measuring strategies only against their own relatives, is resolved in The Rerun.

Preview of the S-family on the unified Elo ladder (27 artifacts, Bradley–Terry fit, anchor S1 = 1000), as of July 2, 2026. Chase variants S14 and S10 rate above S2.
Classic bot	Chase?	Unified ladder Elo ±SE	Games
S14	yes	1036.6 ±1.6	94,000
S10	yes	1031.7 ±1.6	94,000
S2	no	1013.6 ±1.6	94,000
S1 (ladder anchor)	no	1000.0	88,000
S6	no	949.5 ±1.7	94,000

The S-family on the unified 27-artifact Elo ladder (Bradley–Terry fit, S1 anchored at 1000), as of July 2, 2026. S1, the pure closer, and S6, an extra-darts variant, bracket the family’s bottom. Ladder method and full standings: The Rerun.
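To get a feel for what rating gaps of this size mean, the conventional Elo logistic curve can translate them into head-to-head expectations. This is a sketch under a loud assumption: it uses the common 400-point, base-10 scale, which this chapter does not confirm the ladder uses, and both function names are hypothetical. The second helper treats the two standard errors as independent, which a joint Bradley–Terry fit does not guarantee.

```python
import math


def expected_win_rate(elo_a, elo_b, scale=400.0):
    """Expected win rate of A over B on the conventional Elo curve (assumed scale)."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / scale))


def gap_in_standard_errors(elo_a, se_a, elo_b, se_b):
    """Rating gap divided by the combined standard error (assumes independent errors)."""
    return (elo_a - elo_b) / math.hypot(se_a, se_b)
```

Under those assumptions, `expected_win_rate(1036.6, 1013.6)` is roughly 0.53 for S14 against S2, and `gap_in_standard_errors(1036.6, 1.6, 1013.6, 1.6)` is about ten combined standard errors.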

What the field hands the rest of the story

This chapter’s output is not a trophy; it is a calibrated baseline. E12 became the seed of nearly everything that follows: the strategy-invention lineage that eventually produced The Shape Reader began as a modification of E12, and the reinforcement-learning arc chose E12 as the teacher for its first imitation policy. The tournament machinery — MPR-graded skill, round robins with counted games, paired seeds — became the project’s first frozen instrument, and its habit of scoping every claim became the house style.

The frozen record is candid about how far the classical champion’s crown travels. On the unified ladder E12 sits mid-table at Elo 1018 ±1.6 (94,000 games), and when the project later trained dedicated adversaries against its leading artifacts, E12 proved the most exploitable of them all — a best-response policy beats it 82.1% of the time (2,000 games, greedy). None of that diminishes the reproduction work; it is the point of it. You can only measure how far the later chapters climb if the ground floor is surveyed first.

The survey also set a trap, and the project walked straight into it. The same machinery that crowned E12 crowned the project’s first original strategy as a near-champion at nine of eleven skill levels — a result that was an artifact of two simulator bugs. That story, and the replication discipline it forced, is next.