How It Works — OGILVIE Baseball Projections

01 Overview & Design Goals

RoboNiner is a multi-layer career projection system covering MLB players and all levels of the minor leagues. It generates Y+0 through Y+18 (or until career-end) projections for every hitter and pitcher in MLB and the affiliated minor leagues.

Levels covered: MLB, AAA, AA, A+, A, Rookie

Design Goals

Evidence-driven, not assumption-driven. Every aging curve, MLE factor, and regression weight is derived from empirical data or published research — not intuition.

No survivorship bias. Empirical MLEs are capped at population-level bounds to prevent called-up players from inflating minor league translation factors.

Individual, not population. Comp-driven aging means each player's trajectory comes from how similar historical players actually aged, not the average player.

Free data only. All sources are free and public: MLB Stats API, Baseball Savant, Retrosheet, Lahman Database.

02 How Other Systems Work

Understanding the major public systems clarifies RoboNiner's design choices.

PECOTA (Baseball Prospectus)

Method: Comparables-based. Finds 25 historical players most similar to the target via Mahalanobis distance (age, level, skill rates, body type, handedness), then blends their actual career trajectories. Produces percentile bands (10th/90th).

Key weakness: MLE methodology not publicly documented; selection bias risk acknowledged but uncorrected. Similarity weights are proprietary.

ZiPS (Dan Szymborski, FanGraphs)

Method: Marcel-based (3-year weighted average with regression), modified by Tango/Lichtman aging research. Comps inform aging curves rather than trajectories directly.

Key weakness: Fixed aging curves — a 5'9" contact hitter and a 6'5" slugger get the same aging curve. MLE factors computed from called-up players (selection bias acknowledged but not corrected).

Steamer (Jared Cross, FanGraphs)

Method: Component-based regression with aging. Separately models AVG, OBP, SLG, K%, BB%, BABIP using independent regression weights. Consistently the most accurate single-year system in public evaluations.

Key weakness: Fixed aging curves. Single-year focus — no comprehensive career arc projections. No percentile forecasts.

THE BAT X (Derek Carty)

Method: Neural network / ML approach on Marcel foundation. Most aggressive Statcast integration of any public system.

Key weakness: Black box — methodology not published. Can overfit to recent seasons.

OOPSY (Jordan Rosenblum, FanGraphs)

Method: Similar to ZiPS with validated young-for-level adjustments. Key insight: a player 2+ years younger than the average at their level has demonstrated elite talent relative to peers.

Key insight adopted: Young-for-level regression discount is empirically validated and integrated directly into RoboNiner's Marcel+ layer.

03 The 9-Layer Architecture

Projections are built sequentially. Each layer adds information the prior layer lacks.

Raw Stats
  ↓
[Layer 1] Marcel+ foundation: 3yr weighted avg, stat-specific regression, young-for-level bonus
  ↓
[Layer 2] Component rate decomposition: K%, BB%, HR/PA, BABIP, LOB% aged independently
  ↓
[Layer 3] Park factor neutralization: per-component, multi-year smoothed
  ↓
[Layer 4] MLE translations: minor league → MLB equivalent, selection-bias capped
  ↓
[Layer 5] Statcast calibration: xwOBA, barrel%, EV, sprint speed, whiff rate
  ↓
[Layer 6] Fielding integration: OAA + catcher framing, position transitions
  ↓
[Layer 7] Comp-driven aging curves: individual trajectories from 25 historical comparables
  ↓
[Layer 8] Comp-driven playing time & attrition: PA arc from comps, career-end from attrition rate
  ↓
[Layer 9] Percentile forecasts: 10th / 25th / 50th / 75th / 90th
  ↓
Career arc (Y+0 through retirement)

Layer 1: Marcel+ Foundation

Base: Tom Tango's Marcel system (3-year weighted average, 5/4/3 weights) extended with stat-specific regression. K% and BB% need little regression because they stabilize quickly (high year-to-year correlation); BABIP is regressed heavily (noisy, environment-sensitive):

Stat	r² (yr-to-yr)	Regression PA
K%	0.82	50 PA
BB%	0.73	100 PA
HR/PA	0.70	150 PA
AVG	0.46	270 PA
BABIP	0.44	280 PA

Young-for-level bonus (from OOPSY research): a player 2+ years younger than the league average at their level has demonstrated elite talent relative to peers. Regression shrinkage is reduced up to 80% for players 3+ years young for their level:

ageRelativeBonus = 1.0 + min(0.80, yearsYoung × 0.28)
averageAges: ROK=19.0, A=21.5, A+=22.5, AA=23.5, AAA=26.0, MLB=28.5
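
The bonus formula above can be sketched in Python as follows. The function name is hypothetical, and flooring yearsYoung at zero for players who are not young for their level is an assumption, since the text only describes the young-for-level case.

```python
# Average ages by level, as listed above.
AVERAGE_AGES = {"ROK": 19.0, "A": 21.5, "A+": 22.5, "AA": 23.5, "AAA": 26.0, "MLB": 28.5}


def age_relative_bonus(age: float, level: str) -> float:
    """Return 1.0 + min(0.80, yearsYoung * 0.28) for a player at `level`."""
    # Assumption: players at or above the level's average age get no bonus,
    # so yearsYoung is floored at 0.
    years_young = max(0.0, AVERAGE_AGES[level] - age)
    return 1.0 + min(0.80, years_young * 0.28)
```

For a 20.5-year-old in AA, yearsYoung is 3.0, so 3.0 × 0.28 = 0.84 hits the 0.80 cap and the bonus is 1.80. A 22.5-year-old in AA (1.0 year young) gets 1.28.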

Minor league regression multiplier: minor league stats are noisier than MLB. Regression is amplified by level:

Level	Hitter mult	Pitcher mult
MLB	1.0×	1.0×
AAA	2.0×	2.5×
AA	3.5×	4.0×
A+	4.5×	6.0×
A	5.5×	8.0×
Rookie	7.0×	10.0×
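
One way the regression PA and the level multiplier might combine is sketched below in Python. It assumes the common Marcel-style form, where regression adds a fixed number of league-average plate appearances, and it assumes the level multiplier scales that number. Neither detail is spelled out above, and the function and constant names are hypothetical.

```python
# Regression PA per stat and hitter level multipliers, copied from the tables above.
REGRESSION_PA = {"K%": 50, "BB%": 100, "HR/PA": 150, "AVG": 270, "BABIP": 280}
HITTER_LEVEL_MULT = {"MLB": 1.0, "AAA": 2.0, "AA": 3.5, "A+": 4.5, "A": 5.5, "ROK": 7.0}


def regress_rate(observed: float, pa: float, league_mean: float,
                 stat: str, level: str) -> float:
    """Shrink an observed rate toward the league mean.

    Assumption: regression adds `regression PA` worth of league-average
    performance, and the level multiplier scales that amount.
    """
    ballast = REGRESSION_PA[stat] * HITTER_LEVEL_MULT[level]
    return (observed * pa + league_mean * ballast) / (pa + ballast)
```

Under these assumptions, a 30% K rate over 200 PA against a 22% league mean regresses to 28.4% in MLB (50 PA of ballast) and to about 26.3% in AA (175 PA of ballast).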

Layer 2: Component Rate Decomposition

Rather than aging AVG/OBP/SLG directly, we decompose to rate components and age each independently. Hitters: K_rate, BB_rate, HR_rate, BABIP, 2B_rate, 3B_rate, SB_rate, HBP_rate. Pitchers: K_rate, BB_rate, HR_rate, H_rate, LOB%.

Why LOB%? ERA fluctuates 15–20% year-to-year due to strand rate. The component model projects rates → FIP → reconstructed ERA using regressed LOB% (~71.5%), which is far more predictive than raw ERA.

04 Minor League Equivalencies (MLEs)

MLEs convert minor league stats to their MLB equivalent. A .310 AVG in AAA is worth roughly .297 in MLB. A .310 AVG in Low-A is worth roughly .240.

How MLEs Are Computed

For each year/level, we collect players who appeared at that level and MLB within the same season (called-up players), then compute the ratio of their rate stats at each level.

⚠ The Selection Bias Problem

Called-up players are the top 5–10% of their level. Their translated stats inflate empirical MLE factors dramatically. An uncorrected empirical factor would project a 22-year-old AA slugger at .640 MLB SLG — a top-3 hitter in baseball. After correction: .433 SLG — a legitimate prospect.

Level	Empirical HR_rate	Population Reality
AAA	1.50×	~0.92×
AA	1.42×	~0.78×
A+	1.42×	~0.68×

Population-Level Caps

We apply maximum multiplier caps reflecting what the average player at each level would translate to:

Level	AVG cap	SLG cap	HR_rate cap
AAA	0.960	0.940	0.920
AA	0.930	0.880	0.780
A+	0.910	0.820	0.680
A	0.880	0.760	0.550
Rookie	0.840	0.680	0.420

Minimum sample thresholds: MLEs require ≥50 PA (hitters) or ≥15 IP (pitchers). Below these, the MLE is null — tiny samples create mathematically nonsensical translations (e.g., 10 PA × ROK discount = 2 adjAB; one double = 1.000 SLG).
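
The cap and the minimum-sample rule can be sketched together in Python. The function name and signature are hypothetical; the cap values and the 50 PA threshold come from the text above.

```python
from typing import Optional

# Population-level caps from the table above: level -> stat -> maximum multiplier.
MLE_CAPS = {
    "AAA": {"AVG": 0.960, "SLG": 0.940, "HR_rate": 0.920},
    "AA": {"AVG": 0.930, "SLG": 0.880, "HR_rate": 0.780},
    "A+": {"AVG": 0.910, "SLG": 0.820, "HR_rate": 0.680},
    "A": {"AVG": 0.880, "SLG": 0.760, "HR_rate": 0.550},
    "ROK": {"AVG": 0.840, "SLG": 0.680, "HR_rate": 0.420},
}
MIN_HITTER_PA = 50  # pitchers would use an analogous >= 15 IP check


def translate_hitter_stat(value: float, pa: int, level: str, stat: str,
                          empirical_factor: float) -> Optional[float]:
    """Translate a minor league rate to its MLB equivalent, or None if the sample is too small."""
    if pa < MIN_HITTER_PA:
        return None  # tiny samples produce nonsensical translations
    # The empirical factor comes from called-up players only, so cap it.
    factor = min(empirical_factor, MLE_CAPS[level][stat])
    return value * factor
```

With the AA empirical HR_rate factor of 1.42 from the table, a 5.0% HR rate is translated with the 0.780 cap instead, giving 3.9%.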

05 Park Factors

Park factors are applied at the component level, not just summary stats. Coors Field boosts HR differently from BABIP.

Computation Method

MLB + AAA: PF = (home_stat / home_PA) / (away_stat / away_PA) × 100 using homeAndAway splits from the MLB Stats API. Three-year weighted smoothing (5/4/3) reduces season noise.

AA/A+/A/ROK: Single-regression approximation: PF = (team_stat_rate / league_avg_rate) × 100. Less accurate but adequate at the aggregate level.

Application: parkAdjusted = observed / (1 + (PF - 100) / 200). The /200 denominator (not /100) reflects that players play ~50% of games at home.
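
A short Python sketch of the three formulas above. Function names are hypothetical, and giving the most recent season the weight of 5 is an assumption based on the usual Marcel convention.

```python
from typing import Sequence


def park_factor(home_stat: float, home_pa: float, away_stat: float, away_pa: float) -> float:
    """PF = (home rate / away rate) * 100 for one stat component."""
    return (home_stat / home_pa) / (away_stat / away_pa) * 100


def smoothed_park_factor(pfs_recent_first: Sequence[float]) -> float:
    """5/4/3 weighted average of up to three seasonal PFs.

    Assumption: the most recent season carries the weight of 5.
    """
    pairs = list(zip(pfs_recent_first, (5, 4, 3)))
    return sum(pf * w for pf, w in pairs) / sum(w for _, w in pairs)


def park_adjust(observed: float, pf: float) -> float:
    """Neutralize an observed rate; /200 because about half of games are at home."""
    return observed / (1 + (pf - 100) / 200)
```

With a HR park factor of 112, the divisor is 1.06, so an observed HR rate of 5.3% is neutralized to 5.0%.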

Notable Parks (Current Season)

Park	HR PF	Notes
Coors Field (COL)	112+	Highest elevation, enormous outfield
Great American (CIN)	109	Short RF porch
Petco Park (SD)	91	Marine layer, deep CF
Oracle Park (SF)	88	Pitcher's park, McCovey Cove

06 Per-League MLE Factors

Beyond overall level factors, we compute per-league adjustments within each level. The PCL (Pacific Coast League, AAA) is historically hitter-friendly; the IL (International League) runs closer to MLB environments.

getMLEFactor(levelKey, leagueId) first attempts a league-specific factor, then falls back to the level default. A pitcher who spent two years in the hitter-friendly PCL won't have his ERA unfairly inflated when translated to MLB.

League	Level	Character
Pacific Coast League	AAA	Hitter-friendly (desert parks, altitude)
International League	AAA	Pitcher-friendly, closer to MLB run environment
Texas League	AA	Hitter-friendly (heat, wind, small parks)
Eastern League	AA	More pitcher-friendly
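
The fallback behavior of getMLEFactor can be sketched in Python as a two-step dictionary lookup. The league IDs and the factor values below are hypothetical placeholders for illustration, not the system's actual factors.

```python
from typing import Optional

# Hypothetical placeholder values, for illustration only.
LEVEL_DEFAULTS = {"AAA": 0.92, "AA": 0.78}
LEAGUE_FACTORS = {("AAA", "PCL"): 0.88}  # hitter-friendly league, so a harsher translation


def get_mle_factor(level_key: str, league_id: Optional[str] = None) -> float:
    """Return the league-specific factor if one exists, else the level default."""
    return LEAGUE_FACTORS.get((level_key, league_id), LEVEL_DEFAULTS[level_key])
```

Here `get_mle_factor("AAA", "PCL")` returns the league-specific 0.88, while `get_mle_factor("AAA", "IL")` has no league entry and falls back to the AAA default of 0.92.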

07 Aging Model: Comp-Driven v3.0

The problem with fixed aging curves: Population averages predict how the average 27-year-old declines. Variance is enormous — a catcher who developed late plate discipline may still be improving at 30; a speed-dependent hitter may start declining at 25. ZiPS and Steamer apply the same curve to both.

Step 1: Find Historical Comparables

For each projection target, find 25 players from the Lahman database (1955–2025) most similar at the target's current age. Similarity measured via Mahalanobis distance:

Dimension	Weight
Age at entry	High
K_rate, BB_rate	Medium
ISO (isolated power)	Medium
BABIP	Medium
Position group	High
Body type (height/weight)	Medium
Modern era bonus (2000+)	Low

Step 2: Extract Comp Aging Curves

For each comp, trace year-over-year stat changes weighted by: sample size (PA/BF) and comp distance (1 / (0.5 + distance)). Result: per-stat, per-age aging deltas for K_rate, BB_rate, BABIP, HR_rate, ISO, SB_rate.
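
The weighting in Step 2 might look like the Python sketch below. It assumes the two weights (sample size and 1 / (0.5 + distance)) are multiplied together, which the text does not state explicitly. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CompDelta:
    delta: float     # year-over-year change in one stat at one age
    sample: float    # PA (hitters) or BF (pitchers) behind the delta
    distance: float  # similarity distance to the projection target


def weighted_aging_delta(comps: Sequence[CompDelta]) -> float:
    """Weighted mean delta, with weight = sample * 1 / (0.5 + distance)."""
    weights = [c.sample / (0.5 + c.distance) for c in comps]
    total = sum(weights)
    if total == 0:
        return 0.0  # no usable comps at this age
    return sum(w * c.delta for w, c in zip(weights, comps)) / total
```

A close comp (distance 0.5, 600 PA) gets weight 600, while a more distant one (distance 1.0, 300 PA) gets weight 200, so the closer, larger-sample comp dominates the blended delta.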

Step 3: Blend Comp vs. Baseline

blend = min(0.80, compConfidence)  # comp share is capped at 80%
if compConfidence >= 0.5:
  delta = comp_delta × blend + baseline_delta × (1 - blend)
else:
  delta = baseline_delta  # fall back to research-backed fixed curves

Career endpoints from attrition: At each age, we track what fraction of the comp pool was still active. When fewer than 20% of comps are still playing AND OPS < .550 (or ERA > 6.50), the player retires.

Baseline Aging Parameters (Fallback)

Sources: Tango/Lichtman/Dolphin (The Book, 2006), FanGraphs aging studies, OOPSY research.

Stat	Peak Age	Notes
K_rate	28–29	Rapid improvement 22–24, stabilizes by 28
BB_rate	26–28	Discipline keeps improving through late 20s
HR_rate	27–28	Bat speed peak, then decline
BABIP	25–27	Sprint speed peaks at 23–24; BABIP lags slightly
SB_rate	23–24	Linear speed decline from early 20s

Position-Specific Aging Multipliers

Position	Mult	Rationale
C	1.25×	Highest physical wear
SS	1.05×	Significant defensive wear
CF	1.02×	Speed-dependent
LF/RF	0.95×	Less physical
1B	0.90×	Least position wear
DH	0.85×	No defensive wear

Multipliers ramp from 1.0 at age 22 to full value at 27 (development phase shouldn't apply a wear penalty).
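
A minimal Python sketch of the ramp, assuming it is linear between ages 22 and 27 and that positions missing from the table (2B, 3B) use 1.0. Both are assumptions, and the function name is hypothetical.

```python
# Full-value multipliers from the table above (LF/RF split into two keys).
POSITION_MULT = {"C": 1.25, "SS": 1.05, "CF": 1.02, "LF": 0.95, "RF": 0.95, "1B": 0.90, "DH": 0.85}


def position_aging_mult(position: str, age: float) -> float:
    """Interpolate from 1.0 at age 22 to the full multiplier at age 27."""
    full = POSITION_MULT.get(position, 1.0)  # assumption: unlisted positions are neutral
    ramp = min(1.0, max(0.0, (age - 22) / (27 - 22)))  # assumption: linear ramp
    return 1.0 + (full - 1.0) * ramp
```

A catcher gets 1.0 at age 22, 1.125 at 24.5, and the full 1.25 from age 27 onward.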

Velocity Aging (Pitchers)

Pitchers begin losing ~0.25 mph/yr of fastball velocity after age 25. Each mph lost reduces K_rate:

veloLost   = (age - max(startAge, 25)) × 0.25  // mph
kRateLoss  = veloLost × 0.0008                  // K/BF reduction per mph

This compounds with comp-driven curves: the curves capture the empirical K decline; velocity aging explains the mechanism.
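
The two velocity formulas above can be sketched in Python. Flooring the elapsed years at zero for pitchers younger than 25 is an assumption, and the function name is hypothetical.

```python
from typing import Tuple


def velocity_k_loss(age: float, start_age: float,
                    mph_per_year: float = 0.25, k_per_mph: float = 0.0008) -> Tuple[float, float]:
    """Return (mph lost, K/BF lost) accumulated by `age`."""
    # Decline starts at 25, or at the first projected age if that is later.
    # Assumption: no loss is accrued before the decline starts.
    years = max(0.0, age - max(start_age, 25))
    velo_lost = years * mph_per_year
    return velo_lost, velo_lost * k_per_mph
```

A pitcher first projected at 23 has lost 1.25 mph by age 30, which maps to a K/BF reduction of 0.001.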

08 Statcast Integration

Statcast calibrates the Marcel+ foundation with objective physical measurements more predictive than rate stats alone.

Hitter Inputs

Field	Weight	Impact
`xwOBA`	40%	Holistic calibration — best single predictor of offensive value
`barrel_pct`	20%	ISO / HR_rate — optimal EV + launch angle
`hard_hit_pct`	15%	ISO supplement — 95+ mph EV, independent signal
`avg_ev`	25%	BABIP — each mph above 88.5 ≈ +0.003 BABIP
`max_ev`	15%	HR ceiling confirmation
`sprint_speed`	40%	SB_rate + BABIP — each ft/s above 27.0 ≈ +2.5 SB attempts/yr
`ba_diff` (BA−xBA)	25%	BABIP luck correction — positive = lucky, negative = unlucky
`whiff_rate`	—	Direct K predictor (when available)
`chase_rate`	—	K + BB predictor (zone control)

Pitcher Inputs

Field	Weight	Impact
`xwOBA` (against)	—	Holistic H_rate + HR_rate calibration
`xISO_against`	35%	HR predictor — quality of extra-base contact allowed
`barrel_pct` (against)	20%	HR_rate — barrels allowed strongly predict HR allowed
`whiff_rate`	45%	K_rate — implied K = whiff_rate × 0.88
`chase_rate`	—	K + BB (zone control predictor)
`xera`	35%	ERA anchor from expected contact quality
`fb_velo`	—	Multi-year velocity aging baseline

xwOBA holistic calibration: If xwOBA implies a different offensive level than the Marcel foundation, HR_rate, SLG, and BABIP are all scaled by ratio = min(1.20, max(0.85, xwOBA / projectedwOBA)) — but only when the ratio deviation exceeds 2%.
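
The clamp and the 2% dead band can be sketched in Python. The function name and the `min_dev` parameter are hypothetical.

```python
def xwoba_scale_ratio(xwoba: float, projected_woba: float, min_dev: float = 0.02) -> float:
    """Scaling ratio applied to HR_rate, SLG, and BABIP."""
    raw = xwoba / projected_woba
    if abs(raw - 1.0) <= min_dev:
        return 1.0  # within 2% of the Marcel foundation: leave it alone
    return min(1.20, max(0.85, raw))
```

An xwOBA of .350 against a projected wOBA of .320 gives a ratio of 1.09375. An xwOBA of .322 is within the 2% band and returns 1.0, and .420 is clamped to 1.20.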

09 Defensive Projections

OAA (Outs Above Average)

Multi-year Marcel-weighted OAA from Baseball Savant (3-year, 5/4/3 weights), regressed toward 0 using position-specific reliability:

Position	OAA r²	Notes
SS, CF	0.62	Range is the primary skill
2B	0.58
3B	0.55
RF, LF	0.52
C	0.42	Limited range opportunities
1B	0.38	Very few range plays
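
A rough Python sketch of the OAA projection. It assumes that "regressed toward 0 using position-specific reliability" means multiplying the weighted average by the position's r², and that the most recent season carries the weight of 5. Both are assumptions, and the function name is hypothetical.

```python
from typing import Sequence

# Reliability (r²) by position, from the table above.
OAA_RELIABILITY = {"SS": 0.62, "CF": 0.62, "2B": 0.58, "3B": 0.55,
                   "RF": 0.52, "LF": 0.52, "C": 0.42, "1B": 0.38}


def projected_oaa(oaa_recent_first: Sequence[float], position: str) -> float:
    """5/4/3 weighted OAA over up to three seasons, shrunk toward 0."""
    pairs = list(zip(oaa_recent_first, (5, 4, 3)))
    if not pairs:
        return 0.0  # no fielding data: project league average
    weighted = sum(oaa * w for oaa, w in pairs) / sum(w for _, w in pairs)
    # Assumption: shrinkage toward 0 is a simple multiplication by r².
    return weighted * OAA_RELIABILITY[position]
```

A shortstop with +10, +6, and +2 OAA (most recent first) has a weighted average of about +6.7, which shrinks to about +4.1 under this assumption.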

Catcher Framing

Framing runs from Baseball Savant runs_extra_strikes, regressed 50% toward 0. Age curve: improving through 28, peak 28–30, steep decline after 30 (~1.2 runs/yr).

Position Transitions

As players age, premium defenders move to easier positions. Interpolated over the transition window, not as a cliff:

From	To	Ages
SS	3B → 1B	28–32 to 3B, 36–39 to 1B
2B	1B	31–35
CF	LF	30–34
3B	1B	34–38

Sprint Speed Modifier

speedRetention = max(0.70, min(1.15, 0.70 + (sprint_speed / 27.0) × 0.30))

Elite speedsters (30+ ft/s) retain defensive value longer. Slow players (24 ft/s) lose range value faster.
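
The modifier is a one-line function in Python. The function name is hypothetical; the constants come from the formula above.

```python
def speed_retention(sprint_speed: float) -> float:
    """Range-value retention multiplier from sprint speed in ft/s."""
    return max(0.70, min(1.15, 0.70 + (sprint_speed / 27.0) * 0.30))
```

At the 27.0 ft/s baseline the modifier is 1.0. A 30 ft/s runner gets about 1.033 and a 24 ft/s runner about 0.967.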

10 Playing Time & Career Endpoints

Hitter PA Arc

Age Range	Typical Level	PA
18–20	ROK / A	300–400
21–24	A+ / AA	350–500
25–26	AAA / MLB debut	500–530
27–32	MLB prime	550–575
33+	MLB veteran	declining ~15–20 PA/yr

Comp PT blending (when ≥5 comps available):

blendedPA = formulaPA × 0.40 + compAvgPA × 0.60
blendedPA = blendedPA × compAttrition.pctActive  // adjusts for career-end probability
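
The blend can be sketched in Python as below. Returning the formula PA unchanged when fewer than 5 comps are available is an assumption, and the function name is hypothetical.

```python
def blended_pa(formula_pa: float, comp_avg_pa: float, pct_active: float, n_comps: int) -> float:
    """Blend formula PA with comp-average PA, then scale by the share of comps still active."""
    if n_comps < 5:
        return formula_pa  # assumption: too few comps, use the formula arc alone
    blended = formula_pa * 0.40 + comp_avg_pa * 0.60
    return blended * pct_active
```

With a formula PA of 560, a comp average of 500, and 90% of comps still active, the blend is 524 PA before attrition and 471.6 PA after.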

Career Endpoints

Career ends when:

1. Fewer than 20% of comparables still active at that age AND OPS < .550 (hitters) or ERA > 6.50 (pitchers)

2. Hard ceiling: age 48
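
The two rules can be sketched as a single Python predicate. Treating the ceiling as "no projection at age 48 or later" is an assumption about how the boundary is handled, and the function name is hypothetical.

```python
from typing import Optional

HARD_CEILING_AGE = 48


def career_over(age: int, pct_comps_active: float,
                ops: Optional[float] = None, era: Optional[float] = None) -> bool:
    """True when the projected career should end at this age."""
    if age >= HARD_CEILING_AGE:  # assumption: the ceiling age itself is excluded
        return True
    # Pass `ops` for hitters or `era` for pitchers.
    poor = (ops is not None and ops < 0.550) or (era is not None and era > 6.50)
    # Both conditions are required: thin comp pool AND poor projected performance.
    return pct_comps_active < 0.20 and poor
```

A 38-year-old with a projected .800 OPS keeps going even if only 15% of comps are still active. A 36-year-old at .530 OPS with the same comp pool does not.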

Why this is better than hard thresholds: A great 38-year-old has a better chance of a Y+1 projection than a mediocre 35-year-old — empirically correct, and what actually happens in real careers.

11 WAR Framework

WAR = (wRAA + BsR + Fld + Frm + Pos + Rep) / RPW

// wOBA (FanGraphs linear weights)
wOBA = (BB×0.690 + HBP×0.720 + 1B×0.880 + 2B×1.245 + 3B×1.575 + HR×2.015) / PA

// wRAA (value above average)
wRAA = (wOBA - lgwOBA) / wOBAscale × PA

// BsR (baserunning)
netSB = SB - (CS × 2)
wSB   = netSB × 0.2    // ~0.2 runs per net steal

// Replacement level
hitter: 20.0 replacement runs per 600 PA
pitcher: 18.0 replacement runs per 200 IP

// Runs per win
RPW = 10.0  // FanGraphs scale
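
The hitter side of the framework can be put together in a short Python sketch. The league wOBA (.315) and wOBA scale (1.25) defaults are illustrative assumptions that vary by season, BsR is reduced to the wSB term shown above, and the function names are hypothetical.

```python
RPW = 10.0                  # runs per win
REP_RUNS_PER_600_PA = 20.0  # hitter replacement level


def woba(bb, hbp, singles, doubles, triples, hr, pa):
    """wOBA with the linear weights listed above."""
    return (bb * 0.690 + hbp * 0.720 + singles * 0.880
            + doubles * 1.245 + triples * 1.575 + hr * 2.015) / pa


def hitter_war(bb, hbp, singles, doubles, triples, hr, pa, sb, cs,
               fld=0.0, frm=0.0, pos=0.0,
               lg_woba=0.315, woba_scale=1.25):  # assumed league constants
    """Sum the run components and convert runs to wins."""
    wraa = (woba(bb, hbp, singles, doubles, triples, hr, pa) - lg_woba) / woba_scale * pa
    bsr = (sb - cs * 2) * 0.2             # wSB only
    rep = REP_RUNS_PER_600_PA * pa / 600  # replacement runs scale with playing time
    return (wraa + bsr + fld + frm + pos + rep) / RPW
```

For a 600 PA season with 60 BB, 5 HBP, 100 singles, 30 doubles, 3 triples, 25 HR, 10 SB, 3 CS, +2 fielding runs, and a −2.5 positional adjustment, wOBA is about .376, wRAA is 29.16 runs, and the total comes to about 4.9 WAR under the assumed league constants.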

12 What We Do That Others Don't

Feature	ZiPS	Steamer	PECOTA	RoboNiner
Per-stat regression weights	✓	✓	✓	✓
Comp-driven aging (individual curves)	Partial	—	✓	✓
MLE selection-bias correction	—	—	—	✓
Per-league MLE factors	Partial	—	—	✓
Component-level park factors	Partial	Partial	—	✓
avg_ev as BABIP signal	—	—	—	✓
xISO_against (pitcher HR predictor)	—	—	—	✓
Sprint speed → defensive aging	—	—	—	✓
OAA + framing multi-year Marcel	Partial	—	—	✓
Position transition modeling	Partial	—	—	✓
Young-for-level regression discount	Partial	—	Partial	✓
Full career arc projections	Some	—	✓	✓
Full minor league coverage (A thru MLB)	✓	Partial	Partial	✓
Percentile forecasts	—	—	✓	✓

Key Innovations

1. MLE Selection Bias Correction

No public system explicitly corrects for the fact that called-up players are the top 5–10% of their level. We cap empirical MLEs at population-level bounds derived from theoretical translation ratios.

2. xISO Against as HR Predictor

xISO = xSLG − xBA measures the quality of extra-base contact allowed, not just the rate. A pitcher who allows loud doubles even when getting outs is more HR-prone than one whose extra-base hits are weak.

3. avg_ev as Independent BABIP Signal

Average exit velocity captures typical contact quality separately from barrel_pct (peak contact) and hard_hit_pct (95+ mph). Each represents a different part of the contact quality distribution, and each adds incremental predictive value.

4. Sprint Speed → SB + BABIP + Defense

Sprint speed from Baseball Savant is applied to SB rate, BABIP (via infield hit probability), and the rate of range-based fielding value decay: three independent channels from one measurement.

5. Comp-Attrition Career Endpoints

Instead of hard "retire when OPS < .500" thresholds, we track the fraction of comparables still active at each age. A great 38-year-old has a better chance of a Y+1 projection than a mediocre 35-year-old.

6. Velocity-Driven K Aging

Pitchers with known fastball velocity (from Statcast) accumulate 0.25 mph/yr decline after age 25. Each mph lost → −0.0008 K/BF. The mechanism compounds with comp-driven empirical aging curves.

13 Known Limitations & Future Work

1. No platoon split modeling (L/R). Hitters and pitchers are projected against average mix. A strong platoon advantage (e.g., a lefty masher vs. RHP) is not directly modeled.

2. No injury history adjustment. Players with Tommy John or recent soft tissue injuries should carry higher downside risk not currently reflected in the distribution.

3. Statcast coverage is MLB-only. Minor league Statcast data exists in limited form (2024+) but is not yet integrated. Minor league projections rely entirely on rate stats + MLE translations.

4. GB% not directly available. Baseball Savant's batted ball CSV has avg EV of ground balls, not ground ball rate. GB rate currently comes from regression only.

5. Per-park minor league factors are estimated by single-regression rather than true home/away splits. Accurate at the aggregate; imprecise for any individual park.

High-Priority Future Work

Item	Impact	Effort
Platoon splits (L/R)	High	Medium
Injury history modifier	High	Medium
Minor league Statcast (2024+)	High	Medium
True GB% endpoint (when available)	Medium	Low
Umpire / environment effects	Medium	High

14 Data Sources

All sources are free and public. RoboNiner is built entirely on open data.

Source	Contents	Update frequency
MLB Stats API	Stats for all levels 2016–present, rosters, player metadata, homeAndAway splits for park factors	Daily
Baseball Savant	Statcast: exit velocity, barrel%, sprint speed, OAA, catcher framing, xwOBA, whiff rate	Annual
Lahman Database	Historical MLB stats 1871–2024, used for comp-finding and aging curve derivation	Annual
Retrosheet	Historical game logs, batter/pitcher matchups for pre-Lahman validation	Annual