Datasets for game AI

Training data for game AI.

Labeled, curated, and raw event datasets across the top competitive titles. Versioned, typed, and ready to drop into your training pipeline.

12B+ events ingested
2.4M+ matches parsed
8 patch versions tracked
6 competitive titles

Labeled. Curated. Raw.

Three tiers of gameplay data, from human-annotated labels to raw replay files. Pick what fits your workflow.

Labeled Data
Human-annotated events with verified labels. Win conditions, plays, and outcomes tagged for supervised learning.
Curated Datasets
Clean, deduplicated, and structured. Consistent schemas across all titles, ready for your pipeline.
Raw Replay Access
Download raw demos, replays, and .rofl files. Full-fidelity data for custom parsing and analysis.
Multiple Formats
Parquet, Arrow, and JSONL. Stream batches to PyTorch or JAX, or pull bulk shards via CLI.
Versioned & Reproducible
Immutable snapshots per patch. Reads from a pinned dataset version always return identical rows, with no silent drift.
Continuously Updated
New matches and patch versions land every day. Pin a version or always pull the latest.
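
For the JSONL format mentioned above, batching can be sketched with the standard library alone. This is an illustration, not the vodpop client: the helper name `iter_jsonl_batches` and the sample records are hypothetical, and only the field names are taken from the schema on this page.

```python
import json
from itertools import islice
from typing import Iterable, Iterator


def iter_jsonl_batches(lines: Iterable[str], batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of parsed JSON objects, at most batch_size per list."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    # Parse lazily and skip blank lines, so a large shard is never fully in memory.
    records = (json.loads(line) for line in lines if line.strip())
    while True:
        batch = list(islice(records, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical shard contents (assumption: one JSON object per line).
sample_shard = [
    '{"match_id": "vp_0a1b", "tick": 128, "team_econ": 3900}',
    '{"match_id": "vp_0a1b", "tick": 256, "team_econ": 3900}',
    '{"match_id": "vp_0a1b", "tick": 384, "team_econ": 4200}',
]

batches = list(iter_jsonl_batches(sample_shard, batch_size=2))
batch_sizes = [len(b) for b in batches]
```

With a downloaded shard, an open text file can be passed in place of `sample_shard`, since Python file objects iterate line by line.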

Versioned datasets, one per title.

The competitive_v3 release. Shared schema, per-title features. Pin a version or always pull the latest.

| Dataset | Rows | Matches | Size | Updated |
| --- | --- | --- | --- | --- |
| valorant_pro_v3 (Tactical FPS · Radiant+) | 12.4M | 18,203 | 2.1 GB | 2h ago |
| lol_challenger_v3 (MOBA · Challenger) | 8.9M | 12,411 | 1.7 GB | 2h ago |
| cs2_tier1_v3 (Tactical FPS · Tier 1) | 9.8M | 11,902 | 1.9 GB | 6h ago |
| dota2_divine_v3 (MOBA · Divine+) | 6.7M | 10,844 | 1.2 GB | 4h ago |
| fortnite_arena_v3 (Battle Royale · Arena) | 4.1M | 8,733 | 0.8 GB | 1d ago |
| rocketleague_gc_v3 (Sports · GC+) | 5.4M | 6,319 | 0.7 GB | 8h ago |
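
Because each version is described as an immutable snapshot, one way to confirm that two reads of a pinned version match is to compare content fingerprints. The sketch below is one possible check, not a vodpop feature: `fingerprint_rows` and the sample rows are hypothetical, and it assumes rows arrive as JSON-serializable dicts in a stable order.

```python
import hashlib
import json
from typing import Iterable, Mapping


def fingerprint_rows(rows: Iterable[Mapping]) -> str:
    """Return a SHA-256 hex digest over the rows, sensitive to row order."""
    digest = hashlib.sha256()
    for row in rows:
        # Canonical JSON (sorted keys, fixed separators) so that dict key
        # order does not change the hash.
        canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
        digest.update(canonical.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical rows standing in for reads of the same pinned version.
first_read = [{"match_id": "vp_0a1b", "tick": 128}, {"match_id": "vp_0a1b", "tick": 256}]
second_read = [{"tick": 128, "match_id": "vp_0a1b"}, {"tick": 256, "match_id": "vp_0a1b"}]
drifted_read = [{"match_id": "vp_0a1b", "tick": 128}]

same = fingerprint_rows(first_read) == fingerprint_rows(second_read)
changed = fingerprint_rows(first_read) != fingerprint_rows(drifted_read)
```

Storing the digest alongside a training run's config gives a cheap way to detect later whether the rows behind that run have changed.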

Query-ready from day one.

Every event normalized into well-typed JSON. No cleaning, no transforms, no glue code.

train.py
from vodpop import Dataset


def train_step(batch):
    """Placeholder: replace with your model's forward and backward pass."""
    ...


# Load a versioned, typed dataset
ds = Dataset.load("competitive_v3")

# Select features and stream tensor batches to your trainer
for batch in ds.iter_batches(
    batch_size=4096,
    columns=["round_phase", "team_econ", "outcome"],
    as_tensor=True,
):
    train_step(batch)

| Field | Type | Description |
| --- | --- | --- |
| event_type | enum[67] | kill · round_outcome · plant · ability · purchase · trade · … |
| tick | int64 | Monotonic engine tick within the match |
| match_id | string | Stable across queries, format vp_{hex} |
| patch | string | Game patch at match time, e.g. 9.03 |
| pos | float[3] | World-space x, y, z in engine units |
| outcome | enum[4] | win · loss · draw · no_contest |
| team_econ | int32 | Team credits at round start |
| round_phase | enum[5] | buy · action · post · overtime · half |
View full schema reference
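
The fields listed above could be mirrored on the consumer side as a typed record. The sketch below is not part of the vodpop package: the `Event` class and the sample record are hypothetical, and it assumes every JSON record carries all eight fields shown. `event_type` stays a plain string because only some of its 67 values are listed here.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Outcome(str, Enum):
    WIN = "win"
    LOSS = "loss"
    DRAW = "draw"
    NO_CONTEST = "no_contest"


class RoundPhase(str, Enum):
    BUY = "buy"
    ACTION = "action"
    POST = "post"
    OVERTIME = "overtime"
    HALF = "half"


@dataclass(frozen=True)
class Event:
    event_type: str  # kept as str: the full list of 67 values is not shown
    tick: int
    match_id: str
    patch: str
    pos: tuple[float, float, float]
    outcome: Outcome
    team_econ: int
    round_phase: RoundPhase

    @classmethod
    def from_json(cls, line: str) -> "Event":
        """Parse one JSON record; raises on missing keys or unknown enum values."""
        raw = json.loads(line)
        x, y, z = raw["pos"]  # fails unless pos has exactly three components
        return cls(
            event_type=raw["event_type"],
            tick=int(raw["tick"]),
            match_id=raw["match_id"],
            patch=raw["patch"],
            pos=(float(x), float(y), float(z)),
            outcome=Outcome(raw["outcome"]),
            team_econ=int(raw["team_econ"]),
            round_phase=RoundPhase(raw["round_phase"]),
        )


# Hypothetical record; values are illustrative only.
sample = (
    '{"event_type": "kill", "tick": 4096, "match_id": "vp_0a1b", '
    '"patch": "9.03", "pos": [1.0, -2.5, 0.0], "outcome": "win", '
    '"team_econ": 3900, "round_phase": "action"}'
)
event = Event.from_json(sample)
```

Parsing into enums at load time means an unexpected `outcome` or `round_phase` value fails immediately with a `ValueError` instead of surfacing later in training.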

Ship faster with better training data.

Request access and our team will get back to you within 24 hours.