A multi-stage deep learning system for predicting NFL player trajectories over 30 frames from tracking data.
Competition: NFL Big Data Bowl 2026 - Prediction
Given the last observed frame of a play (22 players with positions, velocities, orientations, and contextual features), the system predicts the (x, y) trajectory of a specified target player for the next 30 frames. The task is inherently multi-agent: each player's future path depends on the spatial configuration and motion of all other players, especially defensive coverage. Whether a receiver runs a curl or a go route is largely determined by whether the defense plays zone or man, how deep the safeties are, and from where the nearest defender is closing.
Mechanism: The 22-player configuration is compressed into a 256-dimensional "defensive intent" vector before the trajectory predictor sees it. This is not a simple dimensionality reduction—it is an information bottleneck that forces the encoder to learn a high-level abstraction of tactical state.
Why it works: In football, the defense largely determines the offense's route structure. Zone coverage implies different spacing and movement than man coverage; blitz pressure changes timing; safety depth affects how deep routes develop. A raw 22×12 matrix of positions is noisy and redundant. The bottleneck forces the model to extract what matters: coverage scheme, pressure points, gaps, and closing angles. The trajectory predictor then conditions on this abstract representation rather than raw coordinates, which improves generalization and reduces overfitting to low-level geometry.
Implementation: Stage 1 encoder (Transformer + GlobalAveragePooling1D → Dense head) outputs 256 dims. Stage 2 receives this as input_defensive_intent and projects it to a fused context vector. The encoder is frozen during Stage 2 training so the trajectory model cannot bypass the bottleneck.
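A minimal numpy sketch of the bottleneck head. For illustration it replaces the Transformer blocks with just the pooling-plus-dense projection they feed into; the shapes (22 players × 12 features → 256-dim intent) come from the text, while the weights and nonlinearity are stand-in assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes from the text: 22 players x 12 features in, 256-dim intent out.
N_PLAYERS, N_FEATURES, INTENT_DIM = 22, 12, 256

# Stand-ins for the trained encoder head's weights (random here).
W_dense = rng.normal(0, 0.1, size=(N_FEATURES, INTENT_DIM))
b_dense = np.zeros(INTENT_DIM)

def encode_intent(players):
    """Compress a (22, 12) player matrix into a 256-dim intent vector.

    The real encoder applies Transformer blocks first; this sketch keeps
    only the GlobalAveragePooling1D -> Dense head to show the bottleneck.
    """
    pooled = players.mean(axis=0)               # average over the player axis
    return np.tanh(pooled @ W_dense + b_dense)  # dense head with a squashing nonlinearity

players = rng.normal(size=(N_PLAYERS, N_FEATURES))
intent = encode_intent(players)
```

However the encoder body is built, the downstream trajectory model only ever sees this single fixed-width vector, which is what makes it a bottleneck.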
Mechanism: The model predicts displacements (Δx, Δy) per frame, not absolute (x, y). The final trajectory is obtained by cumulative summation from the player's last observed position.
Why it works:
- Relative motion is easier to learn: The model learns "how do I move from here" rather than "where am I on the field." The latter is position-dependent and requires the model to implicitly learn field geometry.
- Invariance to starting position: A 5-yard out route looks the same regardless of whether the receiver starts at x=20 or x=80. Delta prediction is invariant to translation.
- Temporal consistency: Each step builds on the previous. Absolute prediction would require the model to implicitly integrate; delta prediction makes the integration explicit.
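The delta-to-trajectory integration can be sketched in a few lines of numpy (positions and displacements below are hypothetical values, in yards):

```python
import numpy as np

# Last observed position of the target player (hypothetical values, in yards).
last_pos = np.array([35.0, 26.5])

# 30 predicted per-frame displacements (dx, dy); here a gentle rightward drift.
deltas = np.tile([0.8, 0.1], (30, 1))

# Integrate: cumulative sum of deltas, offset from the last observed position.
trajectory = last_pos + np.cumsum(deltas, axis=0)

# Translation invariance: shifting the start shifts the whole path by the same amount.
shifted = (last_pos + 10.0) + np.cumsum(deltas, axis=0)
```

The same delta sequence produces the same path shape anywhere on the field, which is exactly the invariance argued for above.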
Mechanism: Stage 1 is trained (or supervised) to produce a useful intent representation; Stage 2 is trained with the encoder frozen, using precomputed intent.
Why it works: Joint training could lead to shortcut learning: the trajectory model might ignore the intent and rely on raw player positions. Freezing the encoder forces the trajectory model to actually use the intent. The bottleneck also prevents the trajectory model from overfitting to raw geometry—it must work with the compressed representation. Precomputing intent to memmap avoids repeated forward passes during Stage 2 training.
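The precompute-to-memmap step might look like the following sketch, where `fake_encoder` stands in for the frozen Stage 1 forward pass and all shapes and file names are illustrative:

```python
import os
import tempfile
import numpy as np

N_PLAYS, INTENT_DIM = 100, 256

def fake_encoder(batch):
    # Stand-in for the frozen Stage 1 encoder's forward pass.
    return batch.mean(axis=1)

plays = np.random.default_rng(1).normal(size=(N_PLAYS, 22, INTENT_DIM))

# Write every play's intent vector to a memmap once, before Stage 2 training.
path = os.path.join(tempfile.mkdtemp(), "intent.mmap")
mm = np.memmap(path, dtype=np.float32, mode="w+", shape=(N_PLAYS, INTENT_DIM))
mm[:] = fake_encoder(plays).astype(np.float32)
mm.flush()

# Stage 2 then reads intents lazily, without re-running the encoder.
intents = np.memmap(path, dtype=np.float32, mode="r", shape=(N_PLAYS, INTENT_DIM))
```

Because the memmap is read-only during Stage 2, the bottleneck is enforced by construction: there is no gradient path back into the encoder.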
Mechanism: The 22 players are treated as a set of tokens. Self-attention computes pairwise interactions between all players; each player's representation is updated by attending to others.
Why it works: The 22 players form a set, not a sequence—ordering is imposed by role priority (Targeted Receiver, Passer, Route Runners, Coverage, Pass Rusher, Blocking) for consistent indexing, but the underlying structure is relational. Multi-head attention can attend to different aspects: nearest defender, ball landing spot, receiver coverage, pass rush lanes. The FFN after attention refines each player's representation. GlobalAveragePooling1D then aggregates over players into a single vector—the intent is a summary of the entire formation.
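A single attention head over the player set, reduced to numpy for clarity (the model width `D` and the random projection matrices are illustrative, not the real learned weights):

```python
import numpy as np

rng = np.random.default_rng(2)
N_PLAYERS, D = 22, 16  # D is a reduced model width for illustration

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Random stand-ins for learned query/key/value projections.
Wq, Wk, Wv = (rng.normal(0, 0.3, size=(D, D)) for _ in range(3))

def self_attention(tokens):
    """One attention head over the set of 22 player tokens."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(D)    # (22, 22) pairwise player interactions
    attn = softmax(scores, axis=-1)  # each row is a distribution over players
    return attn @ V, attn

tokens = rng.normal(size=(N_PLAYERS, D))
updated, attn = self_attention(tokens)
pooled = updated.mean(axis=0)        # GlobalAveragePooling1D over players
```

The (22, 22) attention matrix is what lets every player's update depend on every other player; the final mean over the player axis is the permutation-invariant summary the intent vector is built from.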
Mechanism: The decoder is a 2-layer GRU. Its initial hidden state is the fused vector (context + intent + player index). The decoder consumes the previous step's output (or teacher-forced ground truth) and produces the next delta.
Why it works: Trajectory is sequential—each frame depends on the previous. The GRU's recurrent structure naturally models this. The initial state is the "plan": the context encodes where everyone is, the intent encodes what the defense is doing, and the player index encodes who we're predicting. The decoder then unrolls this plan over 30 steps. Teacher forcing during training provides stable gradients; at inference, zeros are used for the first step.
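The inference-time unroll (zeros in, 30 deltas out) can be sketched with a single minimal GRU cell in numpy; the hidden size, weights, and single-layer structure are simplifying assumptions, not the real 2-layer decoder:

```python
import numpy as np

rng = np.random.default_rng(3)
H, STEPS = 32, 30  # illustrative hidden size; 30-frame horizon from the text

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Random stand-ins for one GRU layer's weights (input = previous 2-D delta).
Wz, Wr, Wh = (rng.normal(0, 0.2, size=(2 + H, H)) for _ in range(3))
W_out = rng.normal(0, 0.2, size=(H, 2))  # hidden state -> (dx, dy)

def gru_step(x, h):
    xh = np.concatenate([x, h])
    z = sigmoid(xh @ Wz)  # update gate
    r = sigmoid(xh @ Wr)  # reset gate
    h_tilde = np.tanh(np.concatenate([x, r * h]) @ Wh)
    return (1 - z) * h + z * h_tilde

def decode(fused_context):
    """Unroll 30 steps from the fused context used as the initial hidden state."""
    h = fused_context
    x = np.zeros(2)  # zeros feed the first step at inference
    deltas = []
    for _ in range(STEPS):
        h = gru_step(x, h)
        x = h @ W_out  # predicted (dx, dy), fed back as the next input
        deltas.append(x)
    return np.array(deltas)

deltas = decode(rng.normal(size=H))
```

During training, `x` would instead be the teacher-forced ground-truth delta from the previous frame.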
Mechanism: In addition to masked RMSE, the loss includes a direction cosine term: for each step, the cosine similarity between predicted and true step vectors. A penalty of (1 - cos θ) is applied when the predicted heading differs from the true heading.
Why it works: RMSE alone can produce trajectories that hit the right endpoints but zigzag unnaturally. Two trajectories with the same start and end can follow very different paths: one smooth, one jagged. The direction loss penalizes wrong heading at each step, encouraging smooth, physically plausible motion. The 0.1 weight balances endpoint accuracy against path quality.
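The per-step (1 - cos θ) penalty described above reduces to a few lines of numpy (the `eps` stabilizer is an assumed implementation detail):

```python
import numpy as np

def direction_loss(pred_deltas, true_deltas, eps=1e-6):
    """Mean (1 - cos theta) between predicted and true per-frame step vectors."""
    dot = (pred_deltas * true_deltas).sum(axis=-1)
    norms = np.linalg.norm(pred_deltas, axis=-1) * np.linalg.norm(true_deltas, axis=-1)
    cos = dot / (norms + eps)  # eps guards against zero-length steps
    return np.mean(1.0 - cos)

true_steps = np.tile([1.0, 0.0], (30, 1))          # heading straight downfield
aligned = direction_loss(true_steps, true_steps)   # ~0: same heading
opposed = direction_loss(-true_steps, true_steps)  # ~2: opposite heading
```

In the combined objective this term would enter as `masked_rmse + 0.1 * direction_loss(...)`, per the 0.1 weight stated above.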
Mechanism: Optional penalties on velocity magnitude, acceleration, and boundary violation. Velocity is capped at ~1.2 yards/frame (scaled); acceleration penalizes jerk; boundary penalty keeps players within field bounds.
Why it works: Real players cannot teleport. A model trained only on RMSE might produce trajectories that cut through defenders or jump unrealistic distances. Velocity and acceleration penalties discourage such behavior. The start penalty encourages small first-step deltas (momentum continuity from the last observed frame).
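A sketch of the velocity and acceleration terms, assuming a hinge on excess speed and a squared penalty on frame-to-frame velocity change (the exact functional forms are assumptions; only the ~1.2 yards/frame cap comes from the text):

```python
import numpy as np

V_MAX = 1.2  # max per-frame displacement (scaled yards/frame, from the text)

def physical_penalties(deltas):
    """Hinge penalty on speed, squared penalty on acceleration."""
    speed = np.linalg.norm(deltas, axis=-1)
    vel_pen = np.maximum(speed - V_MAX, 0.0).mean()  # only excess speed is penalized
    accel = np.diff(deltas, axis=0)                  # frame-to-frame change in velocity
    accel_pen = (accel ** 2).sum(axis=-1).mean()
    return vel_pen, accel_pen

smooth = np.tile([0.5, 0.0], (30, 1))  # constant, legal speed: both penalties vanish
vp, ap = physical_penalties(smooth)
```

A teleporting or jerky trajectory would drive one or both terms up, while realistic motion pays essentially nothing.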
Mechanism: Plays with play_direction == 'left' are mirrored so that x and y are flipped; the offense is effectively always moving right.
Why it works: Left and right plays are mirror images of the same tactical situation. Without rotation, the model sees two "different" scenarios for the same coverage and route. Rotating halves the effective input space and improves generalization.
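Flipping both coordinates is a 180° rotation about the field center; a sketch using standard NFL field dimensions (120 × 53.3 yards including end zones):

```python
import numpy as np

FIELD_X, FIELD_Y = 120.0, 53.3  # NFL field dimensions in yards (incl. end zones)

def mirror_left_play(xy):
    """Flip a left-moving play so the offense moves right (180-degree rotation)."""
    out = xy.copy()
    out[..., 0] = FIELD_X - out[..., 0]
    out[..., 1] = FIELD_Y - out[..., 1]
    return out

left_play = np.array([[100.0, 40.0], [90.0, 10.0]])  # two players, moving left
mirrored = mirror_left_play(left_play)
```

After this transform every play in the dataset shares one canonical attacking direction, so the model never has to learn the same route twice.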
Mechanism: Players are sorted by role priority (Targeted Receiver first, then Passer, Route Runners, Coverage, Pass Rusher, Blocking) before feeding to the model.
Why it works: The model needs a consistent mapping from player index to role. Without ordering, the same role could appear at different indices across plays. Role-based ordering gives semantic meaning to indices: the trajectory model knows that index 0 is typically the target receiver. Frames with <22 players are padded; >22 are truncated.
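The sort-pad-truncate logic might look like this sketch; the role labels and dict-based player records are hypothetical, with only the priority order taken from the text:

```python
# Hypothetical role labels; the priority order follows the text.
ROLE_PRIORITY = {
    "Targeted Receiver": 0, "Passer": 1, "Route Runner": 2,
    "Coverage": 3, "Pass Rusher": 4, "Blocking": 5,
}

def order_players(players, n_slots=22):
    """Sort players by role priority, then truncate / pad with None to 22 slots."""
    ordered = sorted(players, key=lambda p: ROLE_PRIORITY.get(p["role"], 99))
    ordered = ordered[:n_slots]
    return ordered + [None] * (n_slots - len(ordered))

players = [{"id": 7, "role": "Coverage"},
           {"id": 12, "role": "Passer"},
           {"id": 88, "role": "Targeted Receiver"}]
slots = order_players(players)
```

In the real pipeline the `None` slots would become zero-padded feature rows (with a mask), but the indexing guarantee is the same: slot 0 is the targeted receiver whenever one is present.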
Mechanism: An optional discriminator (LSTM) classifies real vs. predicted trajectories. The generator (Stage 2) is trained to fool the discriminator while minimizing trajectory loss.
Why it works: RMSE minimizes average error but can produce "average" trajectories—e.g., cutting through defenders or taking unrealistic shortcuts. The discriminator learns what real trajectories look like: smooth, avoiding defenders, following route patterns. Adversarial training pushes the generator toward this distribution, producing more plausible paths even when the RMSE is similar.
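One common way to combine the two objectives is a non-saturating GAN term added to the trajectory loss; this sketch assumes that formulation and a hypothetical `adv_weight`, neither of which is confirmed by the source:

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def generator_loss(traj_loss, disc_logit_on_fake, adv_weight=0.01):
    """Trajectory loss plus a non-saturating GAN term (fool the discriminator).

    adv_weight is a hypothetical value, not taken from the source.
    """
    # -log D(fake): small when the discriminator believes the fake is real.
    adv = -np.log(sigmoid(disc_logit_on_fake) + 1e-9)
    return traj_loss + adv_weight * adv

fooled = generator_loss(1.0, disc_logit_on_fake=5.0)   # discriminator fooled: tiny extra cost
caught = generator_loss(1.0, disc_logit_on_fake=-5.0)  # confidently caught: larger penalty
```

A small adversarial weight keeps RMSE as the dominant signal while still nudging generated paths toward the real-trajectory distribution.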
| Component | Purpose |
|---|---|
| Masked RMSE | Endpoint accuracy; only valid frames contribute |
| Direction cosine (0.1×) | Smooth, correct heading at each step |
| Velocity penalty | Cap ~1.2 yards/frame (scaled) |
| Acceleration penalty | Discourage jerk |
| Boundary penalty | Keep players on field |
| Start penalty | Small first step (momentum continuity) |
| Adversarial (GAN) | Push toward distribution of real trajectories |
| Module | File | Role |
|---|---|---|
| TransformerBlock | transformer_encoder.py | Multi-head self-attention + FFN |
| Defensive encoder | transformer_encoder.py | Stage 1 encoder |
| Conditional predictor | conditional_trajectory.py | Stage 2 seq2seq |
| GAN discriminator | gan_trajectory.py | Stage 3 adversarial component |
| Preprocessing | preprocess.py | Feature extraction, rotation, scaling |
| Data loading | data.py | Memmap, train/val split, coverage injection |