LAMP: Lane-Topology-Guided Motion Forecasting via Feasible Motion Primitive Selection

1 Seoul National University   2 Hyundai Motor Company
Equal contribution   * Corresponding author

TL;DR: LAMP combines discrete motion primitives with lane-topology constraints to produce multimodal trajectory predictions that are both diverse and lane-compliant, giving downstream planners feasible hypotheses to work with.

LAMP generates diverse and lane-consistent multimodal predictions

On intersection scenarios, LAMP improves both diversity and feasibility over the MTR baseline — producing trajectories that adhere to lane topology while preserving multimodal coverage.

[Figure: Qualitative comparison of LAMP vs. MTR on intersection scenarios]

[Figure: Overall framework of LAMP]

Abstract

Motivation

Multimodal motion forecasting must be accurate, feasible, and diverse to support safe planning. However, existing predictors rely on scene-agnostic intention priors (anchors, target points) and training objectives that prioritize the most-probable ground-truth mode. As a result, lower-probability modes often violate lane topology or traffic rules, producing off-road or unreachable trajectories that are unreliable for downstream planners.

Approach

We propose LAMP (Lane-Aligned Motion Primitives), a topology-aware motion forecasting framework that improves the reliability of multimodal prediction sets by combining structured motion primitives with feasibility-aware intention selection.

Method

LAMP builds on two design principles. (i) We learn a discrete codebook of shape-aware motion primitives via an NSVQ-based VQ-VAE, capturing the full spatiotemporal dynamics of driving intentions beyond endpoint-based priors. (ii) We use HD-map reachable-lane sets as a soft teacher distribution to train a lane-topology-guided intention selector that filters infeasible intention queries before they enter the Transformer decoder.
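Principle (ii) can be illustrated with a minimal sketch. It rests on assumptions that are not stated on this page: each primitive is already labeled as ending on a reachable lane or not, the teacher spreads most of its mass uniformly over the reachable primitives with a small smoothing term `eps`, the selector is trained with a KL divergence, and the top-L scores are kept. The names `soft_teacher`, `kl_divergence`, and `select_top_l` are hypothetical and are not taken from the LAMP code.

```python
import math

def soft_teacher(reachable, eps=0.05):
    """Soft target over K primitives: (1 - eps) shared by feasible ones, eps by the rest."""
    k = len(reachable)
    n_ok = sum(reachable)
    if n_ok == 0 or n_ok == k:
        # No feasibility signal to exploit: fall back to a uniform target.
        return [1.0 / k] * k
    return [(1.0 - eps) / n_ok if r else eps / (k - n_ok) for r in reachable]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q); terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def select_top_l(scores, l):
    """Indices of the l highest-scoring primitives, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:l]

# Toy example: 8 primitives, 3 of which end on a reachable lane (assumed labels).
reachable = [True, False, False, True, False, True, False, False]
teacher = soft_teacher(reachable)
logits = [2.0, -1.0, 0.0, 1.5, -0.5, 1.0, -2.0, 0.3]  # hypothetical selector outputs
student = softmax(logits)
loss = kl_divergence(teacher, student)  # training signal for the selector
kept = select_top_l(student, l=3)       # queries passed on to the decoder
```

In this toy case the three highest-scoring primitives are exactly the reachable ones, so only those would reach the Transformer decoder.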

Results

On the Argoverse 2 motion forecasting benchmark, LAMP matches strong baselines on displacement accuracy while substantially improving feasibility — particularly the planner-relevant traffic-rule metric FR (0.665 → 0.774 vs. MTR) — and diversity (DwF 7.811 → 12.559 vs. MTR). This combination of diverse, lane-compliant hypotheses provides more reliable multimodal prediction sets for downstream planning.

Experiments

Main results on Argoverse 2

LAMP matches strong baselines on displacement accuracy while substantially improving feasibility — particularly the planner-relevant FR metric — and diversity (DwF). Inference runs at 36.47 ms/scenario, comparable to MTR's 33.84 ms.

Model                  b-minADE6  b-minFDE6  minADE6  minFDE6  DAC ↑  FR ↑   APD6   FPD6    DwF ↑
LAMP (Ours)            1.345      2.281      0.904    1.785    0.942  0.774  6.401  16.232  12.559
MTR                    1.315      2.127      0.842    1.667    0.925  0.665  4.109  10.963  7.811
Wayformer              1.420      2.281      0.804    1.664    0.905  0.691  4.157  11.151  8.173
Forecast-MAE           1.379      2.125      0.751    1.492    0.937  0.692  2.983  7.728   5.657
EMP                    1.385      2.141      0.749    1.503    0.934  0.684  2.936  7.608   5.524
Autobot                1.512      2.389      0.846    1.720    0.937  0.699  3.567  9.163   6.829
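For readers unfamiliar with the displacement columns above, the sketch below shows how such metrics are commonly computed. It assumes the Argoverse-style convention in which the Brier variant adds (1 - p)^2 for the probability p of the best-endpoint mode, and it reads APD6 and FPD6 as average and final pairwise distances between modes. Both readings are assumptions, since this page does not define the metrics. DAC, FR, and DwF require map data and are not sketched. All function names are hypothetical.

```python
import math
from itertools import combinations

def ade(pred, gt):
    """Average point-wise Euclidean distance between two equal-length trajectories."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def fde(pred, gt):
    """Euclidean distance between the final points of two trajectories."""
    return math.dist(pred[-1], gt[-1])

def min_ade(preds, gt):
    return min(ade(p, gt) for p in preds)

def min_fde(preds, gt):
    return min(fde(p, gt) for p in preds)

def brier_min_fde(preds, probs, gt):
    """minFDE plus (1 - p)^2, where p is the probability of the best-endpoint mode."""
    best = min(range(len(preds)), key=lambda i: fde(preds[i], gt))
    return fde(preds[best], gt) + (1.0 - probs[best]) ** 2

def pairwise_spread(preds, final_only=False):
    """Mean distance over all pairs of modes (assumed reading of APD / FPD)."""
    dist = fde if final_only else ade
    pairs = list(combinations(preds, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)

# Toy scenario with two modes (the benchmark uses six) over three timesteps.
gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
preds = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],  # matches the ground truth
    [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],  # veers off to the left
]
probs = [0.6, 0.4]
```

Because the best mode carries probability 0.6, its Brier penalty is nonzero even though its endpoint error is zero, which is why the b-min columns sit above their plain counterparts.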

Ablation study

We compare NSVQ (LAMP's choice) against the original VQ-VAE and FSQ. NSVQ achieves the best displacement accuracy and DwF among the three, which we attribute to more stable optimization and better codebook utilization. FSQ scores higher on DAC and FR, but at the cost of lower APD6, FPD6, and DwF.
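As a rough illustration of what distinguishes NSVQ from the original VQ-VAE, the sketch below implements the noise-substitution step as it is usually described: during training, the quantization error is replaced by random noise rescaled to the same norm, so gradients can flow without a straight-through estimator. This is a toy version on plain Python lists, not the LAMP implementation, and the function names are hypothetical.

```python
import math
import random

def nearest_code(x, codebook):
    """Index of the codebook vector closest to x in Euclidean distance."""
    return min(range(len(codebook)), key=lambda k: math.dist(x, codebook[k]))

def nsvq_train_output(x, codebook, rng):
    """Training-time surrogate: x plus noise whose norm equals the true quantization error."""
    k = nearest_code(x, codebook)
    err_norm = math.dist(x, codebook[k])
    noise = [rng.gauss(0.0, 1.0) for _ in x]
    # Guard against an all-zero noise draw, which is vanishingly unlikely.
    noise_norm = math.sqrt(sum(n * n for n in noise)) or 1.0
    out = [xi + err_norm * ni / noise_norm for xi, ni in zip(x, noise)]
    return out, k

# Toy 2-D codebook and encoder output.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
x = [0.9, 1.3]
rng = random.Random(0)
x_tilde, k = nsvq_train_output(x, codebook, rng)
```

At inference time the nearest codebook vector `codebook[k]` would be used directly. The surrogate is only a training device.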

Model                  b-minADE6  b-minFDE6  minADE6  minFDE6  DAC ↑  FR ↑   APD6   FPD6    DwF ↑
LAMP (NSVQ, Ours)      1.345      2.281      0.904    1.785    0.942  0.774  6.401  16.232  12.559
VQ-VAE                 1.362      2.442      0.975    2.064    0.936  0.710  7.623  18.258  10.911
FSQ                    1.351      2.363      0.912    1.852    0.974  0.797  4.936  12.896  9.968

A codebook size of K = 64 strikes the best balance. K = 32 lacks the capacity to express diverse intentions, while K = 128 suffers from codebook collapse caused by imbalanced updates under fixed top-L selection.
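Codebook collapse of the kind described above is often diagnosed by checking how many codes are actually used and how evenly. The sketch below computes the used fraction and the usage perplexity (the exponential of the usage entropy) from a list of code assignments. This diagnostic is a common choice and an assumption here, since the page does not say how collapse was measured. `codebook_usage` is a hypothetical helper.

```python
import math
from collections import Counter

def codebook_usage(assignments, codebook_size):
    """Return (fraction of codes used, perplexity of the usage distribution)."""
    counts = Counter(assignments)
    total = len(assignments)
    used_fraction = len(counts) / codebook_size
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return used_fraction, math.exp(entropy)

# Balanced usage: every one of 4 codes is chosen equally often.
balanced = codebook_usage([0, 1, 2, 3] * 5, codebook_size=4)

# Collapsed usage: one code dominates and two are never selected.
collapsed = codebook_usage([0] * 19 + [1], codebook_size=4)
```

A perplexity close to the codebook size indicates even utilization, while a value near 1 means a single code absorbs almost all assignments.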

Model                  b-minADE6  b-minFDE6  minADE6  minFDE6  DAC ↑  FR ↑   APD6   FPD6    DwF ↑
NSVQ(32)               1.340      2.334      0.942    1.947    0.938  0.759  6.065  15.992  12.514
LAMP (NSVQ-64, Ours)   1.345      2.281      0.904    1.785    0.942  0.774  6.401  16.232  12.559
NSVQ(128)              1.405      2.335      0.919    1.858    0.915  0.729  5.018  12.372  7.593

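Filtering infeasible intention queries before decoding with the intention selector (IS, L = 16) substantially improves feasibility and diversity over using no selector (L = 64): DAC rises from 0.898 to 0.942, FR from 0.713 to 0.774, and DwF from 10.961 to 12.559. Selecting too few intentions (L = 6) limits decoder capacity, hurting accuracy and DwF despite a larger raw endpoint spread (FPD6 of 18.404).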

Model                  b-minADE6  b-minFDE6  minADE6  minFDE6  DAC ↑  FR ↑   APD6   FPD6    DwF ↑
w/o IS (L=64)          1.339      2.313      0.923    1.911    0.898  0.713  5.447  14.521  10.961
IS (L=6)               1.426      2.655      1.065    2.302    0.875  0.758  8.981  18.404  11.461
LAMP (IS, L=16, Ours)  1.345      2.281      0.904    1.785    0.942  0.774  6.401  16.232  12.559

Findings: LoRA-based map-adaptive motion primitives exhibit a feasibility–diversity trade-off

We additionally explored injecting map priors into the NSVQ decoder via Low-Rank Adaptation (LoRA). Conditioning early on map features increases lane compliance (higher DAC and FR), but induces mode contraction — intention queries collapse toward dominant lane-following behaviors — sharply reducing diversity. Because LAMP prioritizes diverse-yet-feasible hypotheses for downstream planning, we exclude LoRA from the final model.

LoRA-based map conditioning increases feasibility but collapses diversity
Model                  DAC ↑  FR ↑   APD6   FPD6    DwF ↑
LAMP                   0.942  0.774  6.401  16.232  12.559
LAMP + LoRA            0.983  0.794  3.943  10.402  8.092
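For context, the generic LoRA update adds a scaled low-rank product to a frozen weight matrix: W' = W + (alpha / r) B A. The sketch below shows only this standard form on small nested lists. How LAMP derives the low-rank factors from map features is not described on this page, so the factors are taken as given, and `lora_weight` is a hypothetical helper.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, alpha):
    """W + (alpha / r) * B @ A, with B of shape d_out x r and A of shape r x d_in."""
    r = len(a)  # rank of the update
    delta = matmul(b, a)
    scale = alpha / r
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# Frozen 2 x 3 base weight and a rank-1 update.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
a = [[1.0, 2.0, 3.0]]      # r x d_in
b_zero = [[0.0], [0.0]]    # zero-initialized B leaves W unchanged
b = [[1.0], [0.0]]         # a nonzero B shifts only the first output row

unchanged = lora_weight(w, a, b_zero, alpha=2.0)
adapted = lora_weight(w, a, b, alpha=2.0)
```

With B initialized to zero the adapted layer starts out identical to the base layer, which is the usual LoRA initialization.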

BibTeX

@inproceedings{han2026lamp,
  title     = {Lane-Topology-Guided Motion Forecasting via Feasible Motion Primitive Selection},
  author    = {Han, Sangjin and Jung, Hoseong and Her, Jeongtae and Choi, Changhyun and Kim, H. Jin},
  booktitle = {TODO_VENUE},
  year      = {2026}
}