LAMP: Lane-Topology-Guided Motion Forecasting via Feasible Motion Primitive Selection

LAMP generates diverse and lane-consistent multimodal predictions

On intersection scenarios, LAMP improves both diversity and feasibility over the MTR baseline, producing trajectories that adhere to lane topology while preserving multimodal coverage.

Qualitative comparison of LAMP vs MTR on intersection scenarios

Overall Framework of LAMP

Abstract

Motivation

Multimodal motion forecasting must be accurate, feasible, and diverse to support safe planning. However, existing predictors rely on scene-agnostic intention priors (anchors, target points) and on training objectives that prioritize the most probable ground-truth mode. As a result, lower-probability modes often violate lane topology or traffic rules, producing off-road or unreachable trajectories that downstream planners cannot rely on.

Approach

We propose LAMP (Lane-Aligned Motion Primitives), a topology-aware motion forecasting framework that improves the reliability of multimodal prediction sets by combining structured motion primitives with feasibility-aware intention selection.

Method

LAMP builds on two design principles. (i) We learn a discrete codebook of shape-aware motion primitives via an NSVQ-based VQ-VAE, capturing the full spatiotemporal dynamics of driving intentions beyond endpoint-based priors. (ii) We use HD-map reachable-lane sets as a soft teacher distribution to train a lane-topology-guided intention selector that filters infeasible intention queries before they enter the Transformer decoder.
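
The second principle can be illustrated with a small sketch. This is not LAMP's implementation: the distance-based teacher, the temperature, the KL objective, and all function names below are assumptions made for the example. The text above only states that reachable-lane sets act as a soft teacher distribution for the intention selector.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def teacher_distribution(primitive_endpoints, reachable_lane_points, temperature=1.0):
    """Hypothetical soft teacher: primitives whose endpoints lie close to a
    reachable lane receive more probability mass (an assumption, not the paper's rule)."""
    logits = []
    for ex, ey in primitive_endpoints:
        nearest = min(math.hypot(ex - lx, ey - ly) for lx, ly in reachable_lane_points)
        logits.append(-nearest / temperature)
    return softmax(logits)

def kl_divergence(teacher, student, eps=1e-12):
    """KL(teacher || student), used here as an illustrative selector loss."""
    return sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(teacher, student))

# Toy scene: three primitive endpoints and two reachable-lane sample points.
endpoints = [(10.0, 0.0), (8.0, 6.0), (0.0, 10.0)]
lane_points = [(10.0, 0.5), (8.0, 5.0)]

teacher = teacher_distribution(endpoints, lane_points)
student = softmax([0.2, 0.1, 0.0])  # stand-in for the selector's predicted scores
loss = kl_divergence(teacher, student)
```

In this toy scene the third primitive ends far from every reachable lane, so the teacher assigns it almost no mass and the selector is pushed to score it low.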

Results

On the Argoverse 2 motion forecasting benchmark, LAMP remains competitive with strong baselines on displacement accuracy while substantially improving feasibility, particularly the planner-relevant traffic-rule metric FR (MTR 0.665 → LAMP 0.774), and diversity (DwF: MTR 7.811 → LAMP 12.559). This combination of diverse, lane-compliant hypotheses provides more reliable multimodal prediction sets for downstream planning.

Experiments

Main results on Argoverse 2

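LAMP remains competitive with strong baselines on displacement accuracy (b-minADE₆ of 1.345 vs. 1.315 for MTR, with somewhat higher minADE₆ and minFDE₆) while attaining the best feasibility (DAC and, in particular, the planner-relevant FR metric) and diversity (APD₆, FPD₆, DwF) in the table. Inference takes 36.47 ms/scenario, comparable to MTR's 33.84 ms/scenario.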

Model	b-minADE₆ ↓	b-minFDE₆ ↓	minADE₆ ↓	minFDE₆ ↓	DAC ↑	FR ↑	APD₆ ↑	FPD₆ ↑	DwF ↑
LAMP (Ours)	1.345	2.281	0.904	1.785	0.942	0.774	6.401	16.232	12.559
MTR	1.315	2.127	0.842	1.667	0.925	0.665	4.109	10.963	7.811
Wayformer	1.420	2.281	0.804	1.664	0.905	0.691	4.157	11.151	8.173
Forecast-MAE	1.379	2.125	0.751	1.492	0.937	0.692	2.983	7.728	5.657
EMP	1.385	2.141	0.749	1.503	0.934	0.684	2.936	7.608	5.524
Autobot	1.512	2.389	0.846	1.720	0.937	0.699	3.567	9.163	6.829
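
For readers unfamiliar with the displacement columns, the sketch below computes them for one agent under common conventions. It assumes that the b- prefix denotes the Brier-penalized variant used in the Argoverse benchmarks (the min-FDE mode's error plus (1 - p)²) and that minADE and minFDE are each minimized independently over the modes. The page does not spell out either convention, and the function names are illustrative.

```python
import math

def ade(pred, gt):
    """Average displacement error: mean L2 distance over all timesteps."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def fde(pred, gt):
    """Final displacement error: L2 distance at the last timestep."""
    return math.dist(pred[-1], gt[-1])

def min_ade_fde(preds, gt):
    """minADE_k and minFDE_k over k predicted modes (each minimized separately)."""
    return min(ade(p, gt) for p in preds), min(fde(p, gt) for p in preds)

def brier_min_fde(preds, probs, gt):
    """Assumed Brier variant: FDE of the best-endpoint mode plus (1 - p_best)^2."""
    best = min(range(len(preds)), key=lambda i: fde(preds[i], gt))
    return fde(preds[best], gt) + (1.0 - probs[best]) ** 2

# Toy example with k = 2 modes and a three-step horizon.
gt = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
preds = [
    [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0)],  # close to the ground truth
    [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)],  # drifts away
]
probs = [0.6, 0.4]

min_ade, min_fde = min_ade_fde(preds, gt)
b_min_fde = brier_min_fde(preds, probs, gt)
```

Here the first mode is best on both metrics, so the Brier term adds (1 - 0.6)² = 0.16 to its final displacement error of 1.0.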

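We compare NSVQ (LAMP's choice) against the original VQ-VAE and FSQ. NSVQ achieves the best displacement accuracy and DwF among the three, thanks to more stable optimization and better codebook utilization. FSQ scores higher on DAC and FR but with lower diversity, while the original VQ-VAE yields larger APD₆ and FPD₆ at the cost of accuracy and feasibility.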

Model	b-minADE₆ ↓	b-minFDE₆ ↓	minADE₆ ↓	minFDE₆ ↓	DAC ↑	FR ↑	APD₆ ↑	FPD₆ ↑	DwF ↑
LAMP (NSVQ, Ours)	1.345	2.281	0.904	1.785	0.942	0.774	6.401	16.232	12.559
VQ-VAE	1.362	2.442	0.975	2.064	0.936	0.710	7.623	18.258	10.911
FSQ	1.351	2.363	0.912	1.852	0.974	0.797	4.936	12.896	9.968
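
The idea behind NSVQ (noise substitution in vector quantization) can be sketched as follows: during training, the quantization error is replaced by a random vector of the same norm, so the output stays a smooth function of the encoder output and the codebook. The snippet is a minimal plain-Python illustration of that forward computation only; the function names, the toy codebook, and the absence of any gradient machinery are simplifications for this example and do not reflect LAMP's training code.

```python
import math
import random

def nearest_code(z, codebook):
    """Index of the codebook vector closest to z in L2 distance."""
    return min(range(len(codebook)), key=lambda k: math.dist(z, codebook[k]))

def nsvq_quantize(z, codebook, rng):
    """Training-time NSVQ output: z plus random noise rescaled to the norm of
    the true quantization error. At inference, codebook[k] would be used directly."""
    k = nearest_code(z, codebook)
    err_norm = math.dist(z, codebook[k])
    noise = [rng.gauss(0.0, 1.0) for _ in z]
    noise_norm = math.sqrt(sum(n * n for n in noise)) or 1.0  # guard against zero
    z_tilde = [zi + err_norm * ni / noise_norm for zi, ni in zip(z, noise)]
    return k, z_tilde

rng = random.Random(0)
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy codebook with 3 entries
z = [0.9, 0.2]                                    # toy encoder output
k, z_tilde = nsvq_quantize(z, codebook, rng)
```

By construction, `z_tilde` lies at exactly the same distance from `z` as the selected code does, which is what lets the noisy surrogate stand in for hard quantization during training.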

A codebook size of K = 64 gives the best overall balance. K = 32 lacks the capacity to express diverse intentions, while K = 128 suffers from codebook collapse caused by imbalanced updates under fixed top-L selection.

Model	b-minADE₆ ↓	b-minFDE₆ ↓	minADE₆ ↓	minFDE₆ ↓	DAC ↑	FR ↑	APD₆ ↑	FPD₆ ↑	DwF ↑
NSVQ (K=32)	1.340	2.334	0.942	1.947	0.938	0.759	6.065	15.992	12.514
LAMP (NSVQ, K=64, Ours)	1.345	2.281	0.904	1.785	0.942	0.774	6.401	16.232	12.559
NSVQ (K=128)	1.405	2.335	0.919	1.858	0.915	0.729	5.018	12.372	7.593
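
Codebook collapse of the kind described for K = 128 is often diagnosed by counting how many codes are actually used and by the perplexity of the usage histogram. The sketch below shows that generic diagnostic. It is an assumption that such a measure would be informative here, since the page does not say how collapse was detected, and the function name is hypothetical.

```python
import math
from collections import Counter

def codebook_usage(indices, codebook_size):
    """Return (fraction of codes used, perplexity of the usage histogram).
    Perplexity equals the codebook size for perfectly uniform usage and
    approaches 1 when a single code dominates."""
    counts = Counter(indices)
    total = len(indices)
    active_fraction = len(counts) / codebook_size
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return active_fraction, math.exp(entropy)

# Toy assignments for a codebook of size 4.
balanced = [0, 1, 2, 3] * 4    # every code used equally often
collapsed = [0] * 15 + [1]     # one code dominates, two are never used

active_bal, perplexity_bal = codebook_usage(balanced, 4)
active_col, perplexity_col = codebook_usage(collapsed, 4)
```

A perplexity far below the nominal codebook size indicates that extra entries add parameters without adding usable intentions.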

Filtering infeasible intention queries before decoding (IS, L = 16) substantially improves feasibility and diversity over using no selector (L = 64). Selecting too few intentions (L = 6) limits decoder capacity, hurting accuracy and DwF despite increasing raw endpoint spread.

Model	b-minADE₆ ↓	b-minFDE₆ ↓	minADE₆ ↓	minFDE₆ ↓	DAC ↑	FR ↑	APD₆ ↑	FPD₆ ↑	DwF ↑
w/o IS (L=64)	1.339	2.313	0.923	1.911	0.898	0.713	5.447	14.521	10.961
IS (L=6)	1.426	2.655	1.065	2.302	0.875	0.758	8.981	18.404	11.461
LAMP (IS, L=16, Ours)	1.345	2.281	0.904	1.785	0.942	0.774	6.401	16.232	12.559
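
The filtering step itself amounts to keeping the L highest-scoring intention queries out of K before they reach the decoder. A minimal sketch, assuming the selector emits one scalar score per primitive (the function name and the scaled-down sizes are illustrative, not taken from the paper):

```python
def select_intentions(scores, num_selected):
    """Indices of the num_selected highest-scoring intentions, in index order."""
    ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return sorted(ranked[:num_selected])

# Scaled-down example: K = 8 primitives, keep L = 3 (LAMP uses K = 64, L = 16).
scores = [0.02, 0.30, 0.05, 0.25, 0.01, 0.20, 0.10, 0.07]
queries = [f"query_{k}" for k in range(len(scores))]  # placeholders for query embeddings

kept = select_intentions(scores, 3)        # [1, 3, 5]
kept_queries = [queries[k] for k in kept]  # only these would enter the decoder
```

Setting `num_selected` equal to the number of primitives reproduces the "w/o IS" configuration, in which every query is decoded.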

Findings: LoRA-based map-adaptive motion primitives exhibit a feasibility–diversity trade-off

We additionally explored injecting map priors into the NSVQ decoder via Low-Rank Adaptation (LoRA). Conditioning early on map features increases lane compliance (higher DAC and FR) but induces mode contraction, in which intention queries collapse toward dominant lane-following behaviors, sharply reducing diversity. Because LAMP prioritizes diverse-yet-feasible hypotheses for downstream planning, we exclude LoRA from the final model.

LoRA-based map conditioning increases feasibility but collapses diversity

Model	DAC ↑	FR ↑	APD₆ ↑	FPD₆ ↑	DwF ↑
LAMP	0.942	0.774	6.401	16.232	12.559
LAMP + LoRA	0.983	0.794	3.943	10.402	8.092
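
To make the size of the trade-off concrete, the relative change of each metric can be computed directly from the two rows above. The helper name is illustrative; the numbers are copied from the table.

```python
def relative_change(baseline, new):
    """Signed relative change of `new` with respect to `baseline`."""
    return (new - baseline) / baseline

lamp = {"DAC": 0.942, "FR": 0.774, "APD6": 6.401, "FPD6": 16.232, "DwF": 12.559}
lamp_lora = {"DAC": 0.983, "FR": 0.794, "APD6": 3.943, "FPD6": 10.402, "DwF": 8.092}

# Percentage change per metric when LoRA conditioning is added, rounded to 0.1.
changes = {
    name: round(100.0 * relative_change(lamp[name], lamp_lora[name]), 1)
    for name in lamp
}
```

With these values, DAC and FR rise by roughly 4.4% and 2.6%, while APD₆, FPD₆, and DwF fall by roughly 38.4%, 35.9%, and 35.6%.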

BibTeX

@inproceedings{han2026lamp,
  title     = {Lane-Topology-Guided Motion Forecasting via Feasible Motion Primitive Selection},
  author    = {Han, Sangjin and Jung, Hoseong and Her, Jeongtae and Choi, Changhyun and Kim, H. Jin},
  booktitle = {TODO_VENUE},
  year      = {2026}
}