GRASP: Making Long-Horizon Planning Practical with Gradient-Based World Models
Introduction
Large, learned world models are increasingly capable of predicting long sequences of future observations in high-dimensional visual spaces, generalizing across tasks in ways that were unimaginable a few years ago. However, having a powerful predictive model is not the same as using it effectively for control, learning, or planning. Long-horizon planning with modern world models remains fragile: optimization becomes ill-conditioned, non-greedy structure creates bad local minima, and high-dimensional latent spaces introduce subtle failure modes. This article introduces GRASP, a gradient-based planner that addresses these challenges by making long-horizon planning practical through three key innovations.

What is a World Model?
Today, the term world model is overloaded. It can refer to an explicit dynamics model or to the implicit internal state that a generative model relies on. For our purposes, a world model is a learned model that, given the current state and a sequence of future actions, predicts what will happen next. Formally, it defines a predictive distribution over future states, conditioned on the observed states and the chosen actions, that approximates the environment's dynamics. These models are becoming general-purpose simulators, but using them for planning requires overcoming significant optimization hurdles.
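As a rough illustration of this interface, the sketch below rolls a one-step predictor forward over a sequence of actions. The `toy_world_model` function, its 0.9 coefficient, and the scalar state are hypothetical stand-ins for a learned model, and the sketch returns a single point prediction rather than a full predictive distribution.

```python
from typing import Callable, List, Sequence


def toy_world_model(state: float, action: float) -> float:
    """Hypothetical stand-in for a learned one-step predictor."""
    return 0.9 * state + action


def rollout(
    model: Callable[[float, float], float],
    s0: float,
    actions: Sequence[float],
) -> List[float]:
    """Predict the states that follow s0 when the given actions are applied."""
    states = [s0]
    for action in actions:
        # Each prediction is fed back in as the input to the next step.
        states.append(model(states[-1], action))
    return states


predicted = rollout(toy_world_model, 1.0, [0.5, 0.0, -0.2])
print(predicted)
```

Planning then amounts to searching over the `actions` argument so that the predicted states satisfy some objective, such as ending near a goal.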
The Challenge of Long-Horizon Planning
When planning over many time steps, gradient-based methods face several obstacles. First, the optimization landscape becomes ill-conditioned, so gradient steps make slow or unstable progress. Second, non-greedy structure, where a good plan must temporarily pass through states that look worse, creates poor local minima that trap the optimizer. Third, in high-dimensional latent spaces, which are common in vision-based models, gradients propagate through brittle state-input pathways, resulting in noisy or vanishing signals. All three issues grow worse as the planning horizon extends, making naive gradient descent impractical.
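A toy calculation can make the horizon dependence concrete. The sketch below assumes a scalar linear model `s_{t+1} = gain * s_t + a_t` (an illustrative assumption, not a model from this work) and uses the chain rule to compute how strongly the final state depends on the first action.

```python
def first_action_sensitivity(gain: float, horizon: int) -> float:
    """d s_T / d a_0 for s_{t+1} = gain * s_t + a_t, via the chain rule."""
    grad = 1.0  # d s_1 / d a_0
    for _ in range(horizon - 1):
        # Every later step multiplies by d s_{t+1} / d s_t = gain.
        grad *= gain
    return grad


for horizon in (10, 50, 100):
    contracting = first_action_sensitivity(0.9, horizon)
    expanding = first_action_sensitivity(1.1, horizon)
    print(f"T={horizon:3d}  gain=0.9: {contracting:.3e}  gain=1.1: {expanding:.3e}")
```

Because the sensitivity is a product of per-step factors, it shrinks or grows geometrically with the horizon. The gradient for the last action is always 1 in this toy, so early and late actions end up on very different scales, which is one simple source of ill-conditioning.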
GRASP: A Robust Gradient-Based Planner
GRASP tackles these problems with three core ideas, each addressing a specific weakness of existing approaches.
Lifting Trajectories into Virtual States
Instead of optimizing only the actions and unrolling the model one step at a time, GRASP lifts the entire trajectory into a set of virtual states. The state at each time step becomes its own optimization variable, so the model's predictions for all time steps can be evaluated in parallel rather than in sequence. This accelerates computation, improves gradient flow, and avoids the sequential dependencies that make long-horizon planning intractable.
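A minimal sketch of the lifting idea, under simplifying assumptions: the toy `dynamics` function, the squared-error consistency penalty, and the plain gradient-descent loop below are illustrative choices, not GRASP's actual objective or optimizer. States and actions are both free variables, and each residual touches only one time step.

```python
from typing import List, Tuple


def dynamics(state: float, action: float) -> float:
    """Hypothetical toy world model: the action is added to the state."""
    return state + action


def lifted_plan(
    s0: float, goal: float, horizon: int, steps: int = 500, lr: float = 0.1
) -> Tuple[List[float], List[float]]:
    """Minimize sum_t (s_{t+1} - f(s_t, a_t))^2 + (s_T - goal)^2."""
    states = [s0] * horizon    # virtual states s_1..s_T (states[t] is s_{t+1})
    actions = [0.0] * horizon  # actions a_0..a_{T-1}
    for _ in range(steps):
        prev = [s0] + states[:-1]  # s_t, the model input paired with a_t
        # Each residual depends only on (s_t, a_t, s_{t+1}), so there is no
        # sequential rollout and all of them could be computed in parallel.
        res = [states[t] - dynamics(prev[t], actions[t]) for t in range(horizon)]
        grad_s = [2.0 * r for r in res]  # s_{t+1} as the target of step t
        for t in range(horizon - 1):
            grad_s[t] -= 2.0 * res[t + 1]  # s_{t+1} as the input of step t+1
        grad_s[-1] += 2.0 * (states[-1] - goal)  # goal term on the last state
        grad_a = [-2.0 * r for r in res]  # df/da = 1 for this toy model
        states = [s - lr * g for s, g in zip(states, grad_s)]
        actions = [a - lr * g for a, g in zip(actions, grad_a)]
    return states, actions


states, actions = lifted_plan(s0=0.0, goal=5.0, horizon=10)

# Check the plan by executing the actions through the model itself.
final = 0.0
for a in actions:
    final = dynamics(final, a)
print(final)
```

During optimization the virtual states need not be consistent with the model. Consistency is enforced only softly by the penalty, and the final loop confirms that the optimized actions still reach the goal when actually rolled out.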

Adding Stochasticity for Exploration
To escape poor local minima, GRASP injects stochasticity directly into the state iterates during optimization. This noise acts as a form of exploration, allowing the planner to sample diverse trajectories and avoid getting stuck in suboptimal regions. The stochasticity is carefully balanced to maintain stability while promoting discovery of better solutions.
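The sketch below shows one place such noise could enter, assuming a virtual-state formulation with a squared-error consistency penalty and a toy model `f(s, a) = s + a`. The Gaussian noise, the linear decay schedule, and all parameter names are assumptions made for illustration. This toy loss is convex, so the noise is not needed to find a solution here; the point is only the mechanics of perturbing the state iterates while leaving the actions noise-free.

```python
import random
from typing import List, Tuple


def lifted_gradients(
    states: List[float], actions: List[float], s0: float, goal: float
) -> Tuple[List[float], List[float]]:
    """Gradients of sum_t (s_{t+1} - (s_t + a_t))^2 + (s_T - goal)^2."""
    horizon = len(states)
    prev = [s0] + states[:-1]
    res = [states[t] - (prev[t] + actions[t]) for t in range(horizon)]
    grad_s = [2.0 * r for r in res]
    for t in range(horizon - 1):
        grad_s[t] -= 2.0 * res[t + 1]
    grad_s[-1] += 2.0 * (states[-1] - goal)
    grad_a = [-2.0 * r for r in res]
    return grad_s, grad_a


def noisy_lifted_plan(
    s0: float,
    goal: float,
    horizon: int,
    steps: int = 600,
    lr: float = 0.1,
    noise0: float = 0.5,
    seed: int = 0,
) -> Tuple[List[float], List[float]]:
    rng = random.Random(seed)
    states = [s0] * horizon
    actions = [0.0] * horizon
    for k in range(steps):
        grad_s, grad_a = lifted_gradients(states, actions, s0, goal)
        # Noise scale decays linearly to zero over the first half of the run,
        # leaving the second half as plain gradient descent.
        sigma = noise0 * max(0.0, 1.0 - 2.0 * k / steps)
        # Noise is added to the state iterates only, not to the actions.
        states = [
            s - lr * g + sigma * rng.gauss(0.0, 1.0)
            for s, g in zip(states, grad_s)
        ]
        actions = [a - lr * g for a, g in zip(actions, grad_a)]
    return states, actions


noisy_states, noisy_actions = noisy_lifted_plan(s0=0.0, goal=5.0, horizon=10)
reached = 0.0 + sum(noisy_actions)  # rollout of f(s, a) = s + a from s0 = 0
print(reached)
```

Different seeds generally produce different action sequences that all reach the goal, which hints at how noise on the states can spread the search over many candidate trajectories.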
Reshaping Gradients for Clean Signals
One of the biggest bottlenecks in gradient-based planning is the gradient signal flowing from high-dimensional observations (like images) to actions. GRASP reshapes these gradients to avoid the brittle state-input pathways that plague vision models. By decoupling the gradient computation from the raw observation model, actions receive clean, useful signals that guide optimization effectively.
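The text does not spell out the reshaping rule, so the following is only one possible reading: a stop-gradient on the model's state input, applied to a toy virtual-state planner with `f(s, a) = s + a`. Everything here, including the penalty form and the choice of which term to drop, is an assumption for illustration and should not be taken as GRASP's actual update.

```python
from typing import List, Tuple


def reshaped_plan(
    s0: float, goal: float, horizon: int, steps: int = 400, lr: float = 0.1
) -> Tuple[List[float], List[float]]:
    states = [s0] * horizon    # virtual states s_1..s_T
    actions = [0.0] * horizon  # actions a_0..a_{T-1}
    for _ in range(steps):
        prev = [s0] + states[:-1]
        res = [states[t] - (prev[t] + actions[t]) for t in range(horizon)]
        # The full gradient for s_{t+1} would also contain a term that flows
        # back through the model's state input: -2 * res[t + 1] * df/ds.
        # This sketch drops that term, as a stop-gradient would.
        grad_s = [2.0 * r for r in res]
        grad_s[-1] += 2.0 * (states[-1] - goal)
        # Actions still get their signal through the model's action input.
        grad_a = [-2.0 * r for r in res]
        states = [s - lr * g for s, g in zip(states, grad_s)]
        actions = [a - lr * g for a, g in zip(actions, grad_a)]
    return states, actions


cut_states, cut_actions = reshaped_plan(s0=0.0, goal=5.0, horizon=10)
cut_final = 0.0 + sum(cut_actions)  # rollout of f(s, a) = s + a from s0 = 0
print(cut_final, cut_actions)
```

In this toy the plan still reaches the goal, but because the states no longer receive any signal through the model's state input, the last action absorbs the whole correction. A practical scheme would need some other route for goal information to reach earlier time steps, which this sketch deliberately leaves out.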
Conclusion
GRASP demonstrates that gradient-based planning can be made robust for long horizons through careful design. By combining virtual state lifting, stochastic exploration, and gradient reshaping, it overcomes the fragility that has limited previous methods. This work, done with Mike Rabbat, Aditi Krishnapriyan, Yann LeCun, and Amir Bar, opens the door to more effective use of powerful world models in planning and control tasks. Future directions include extending these ideas to even longer horizons and more complex environments.