Dyna-Think: Synergizing Reasoning, Acting, and World Model Simulation in AI Agents
The paper introduces Dyna-Think, a framework that combines planning, world modeling, reasoning, and acting to improve AI agent performance on long-horizon tasks. It builds on large reasoning models such as DeepSeek-R1. Through imitation learning and a two-stage training process, Dyna-Think improves both the agent's world modeling and its policy, achieving results comparable to R1 while generating fewer tokens. The experiments also show that better world models lead to better reasoning and planning.
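
To make the idea of simulating with a world model before acting more concrete, here is a minimal toy sketch in Python. It is not the paper's method or training procedure: the chain environment, the tabular world model, and every name in it (`env_step`, `train_world_model`, `act_with_simulation`, `run_episode`) are hypothetical and chosen only for illustration. The sketch first fits a world model from observed transitions, then lets the agent pick each action by simulating its predicted outcome.

```python
from typing import Dict, List, Tuple

State = int
Action = int  # -1 = step left, +1 = step right
WorldModel = Dict[Tuple[State, Action], State]

GOAL: State = 4                 # hypothetical goal cell of a 5-cell chain
ACTIONS: List[Action] = [-1, 1]


def env_step(state: State, action: Action) -> State:
    """Toy environment (an assumption): move along a chain, clamped to [0, GOAL]."""
    return max(0, min(GOAL, state + action))


def train_world_model(transitions: List[Tuple[State, Action, State]]) -> WorldModel:
    """Fit a tabular world model by memorising observed (state, action) -> next state."""
    return {(s, a): s_next for s, a, s_next in transitions}


def act_with_simulation(state: State, model: WorldModel) -> Action:
    """Simulate every candidate action and pick the one predicted to end closest to GOAL."""
    def predicted_distance(action: Action) -> int:
        # If the model has never seen this pair, assume the state does not change.
        predicted = model.get((state, action), state)
        return abs(GOAL - predicted)

    return min(ACTIONS, key=predicted_distance)


def run_episode(start: State, model: WorldModel, max_steps: int = 10) -> List[State]:
    """Act in the real environment, consulting the world model before each step."""
    trajectory = [start]
    state = start
    for _ in range(max_steps):
        if state == GOAL:
            break
        state = env_step(state, act_with_simulation(state, model))
        trajectory.append(state)
    return trajectory


# Collect experience by trying every action in every state, then fit the model.
transitions = [(s, a, env_step(s, a)) for s in range(GOAL + 1) for a in ACTIONS]
world_model = train_world_model(transitions)

trajectory = run_episode(0, world_model)
print(trajectory)
```

With an accurate world model the agent walks straight to the goal. With an empty model (`run_episode(0, {})`) every simulated outcome looks the same, so the agent defaults to the first action and never leaves the start cell. This toy contrast only illustrates, in a very loose sense, why the quality of a world model can matter for planning.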