Meta AI Releases V-JEPA 2: Open-Source Self-Supervised World Models for Understanding, Prediction, and Planning
Meta AI has unveiled V-JEPA 2, an open-source, scalable world model that learns from over 1 million hours of internet video and images to improve visual understanding, future state prediction, and zero-shot planning. Building on the joint-embedding predictive architecture (JEPA), V-JEPA 2 is trained with self-supervised learning through a visual mask denoising objective: the model reconstructs masked spatiotemporal patches in a latent space, which lets it focus on scene dynamics while ignoring noise. To achieve this scale, Meta researchers developed techniques such as constructing a large dataset (VideoMix22
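To make the idea of reconstructing masked patches in a latent space concrete, here is a minimal toy sketch in NumPy. It is not Meta's implementation: the function names, the patch count, the embedding size, the 50% mask ratio, and the choice of an L1 distance are all assumptions made for illustration, and random vectors stand in for the outputs of a real encoder and predictor. The point it shows is that the loss is computed on latent vectors of the masked patches only, not on pixels.

```python
import numpy as np


def random_patch_mask(num_patches, mask_ratio, rng):
    """Pick a random subset of patch indices to hide (hypothetical helper)."""
    num_masked = max(1, int(round(num_patches * mask_ratio)))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask


def masked_latent_loss(predicted, target, mask):
    """Mean L1 distance between predicted and target latents, masked patches only."""
    predicted = np.asarray(predicted, dtype=float)
    target = np.asarray(target, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("mask must select at least one patch")
    return float(np.abs(predicted[mask] - target[mask]).mean())


rng = np.random.default_rng(0)
num_patches, dim = 8, 4  # assumed toy sizes

# Stand-in for the latents a target encoder would produce for every patch.
target_latents = rng.normal(size=(num_patches, dim))

# Hide half of the spatiotemporal patches (assumed ratio).
mask = random_patch_mask(num_patches, 0.5, rng)

# Stand-in for a predictor that is off by a constant 0.5 on the hidden patches.
predicted_latents = target_latents.copy()
predicted_latents[mask] += 0.5

loss = masked_latent_loss(predicted_latents, target_latents, mask)
print(f"masked patches: {int(mask.sum())}, latent L1 loss: {loss:.3f}")
```

Because the comparison happens between latent vectors, pixel-level detail that the encoder does not represent never enters the loss, which is one way to read the article's point about focusing on scene dynamics while ignoring noise.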