Meet D4RT, a unified AI model for 4D scene reconstruction and tracking.

DeepMind

D4RT is a new AI model from Google DeepMind that reconstructs 4D scenes (3D space plus time) from 2D video. It uses a unified encoder-decoder Transformer architecture with a query-based mechanism to track pixels through space and time, and is reported to be up to 300x more efficient than previous methods. The encoder first compresses the video into a scene representation. The decoder then answers queries in parallel, each asking where a specific pixel is located in 3D space at a given time, as seen from a given camera viewpoint. This efficiency is intended to enable real-time applications in robotics and augmented reality.
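The encode-once, query-many pattern described above can be illustrated with a toy sketch. This is not DeepMind's code or API: the names `Query`, `encode_video`, `decode_query`, and `decode_queries` are hypothetical, and the arithmetic inside them is placeholder math standing in for a learned Transformer. The sketch only shows the shape of the interface, where a video is compressed once and each query is then answered independently of the others.

```python
from dataclasses import dataclass
from typing import List, Tuple

Frame = List[List[float]]          # a tiny grayscale image as nested lists
Point3D = Tuple[float, float, float]


@dataclass(frozen=True)
class Query:
    """Hypothetical query: which pixel, at what time, from which camera."""
    u: float      # normalized pixel x in [0, 1]
    v: float      # normalized pixel y in [0, 1]
    t_src: int    # frame in which the pixel is selected
    t_tgt: int    # time at which its 3D position is requested
    t_cam: int    # frame whose camera defines the output coordinates


def encode_video(frames: List[Frame]) -> List[float]:
    """Stand-in encoder: compress each frame to its mean intensity.

    A real model would produce learned latent tokens; this runs once per video.
    """
    return [sum(map(sum, f)) / (len(f) * len(f[0])) for f in frames]


def decode_query(scene: List[float], q: Query) -> Point3D:
    """Stand-in decoder: placeholder arithmetic, not a real reconstruction."""
    depth = 1.0 + scene[q.t_tgt]              # fake depth from the latent
    shift = 0.1 * (q.t_tgt - q.t_src)         # fake motion over time
    cam_offset = 0.1 * q.t_cam                # fake change of reference camera
    return (q.u * depth + shift - cam_offset, q.v * depth, depth)


def decode_queries(scene: List[float], queries: List[Query]) -> List[Point3D]:
    """Each query depends only on the shared scene, so they could run in parallel."""
    return [decode_query(scene, q) for q in queries]


# Usage: encode a three-frame video once, then ask two independent questions.
video = [[[0.0, 0.0], [0.0, 0.0]],
         [[1.0, 1.0], [1.0, 1.0]],
         [[2.0, 2.0], [2.0, 2.0]]]
scene = encode_video(video)
points = decode_queries(scene, [Query(0.5, 0.5, 0, 2, 0),
                                Query(0.5, 0.5, 0, 0, 0)])
```

Because no query reads another query's result, the decoding step can be batched or parallelized freely, which is the property the summary credits for the model's efficiency.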

D4RT: Unified, Fast 4D Scene Reconstruction & Tracking