D4RT is a new AI model from Google DeepMind that reconstructs 4D scenes (3D space plus time) from 2D video. Using a unified encoder-decoder Transformer architecture with a query-based mechanism, it tracks pixels through space and time while being up to 300x more efficient than previous methods. The model processes video into
From deepmind.google