Tencent AI Lab introduces GPT4Video, a unified multimodal large language model for instruction-followed understanding and safety-aware generation. GPT4Video enhances LLMs with advanced video understanding and generative functions, outperforming other models in video question answering and text-to-video generation tasks. The release of a specialized multimodal instruction dataset is expected to catalyze future research in the field.