Netflix's editorial teams face a needle-in-a-haystack problem when searching thousands of hours of raw footage. This post details the architecture behind their multimodal video search system, which fuses outputs from multiple specialized AI models (character recognition, scene detection, dialogue transcription) into a unified,