Best of Computer Vision, July 2025

  1.
    Article
    ElixirStatus · 40w

    Fine-Tuning YOLO to Watch Soccer Matches

    Fine-tuning a pre-trained YOLO model for a specialized object detection task requires significantly less data and training time than training one from scratch. Using a soccer dataset with 7,010 training images, the author demonstrates how to adapt a COCO-trained YOLOv11 model to detect balls, players, referees, and goalkeepers, reaching an mAP50 of 88%. The process involves training with Ultralytics tools, monitoring key metrics such as loss values and mAP50, and converting the final PyTorch model to ONNX format for deployment in Elixir applications. Compared with the generic model, the fine-tuned model focuses on the action on the field and filters out spectators in the background.
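
The mAP50 figure above depends on one matching rule: a predicted box counts as correct only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that rule alone, not the article's evaluation code. The function names and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, and the full metric additionally averages precision across recall levels and classes.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """Hypothetical helper: the '50' in mAP50 is this 0.5 IoU threshold."""
    return iou(pred_box, truth_box) >= threshold


# Illustrative boxes only, not taken from the soccer dataset.
truth = (100, 100, 200, 200)
close_prediction = (110, 110, 210, 210)  # overlaps most of the truth box
loose_prediction = (150, 150, 250, 250)  # overlaps only one corner

print(is_true_positive(close_prediction, truth))
print(is_true_positive(loose_prediction, truth))
```

A prediction shifted by 10 pixels still passes the threshold, while one shifted by 50 pixels does not, which is why mAP50 is a fairly lenient measure of localization quality.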

  2.
    Article
    Tinashe Mutuswa · 39w

    'Influencer' who garnered 165k followers while 'enjoying Wimbledon' is revealed as an AI creation

    An AI-generated Instagram influencer named Mia Zelu fooled 165,000 followers into believing she was a real person attending Wimbledon. The fake account garnered over 40,000 likes on posts showing her at the tennis tournament, even deceiving celebrities like cricketer Rishabh Pant. Despite being labeled as an 'AI storyteller' in her profile, many followers continue to interact with the account as if she were real, highlighting the sophistication of AI-generated personas and their potential to deceive social media users.

  3.
    Video
    Coding with Lewis · 42w

    3 Insane Algorithms Netflix Uses to Scan BILLIONS of Frames

    Netflix uses three computer vision systems to analyze billions of video frames: match cutting, which automatically finds visually similar shots for seamless transitions; video search, which converts text queries into mathematical embeddings to find specific scenes; and scene detection, which combines screenplay alignment with multimodal analysis of the video and audio tracks. These systems use instance segmentation, optical flow, and bidirectional neural networks to automate video editing tasks that would otherwise require thousands of hours of manual work.
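
The embedding-based video search described above can be illustrated with a small ranking sketch. It assumes the text query and each scene have already been encoded as vectors in the same space by some encoder, which is not shown here. The scene names, the three-dimensional vectors, and the choice of cosine similarity are all hypothetical, and nothing here is taken from Netflix's actual system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_scenes(query_vec, scene_vecs):
    """Return (scene_name, score) pairs, most similar scene first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in scene_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for the output of a real encoder.
scene_embeddings = {
    "car_chase": [0.9, 0.1, 0.0],
    "kitchen_dialogue": [0.1, 0.8, 0.3],
    "beach_sunset": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]  # hypothetical encoding of a text query

ranking = rank_scenes(query_embedding, scene_embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy vectors the query lands closest to `car_chase`. In a system covering billions of frames, the exhaustive comparison would likely be replaced by an approximate nearest-neighbor index.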