Best of Audio ProcessingAugust 2024

  1. 1
    Article
    Avatar of hnHacker News·2y

    JoinMusic/fish: YouTube video to chords, lyrics, beat and melody.

    An AI-powered multimodal project employs various transformer models to generate chords, beats, lyrics, melody, and tabs for any song from YouTube videos. The system includes models like U-Net for audio separation, Pitch-Net for melody tracking, Beat-Net for tempo tracking, and Chord-Net for chord recognition. It supports multiple languages and allows for editable sheet music creation. Utilizing a combination of STFT, MFCC, and chroma features, it ensures better generalization with minimal training data.

  2. 2
    Article
    Avatar of dailydoseofdsDaily Dose of Data Science | Avi Chawla | Substack·2y

    The Utility of Vector Databases in LLMs

    Explore the surge in interest for vector databases, which are essential for storing and querying unstructured data. These databases are now crucial for enhancing the functionality of Large Language Models (LLMs) by embedding unstructured data into high-dimensional vectors. Learn how AssemblyAI's LeMUR framework simplifies building LLM apps on audio data, and how vector databases avoid the need for costly and complex fine-tuning of LLMs.