Don’t try this in your data science interviews.

Minimaxir's blog is a hub for machine learning enthusiasts, offering tutorials, project showcases, and insights into the latest trends in AI and data science. With a focus on practical applications of machine learning, Minimaxir shares tips, tools, and resources for building and deploying ML models. Developers can learn about deep learning frameworks, natural language processing techniques, and AI-powered creativity, gaining skills to tackle real-world problems.

Max Woolf's Blog

A comprehensive exploration of using text embeddings to predict IMDb movie ratings, comparing traditional statistical models, neural networks, and training LLMs from scratch. The author processes IMDb datasets using Polars, generates embeddings with ModernBERT, and evaluates multiple modeling approaches including Support Vector Machines, MLPs, and custom transformer models. Results show that training a small LLM from scratch on raw JSON movie data achieved the best performance with an MSE of 1.026, outperforming both traditional models and pretrained embedding approaches.
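For readers unfamiliar with the metric quoted above, mean squared error (MSE) is the average of the squared differences between predicted and actual ratings, so lower is better. The snippet below is a minimal sketch of that arithmetic only; the ratings, the predictions, and the `mean_squared_error` helper are made up for illustration and are not taken from the post or its data.

```python
def mean_squared_error(actual, predicted):
    """Return the mean of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical average ratings on a 1-10 scale (illustrative, not from the post).
actual_ratings = [7.0, 5.0, 8.0, 6.0]
model_predictions = [6.0, 6.0, 7.0, 7.0]

# Naive baseline: always predict the mean of the actual ratings.
mean_rating = sum(actual_ratings) / len(actual_ratings)
baseline_predictions = [mean_rating] * len(actual_ratings)

model_mse = mean_squared_error(actual_ratings, model_predictions)
baseline_mse = mean_squared_error(actual_ratings, baseline_predictions)

print(f"model MSE:    {model_mse:.3f}")
print(f"baseline MSE: {baseline_mse:.3f}")
```

Because the errors are squared, an MSE of about 1 corresponds to a root-mean-square error of about one rating point.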

Predicting Average IMDb Movie Ratings Using Text Embeddings of Movie Metadata

The Initial Assignment and “Feature Engineering”

Creating And Visualizing the Movie Embeddings