Best of MLOps — 2024

1
Article
Community Picks·2y
25 Open Source AI Tools to Cut Your Development Time in Half
A comprehensive overview of 25 open-source AI tools designed to streamline various stages of ML/AI projects, from data preparation to deployment and monitoring. Each tool is evaluated based on factors like popularity, impact, innovation, community engagement, and relevance to emerging AI trends. The guide aids in selecting appropriate tools by examining their unique features and suitability for specific use cases, thereby enhancing productivity and project success.
479
8
2
Article
Machine Learning Mastery·2y
7 Free Machine Learning Tools Every Beginner Should Master in 2024
Beginners in machine learning should become familiar with tools that aid in model development, data quality assessment, experiment tracking, and deployment. Seven essential tools highlighted include Scikit-learn for ML development, Great Expectations for data validation, MLflow for experiment tracking, DVC for data version control, SHAP for model explainability, FastAPI for API development and deployment, and Docker for containerization and deployment. Mastering these tools will create a comprehensive workflow for building and deploying robust models efficiently.
184
5
3
Article
Community Picks·2y
The State of Data Engineering 2024
The 2024 State of Data Engineering report discusses the influence of GenAI on software infrastructure, the expansion of product offerings due to the economic downturn, and the impact of open table formats and their catalogs in the data lake industry. It also highlights the importance of data version control and observability in AI/ML systems.
144
3
4
Article
Hugging Face·2y
Llama can now see and run on your device - welcome Llama 3.2
Llama 3.2, released by Meta and available on Hugging Face, includes both multimodal vision models and text-only models. The Vision models come in 11B and 90B sizes and feature strong visual reasoning capabilities. Text-only models are available in 1B and 3B sizes, optimized for on-device use. Llama 3.2 also introduces a new version of Llama Guard for input classification, including harmful prompt detection. Integration with Hugging Face Transformers and major cloud services is supported, and fine-tuning can be accomplished with a single GPU.
127
10
5
Article
Machine Learning Mastery·2y
7 Free Machine Learning Tools Every Beginner Should Master in 2024
Beginners in machine learning should familiarize themselves with essential tools to manage data, track experiments, explain models, and deploy solutions. Key tools include Scikit-learn for model development, Great Expectations for data validation, MLflow for experiment tracking, DVC for data version control, SHAP for model explainability, FastAPI for API development and deployment, and Docker for containerization. Mastering these tools ensures smooth and efficient workflows from development to production.
113
1
6
Article
SwirlAI·1y
What is AI Engineering?
AI Engineering is a rapidly evolving role focused on developing and deploying AI systems that utilize Large Language Models (LLMs) to solve business problems. AI Engineers differ from Software Engineers and Machine Learning Engineers in that they deal extensively with non-deterministic systems and require skills in prompt engineering, infrastructure, and data integration. The field is witnessing the rise of agentic systems, which are advanced AI systems capable of performing complex tasks with a degree of autonomy. AI Engineering is poised to become one of the most in-demand roles in the tech industry with high salaries and growing opportunities.
78
2
7
Video
freeCodeCamp·2y
End-to-End Machine Learning Project – AI, MLOps
This video walks through an end-to-end machine learning project focused on house price prediction. It covers core machine learning concepts, data analysis, feature engineering, and model implementation with robust testing. It also emphasizes MLOps integrations using tools like ZenML and MLflow for experiment tracking and deployment, and underscores the importance of writing scalable and readable code by employing design patterns such as Factory and Strategy. The project aims to differentiate itself by focusing on thorough data understanding and robust implementation practices, promising to enhance one's data science portfolio and career prospects.
76
8
Article
Medium·2y
From Data Collection to Deployment: Mastering the Data Science Workflow
Data science has evolved into a critical tool for strategic decision-making. The workflow from data collection to deployment is not linear but iterative. Key steps include defining the problem, gathering and cleaning data, conducting exploratory data analysis, feature engineering, model selection, training and tuning, evaluating performance, and finally deploying the model. Effective communication of results to stakeholders is also vital.
49
1
9
Article
Daily Dose of Data Science | Avi Chawla | Substack·2y
4 Ways to Test ML Models in Production
Testing ML models in production is crucial to ensure reliability and performance on real-world data. Four common strategies are A/B testing, canary testing, interleaved testing, and shadow testing. A/B testing distributes requests non-uniformly between models, while canary testing gradually rolls out the candidate model to a subset of users. Interleaved testing mixes predictions from both models, and shadow testing logs outputs without affecting user experience. These techniques help mitigate risks and validate the model effectively.
46
10
Article
Machine Learning Mastery·2y
A Roadmap for Your Machine Learning Career
Looking to make a career in machine learning? This guide offers a structured approach, starting with basics such as scikit-learn and advancing to frameworks like TensorFlow or PyTorch. It emphasizes solving real-world problems, learning software engineering skills, and understanding model deployment. Key steps include version control, clean code, CI/CD pipelines, and cloud deployment. A robust portfolio showcasing diverse ML projects and preparation for various interview phases will further bolster your journey. Continuous learning and networking are vital for long-term success in this dynamic field.
43
1
11
Article
Machine Learning Mastery·2y
Building a Robust Machine Learning Pipeline: Best Practices and Common Pitfalls
A machine learning pipeline is essential for operating models and delivering value. For robustness, it's crucial to structure the pipeline well and maintain reliability at each stage, even with changing environments. Some key pitfalls to avoid include ignoring data quality, overcomplicating models, inadequate monitoring, and not versioning data and models. Best practices involve using appropriate model evaluation metrics, employing MLOps for deployment and monitoring, and preparing comprehensive documentation.
36
12
Article
Towards AI·1y
How to Deploy ML Models in Production (Flawlessly)
When deploying machine learning models in production, it is crucial to focus on reliability, scalability, security, and maintainability. Using version control systems helps track different versions of your models, ensuring you can revert to stable versions if issues arise. The post offers insights into achieving reliable deployment for ML models in production environments.
30
13
Article
Towards AI·2y
MLOps Without Magic
This post provides a detailed guide on implementing intermediate MLOps using simple Python code, without relying on specific MLOps frameworks like MLflow or DVC. Key sections include setting up a project structure with designated folders for data, models, and results, using command line tools for preprocessing, training, and predicting, and managing experiments using a script called tasks.py. The guide emphasizes simplicity, maintainability, and effectiveness, suitable for both local and cloud-based workflows.
30
14
Article
GoPenAI·2y
MLOps All You Need To Know
MLOps integrates machine learning (ML) development and operations, emphasizing automation and monitoring across the ML lifecycle. It is a specialized extension of DevOps tailored for ML systems, involving complex tasks such as continuous training and comprehensive testing. The maturity of MLOps pipelines is classified into multiple levels, with a continuous feedback loop for deploying, monitoring, and analyzing models in production.
26
15
Article
Medium·2y
Setting A Dockerized Python Environment — The Hard Way
This post reviews different methods to run a dockerized Python environment from the command line (CLI). It explains how to customize a built-in image using a Dockerfile and mount a local folder to the container for code maintenance.
23
1
16
Article
TensorFlow·1y
MLSysBook.AI: Principles and Practices of Machine Learning Systems Engineering
Machine learning (ML) systems engineering is crucial for transforming sophisticated models into robust, scalable, and efficient systems. MLSysBook.ai fills the educational gap by providing practical insights and resources on ML infrastructure, optimization, deployment, and maintenance, with examples tied to the TensorFlow ecosystem. An interactive learning assistant, SocratiQ, enhances this resource by offering personalized guidance. Understanding both ML modeling and system engineering is key to creating impactful AI solutions.
22
17
Article
AWS·2y
LLM experimentation at scale using Amazon SageMaker Pipelines and MLflow
Large language models (LLMs) have shown success in NLP but need customization to adapt to specific tasks or domains. This post explores how Amazon SageMaker and MLflow can simplify the process of fine-tuning LLMs at scale using SageMaker Pipelines. By integrating MLflow, you can manage experiment tracking, model versioning, and deployment, enabling easier comparison of multiple LLM experiments. The post provides a step-by-step guide and source code to streamline fine-tuning, evaluation, and deployment of models like Llama 3 using SageMaker and MLflow.
21
18
Article
SwirlAI·2y
Memory in Agent Systems
The post explores the implementation and importance of memory in generative AI agent systems. It covers different memory types, including short-term and long-term memory, and their roles. Short-term memory provides context during interactions, while long-term memory, split into episodic, semantic, and procedural types, ensures continuity and relevance of information. The author emphasizes the necessity of efficient memory management in agentic architectures.
20
19
Article
GitLab·2y
Build an ML app pipeline with GitLab Model Registry using MLflow
This tutorial guides you through setting up an MLOps pipeline using GitLab Model Registry and MLflow. It explains the importance of MLOps in managing and automating machine learning models' lifecycle, highlighting GitLab's features like version control, CI/CD pipelines, and collaboration tools. The tutorial includes instructions for setting up environment variables, training and logging models, registering successful candidates, and deploying an ML app using Docker.
19
20
Article
Medium·2y
How to Succeed as a Machine Learning Engineer in the Industry
Kartik Singhal, a Senior Machine Learning Engineer at Meta, shares five key tips for excelling in the field. The advice includes building a solid foundation in machine learning fundamentals, leveraging strengths, aligning models with business goals, understanding ROI and trade-offs, and embracing continuous experimentation. Additionally, mentorship and networking are highlighted as crucial for career growth.
19
21
Article
KDnuggets·2y
5 Best End-to-End Open Source MLOps Tools
Explore 5 end-to-end open-source MLOps tools for training, tracking, deploying, and monitoring models in production. These tools provide enhanced data privacy and control over models and code.
19
22
Article
Community Picks·2y
Free Online Tutorials to Help You Develop Machine Learning Applications
Machine learning and data science offer immense potential, with a 23% growth rate for ML engineers since 2022. However, finding quality free resources for foundational and advanced topics can be challenging. This post introduces ten free machine learning tutorials and platforms and discusses factors like course content, instructor expertise, and cost to consider when selecting a learning resource. Notable courses include guides from Jozu Learning, WorldQuant University, freeCodeCamp, Kaggle Learn, YouTube channels, and more.
18
23
Article
Community Picks·2y
10 Open Source Tools for Building MLOps Pipelines
This post explores 10 open source MLOps tools for building an MLOps pipeline, including KitOps, Hydra, Data Version Control (DVC), Airflow, Continuous Machine Learning (CML), Hyperopt, Weights and Biases, MLflow, NannyML, and Metaflow.
18
24
Article
Community Picks·2y
Building an MLOps pipeline with Dagger.io and KitOps
Over 85% of machine learning models never reach production due to the disconnect between data scientists, ML engineers, and DevOps engineers. MLOps pipelines address this by integrating version control, CI/CD, model monitoring, and integration testing. Dagger.io offers a way to define pipelines as code and integrate CI/CD and monitoring, while KitOps simplifies the packaging and management of model dependencies. This guide provides steps to create an ML pipeline using these tools, from setting up prerequisites to integrating with CI/CD platforms like GitHub Actions.
17
25
Article
Towards AI·2y
Build end to end CICD pipeline using GitHub Actions-MLOps
Learn how to build an end-to-end CI/CD pipeline using GitHub Actions, Docker, and the cloud. The post covers pulling the production model from AWS RDS in MLflow and deploying the container locally and to the cloud.
17
