Best of Data Analysis: May 2024

  1.
    Daily Dose of Data Science | Avi Chawla | Substack · 2y

    6 Elegant Jupyter Hacks

    Discover 6 elegant Jupyter hacks to improve your experience. Learn how to retrieve a cell's output, enrich the default preview of a DataFrame, generate helpful hints as you write Pandas code, improve rendering of DataFrames, restart the Jupyter kernel without losing variables, and search code in all Jupyter Notebooks from the terminal.

  2.
    KDnuggets · 2y

    Top SQL Queries for Data Scientists

    Learn about the main SQL concepts for data scientists, including querying and filtering data, working with NULLs, data type conversion, data aggregation, and more.

  3.
    Earthly · 2y

    Top 10 Python Libraries for Data Science

    This post explores the top Python libraries for data science, including libraries for data acquisition, data analysis and processing, machine learning, and data visualization.

  4.
    KDnuggets · 2y

    5 Simple Steps to Automate Data Cleaning with Python

    Learn how to automate the data cleaning process with a 5-step pipeline in Python. The pipeline includes steps for identifying data format, removing duplicates, handling missing values, and dealing with outliers.
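
    As a rough illustration of three of the steps named above (removing duplicates, handling missing values, and dealing with outliers), here is a minimal standard-library sketch. It is not the article's pipeline: the `clean` function, the `id`/`value` record layout, the median fill, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
from statistics import median, quantiles


def clean(records):
    """Dedupe rows, fill missing values with the median, drop IQR outliers.

    Hypothetical helper for illustration; expects dicts with a "value" key.
    """
    # Step 1: remove exact duplicate rows while preserving order.
    seen, unique = set(), []
    for row in records:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))

    # Step 2: fill missing values with the median of the observed values.
    observed = [r["value"] for r in unique if r["value"] is not None]
    fill = median(observed)
    for r in unique:
        if r["value"] is None:
            r["value"] = fill

    # Step 3: drop rows outside 1.5 * IQR of the quartiles (assumed rule).
    q1, _, q3 = quantiles([r["value"] for r in unique], n=4)
    spread = 1.5 * (q3 - q1)
    return [r for r in unique if q1 - spread <= r["value"] <= q3 + spread]


# Made-up sample data: one duplicate row, one missing value, one outlier.
records = [
    {"id": 1, "value": 10},
    {"id": 1, "value": 10},
    {"id": 2, "value": 12},
    {"id": 3, "value": 11},
    {"id": 4, "value": None},
    {"id": 5, "value": 13},
    {"id": 6, "value": 500},
]
cleaned = clean(records)
print(cleaned)
```

    Running the steps in a fixed order inside one function is what makes the process repeatable; a real pipeline would more likely use pandas and handle more than one column.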

  5.
    Python in Plain English · 2y

    Creating an ETL Data Pipeline Using Bash with Apache Airflow

    Learn how to create an ETL data pipeline using bash with Apache Airflow. Extract data from various file formats, transform it, and load it into a new file. Includes steps for starting Apache Airflow, downloading the dataset, creating a DAG, and executing the pipeline.

  6.
    Real Python · 2y

    How to Create Pivot Tables With pandas – Real Python

    Learn how to create pivot tables with pandas in Python. Pivot tables are a data analysis tool used to summarize and analyze data. The tutorial covers the basics of creating pivot tables, including grouping, aggregating, and formatting the data.

  7.
    KDnuggets · 2y

    Where to Go Next in Your Data Career

    Learn about the different classes in the data career landscape, typical migration paths between roles, and how to pick a career track in the data field.

  8.
    KDnuggets · 2y

    3 Courses You Should Consider If You Want to Become a Data Analyst

    This post discusses three different courses that individuals can consider taking if they want to become a data analyst. It highlights courses offered by DataCamp, Meta, and Google, providing information on the skills and knowledge that can be gained from each.

  9.
    The Verge · 2y

    Custom GPTs open for free ChatGPT users

    Free ChatGPT users now have access to custom GPTs, data analytics, chart creation, and other features that were previously only available to paid subscribers.

  10.
    Hacker News · 2y

    quarylabs/quary: Open-source BI for engineers

    Quary is an open-source BI tool for engineers that allows you to connect to databases, write SQL queries to transform and organize data, create charts and dashboards, and deploy the model back to the database. It can be installed as a VSCode extension or a Rust-based CLI.

  11.
    Community Picks · 2y

    The odds of getting a remote job are less than 1% (because everyone wants one)

    Getting a remote job is difficult, with odds of less than 1%. On average, remote job posts receive 322 reads and 47 applicants. The top 5% of remote jobs receive 2858 reads and 263 applicants.

  12.
    Ben Nadel · 2y

    Using Multiple Common Table Expressions In One SQL Query In MySQL

    Learn how to use multiple Common Table Expressions in one SQL query in MySQL. Discover the benefits of using Common Table Expressions for reporting and data introspection tasks.

  13.
    Medium · 2y

    Understand SQL Window Functions Once and For All

    Learn about window functions in SQL, including how they work and the syntax for using them. Discover how window functions can be used to group and aggregate data while keeping the dataset intact.
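
    To make "aggregate data while keeping the dataset intact" concrete, here is a small sketch that runs a window function through Python's built-in `sqlite3` module. The `sales` table and its rows are made up for illustration, and SQLite is an assumption of this example (it needs SQLite 3.25 or newer); the article itself may target a different database.

```python
import sqlite3

# In-memory SQLite database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "ana", 100),
        ("east", "bo", 300),
        ("west", "cy", 200),
        ("west", "di", 50),
    ],
)

# SUM(...) OVER (PARTITION BY ...) attaches each region's total to every row,
# whereas GROUP BY would collapse each region into a single row.
rows = conn.execute(
    """
    SELECT region, rep, amount,
           SUM(amount) OVER (PARTITION BY region) AS region_total
    FROM sales
    ORDER BY region, rep
    """
).fetchall()

for row in rows:
    print(row)
```

    All four input rows come back, each carrying its region's total alongside the original columns.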

  14.
    GoPenAI · 2y

    Leveraging LangChain and Streamlit for Interactive CSV Analysis

    Learn how to leverage LangChain and Streamlit to build an interactive CSV analysis tool. Simplify and expedite CSV data analysis while enhancing productivity.

  15.
    DevBlogs · 2y

    Announcing Data Wrangler: Code-centric viewing and cleaning of tabular data in Visual Studio Code

    Data Wrangler is a free extension that offers data viewing and cleaning directly integrated into VS Code and the Jupyter extension. It provides a rich user interface for data analysis and transformation, with features such as column statistics, visualizations, and automatic code generation.

  16.
    KDnuggets · 2y

    Feature Engineering for Beginners

    This guide introduces key techniques in feature engineering, including handling missing values, encoding categorical variables, and scaling and normalizing data. It also covers advanced techniques such as feature creation, dimensionality reduction, and time series feature engineering. The post includes worked examples in Python along with tips and best practices.

  17.
    Real Python · 2y

    Flattening a List of Lists in Python – Real Python

    Learn how to flatten a list of nested lists in Python in this video course. Convert a multidimensional list into a one-dimensional list.
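
    For a sense of what flattening means here, a short sketch with two common standard-library approaches. The sample `matrix` is made up, and these are not necessarily the techniques the course covers; both flatten exactly one level of nesting.

```python
from itertools import chain

# Hypothetical sample data: a two-dimensional list (a list of lists).
matrix = [[1, 2, 3], [4, 5], [6]]

# Option 1: a nested list comprehension that walks each row, then each item.
flat_comprehension = [item for row in matrix for item in row]

# Option 2: chain.from_iterable yields the items of each sub-list in turn.
flat_chain = list(chain.from_iterable(matrix))

print(flat_comprehension)  # [1, 2, 3, 4, 5, 6]
print(flat_chain)          # [1, 2, 3, 4, 5, 6]
```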

  18.
    Substack · 2y

    How to Build a Data Analytics Portfolio

    Learn how to build a data analytics portfolio by including projects that showcase skills like exploratory data analysis, data visualization, and business acumen. Use GitHub to organize your projects and create a comprehensive ReadMe page to highlight your accomplishments, methodology, and business recommendations.

  19.
    Towards AI · 2y

    Exploring Linear Regression for Spatial Analysis.

    Explore how linear regression can be used for spatial analysis and understand its benefits in machine learning and GIS.

  20.
    freeCodeCamp · 2y

    Data Analysis with Python – How I Analyzed My Empire State Building Run-Up Performance

    The author recounts taking part in the Empire State Building Run-Up and then analyzing their performance with Python. The post describes the challenges of obtaining and analyzing race data and highlights the open source tools used to retrieve and analyze it.

  21.
    TigerData (Creators of TimescaleDB) · 2y

    PostgreSQL Data Cleaning vs. Python Data Cleaning

    This post explores how PostgreSQL and TimescaleDB can be used for efficient data cleaning tasks, replacing the need for tools like Excel, R, or Python. By cleaning data directly within the database, tasks can be performed more efficiently, saving time in the long run.

  22.
    Medium · 2y

    Top DevTools to Build AI/ML Applications!

    Discover the top DevTools for building AI/ML applications, from programming languages like Wing to vector data storage with SingleStore, and data manipulation with Pandas and NumPy.

  23.
    AWS Tip · 2y

    Youtube Data Analysis using AWS s3 + Glue crawler

    The post discusses how to analyze YouTube data using AWS S3 and Glue crawler. It explains the steps for creating an S3 bucket, downloading the dataset, refreshing data using Glue crawler, and viewing the data in Athena.

  24.
    GoPenAI · 2y

    Customer Lifetime Value (CLV) Prediction With Machine Learning and DB Querying With LLM

    This post discusses the prediction of Customer Lifetime Value (CLV) for auto insurance clients using machine learning and database querying with LLM. It covers data extraction and cleaning, exploratory data analysis, machine learning model selection and optimization, user interface for CLV prediction, and a Q&A interface for data retrieval. The project aims to improve customer retention, enhance marketing effectiveness, facilitate data-driven decision-making, and provide a user-friendly experience.

  25.
    Medium · 2y

    Starting with Kotlin Notebooks

    Learn how to start using Kotlin Notebooks for interactive development, explore their benefits, and how they can be useful for data analysis.