Finding good data for data science projects can be challenging. This post discusses what makes data 'good,' including relevance, consistency, and timeliness. It contrasts structured and unstructured data, and explains common data formats like CSV and databases. The post also lists resources to find datasets, such as the UCI Machine Learning Repository, Kaggle, and Hugging Face. It highlights the importance of starting with structured data and provides guidance on the next steps after choosing a dataset.
Table of contents
What is “good data” for data science projects?
Do you want structured or unstructured data?
What are standard data formats?
Where can I find datasets for my data science projects?
What are the next steps?