Best Public Datasets for Public Health Data Science Projects


I will keep updating this page. If you have a suggestion for a dataset to add, send me a message on Twitter @ahobby9.

Dataset Search:

Health Data: This is a search engine for healthcare datasets available through the U.S. federal government.

Kaggle Datasets: Kaggle hosts many machine-learning-friendly datasets, and its search feature lets you find healthcare datasets.

UCI Datasets: UCI hosts popular heart disease and breast cancer machine learning datasets. Searching the repository turns up other health datasets as well.

Google Dataset Search: You can find healthcare datasets here, among many others.

Amazon Dataset Search: Amazon maintains a list of many kinds of datasets, including healthcare.

General Datasets:

WHO Dataset: These are datasets made available by the World Health Organization. This is a great resource for those interested in global health.

Wikipedia: This has a list of datasets, some of which are healthcare related.

Broad Institute: This has genetic data.

1000 Genomes: This has human genetic variation data.

SEER Cancer: This has cancer data for the United States.

Imaging Datasets:

OpenfMRI: This is an MRI dataset.

OASIS Brains Dataset: This is a brain imaging dataset.

NIH DeepLesion: This is a CT imaging dataset from the NIH.

Cancer Imaging: This has cancer imaging data.