
Top 10 Free Data Science Courses from Harvard


1. Principles, Statistical and Computational Tools for Reproducible Science

Start Date — April 17th, 2020

Difficulty level — Intermediate

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • Fundamentals of reproducible science: why reproducible research matters, key definitions and concepts, and the factors that affect reproducibility
  • Key elements required for data provenance and reproducible experimental design
  • Statistical methods for reproducible data analysis
  • Six modules with several case studies that illustrate the significant impact of reproducible research methods on scientific discovery
  • Computational tools for reproducible science using R, RStudio, and Python
  • Computational tools for reproducible data analysis and version control (Git/GitHub, Emacs/RStudio/Spyder), reproducible data (data repositories/Dataverse), reproducible dynamic report generation (R Markdown/R Notebook/Jupyter/Pandoc), and workflows

Taught By —

Curtis Huttenhower, Associate Professor of Computational Biology and Bioinformatics, Harvard University

John Quackenbush, Professor of Computational Biology and Bioinformatics, Harvard University

Lorenzo Trippa, Associate Professor of Biostatistics, Harvard University

Christine Choirat, Research Associate, Harvard University


2. Data Science: Linear Regression

Start Date — Jan 28th, 2020

Difficulty level — Beginner

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • How Galton originally developed linear regression
  • Basics of confounding and how to detect it
  • Basics of R
  • Learn how to examine the relationships between variables by implementing linear regression in R
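
As a rough illustration of what fitting a linear regression involves, here is a minimal sketch of a least-squares line fit. The course itself works in R; plain Python is used here only for illustration. The function name `fit_line` and the height values are made up for this example and are not taken from the course or from Galton's data.

```python
# Illustrative sketch only: the course teaches this in R, not Python.

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up heights in inches, for illustration only (not Galton's data).
father_heights = [65.0, 67.0, 69.0, 71.0, 73.0]
son_heights = [67.0, 68.0, 69.0, 70.0, 71.0]

intercept, slope = fit_line(father_heights, son_heights)
print(f"son = {intercept:.2f} + {slope:.2f} * father")
```

In this toy data the fitted slope is below 1, which is the pattern usually described as regression toward the mean.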

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University


3. Data Science: Machine Learning

Start Date — Jan 28th, 2020

Difficulty level — Beginner

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • Learn the basics of machine learning
  • How to perform cross-validation to avoid overtraining
  • Popular machine-learning algorithms
  • Basics of regularization
  • Learn how to build a recommendation system from scratch
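
To make the cross-validation idea concrete, the following is a minimal sketch of k-fold cross-validation in plain Python (the course uses R). The helper names `k_fold_indices`, `cross_validate`, `fit_mean`, and `predict_mean` are hypothetical, and the "model" is deliberately trivial: it always predicts the mean of the training targets.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Taking every k-th index gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, predict, k=5, seed=0):
    """Return the mean squared error averaged over k held-out folds."""
    folds = k_fold_indices(len(xs), k, seed)
    fold_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(xs)) if i not in held_out_set]
        # Fit on the training part only, then score on the held-out fold.
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        squared = [(predict(model, xs[i]) - ys[i]) ** 2 for i in held_out]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


# A deliberately simple "model": always predict the training mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


xs = list(range(20))
ys = [2.0 * x + 1.0 for x in xs]
print(cross_validate(xs, ys, fit_mean, predict_mean, k=5))
```

Because each fold is scored by a model that never saw it, the averaged error estimates performance on new data rather than on the training set, which is why the technique helps detect overtraining.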

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University


4. Data Science: Visualization

Start Date — Jan 28th, 2020

Difficulty level — Beginner

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • Learn the basics of data visualization principles and how to apply them using ggplot2
  • Communicate data-driven findings, motivate analyses, and detect flaws
  • How to leverage data to reveal valuable insights and advance your career

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University


5. Data Science: Probability

Start Date — Jan 28th, 2020

Difficulty level — Beginner

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • Learn the important concepts in probability theory, including random variables and independence, and how to perform a Monte Carlo simulation
  • The meaning of expected values and standard errors, and how to compute them in R
  • The basics and importance of the Central Limit Theorem
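
The topics above can be tied together with a small simulation. This is a sketch in plain Python rather than the R used in the course, and the function name `simulate_sample_means` and the choice of fair six-sided dice are assumptions made for illustration. It repeats an experiment many times (Monte Carlo), then compares the simulated expected value and standard error of the sample mean with the theoretical values.

```python
import random
import statistics


def simulate_sample_means(n_rolls, n_trials, seed=1):
    """Monte Carlo: repeat n_trials experiments, each recording the mean of n_rolls dice."""
    rng = random.Random(seed)
    return [
        sum(rng.randint(1, 6) for _ in range(n_rolls)) / n_rolls
        for _ in range(n_trials)
    ]


n_rolls = 50
means = simulate_sample_means(n_rolls=n_rolls, n_trials=10_000)

# Theory for one fair die: expected value 3.5, variance 35/12.
theory_mean = 3.5
theory_se = (35 / 12) ** 0.5 / n_rolls ** 0.5  # sigma / sqrt(n)

print("simulated mean:", statistics.fmean(means), "theory:", theory_mean)
print("simulated SE:  ", statistics.pstdev(means), "theory:", theory_se)
```

Plotting a histogram of `means` would show a roughly bell-shaped distribution even though a single die roll is uniform, which is the behavior the Central Limit Theorem describes.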

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University


6. Data Science: Inference and Modeling

Start Date — Jan 28th, 2020

Difficulty level — Beginner

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • The concepts needed to define estimates and margins of error (populations, parameters, estimates, and standard errors), and how to use them to make reasonably accurate predictions and to estimate the precision of a forecast
  • How to use models to aggregate data
  • Basics of Bayesian statistics and predictive modeling

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University


7. Data Science: R Basics

Start Date — Jan 28th, 2020

Difficulty level — Beginner

Duration — 8 weeks long

You’ll learn (source: Course syllabus) —

  • Build a foundation in R and learn how to wrangle, analyze, and visualize data.
  • Foundational R programming concepts such as data types, vector arithmetic, and indexing
  • Operations in R such as sorting, data wrangling with dplyr, and making plots

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University


8. Introduction to Linear Models and Matrix Algebra

Start Date — April 17th, 2020

Difficulty level — Intermediate

Duration — 4 weeks long

You’ll learn (source: Course syllabus) —

  • Basics of matrix algebra including notations and operations
  • Learn the application of matrix algebra to data analysis
  • How to build and work with linear models
  • Learn about QR decomposition
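
To show how QR decomposition connects to linear models, here is a minimal sketch that fits a least-squares line by factoring the design matrix. It is written in plain Python for illustration (the course uses R), uses modified Gram-Schmidt as one of several possible ways to compute a QR factorization, and the function names `qr_gram_schmidt` and `solve_least_squares` are hypothetical.

```python
def qr_gram_schmidt(columns):
    """QR-decompose a matrix given as a list of linearly independent columns.

    Returns (q_columns, r): q_columns are orthonormal and r is upper triangular,
    so that original column j equals sum over i of r[i][j] * q_columns[i].
    """
    n = len(columns)
    q_columns = []
    r = [[0.0] * n for _ in range(n)]
    for j, col in enumerate(columns):
        v = list(col)
        for i, q in enumerate(q_columns):
            # Modified Gram-Schmidt: project the running remainder, not the original column.
            r[i][j] = sum(qk * vk for qk, vk in zip(q, v))
            v = [vk - r[i][j] * qk for vk, qk in zip(v, q)]
        norm = sum(vk * vk for vk in v) ** 0.5
        if norm == 0:
            raise ValueError("columns must be linearly independent")
        r[j][j] = norm
        q_columns.append([vk / norm for vk in v])
    return q_columns, r


def solve_least_squares(columns, y):
    """Minimize ||X b - y|| via QR: solve R b = Q^T y by back-substitution."""
    q_columns, r = qr_gram_schmidt(columns)
    qty = [sum(qk * yk for qk, yk in zip(q, y)) for q in q_columns]
    n = len(columns)
    b = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(r[i][j] * b[j] for j in range(i + 1, n))
        b[i] = (qty[i] - tail) / r[i][i]
    return b


# Design matrix for y = b0 + b1 * x, stored column-wise: an intercept column and x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
coefficients = solve_least_squares([[1.0] * len(x), x], y)
print(coefficients)  # close to [1.0, 2.0], up to floating-point rounding
```

Solving through R and Q^T y avoids forming and inverting X^T X directly, which is generally the more numerically stable route.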

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University

Michael Love, Assistant Professor, Departments of Biostatistics and Genetics, UNC Gillings School of Global Public Health


9. Statistics and R

Start Date — April 17th, 2020

Difficulty level — Intermediate

Duration — 4 weeks long

You’ll learn (source: Course syllabus) —

  • Learn by examples that will help you make the connection between concepts and implementation
  • Learn in depth about random variables, distributions, inference (p-values and confidence intervals), and non-parametric statistics
  • Learn how to do exploratory data analysis using R
  • Learn how to use R scripts to analyze data and the basics of reproducible research

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University

Michael Love, Assistant Professor, Departments of Biostatistics and Genetics, UNC Gillings School of Global Public Health


10. High-Dimensional Data Analysis

Start Date — April 17th, 2020

Difficulty level — Intermediate

Duration — 4 weeks long

You’ll learn (source: Course syllabus) —

  • Learn the mathematical definition of distance, the use of the singular value decomposition (SVD) for dimension reduction of high-dimensional data sets, and multi-dimensional scaling and its connection to principal component analysis
  • Learn the basics of machine learning
  • Learn the basics of factor analysis and how to deal with batch effects
  • Learn how to implement clustering and heatmaps

Taught By —

Rafael Irizarry, Professor of Biostatistics, Harvard University

Michael Love, Assistant Professor, Departments of Biostatistics and Genetics, UNC Gillings School of Global Public Health
