Onnela Lab
Our research focuses on network science and digital phenotyping.
Department of Biostatistics
655 Huntington Avenue
Building II, 4th Floor
Boston, MA 02115
Using Python for Research Videos
These are the 90 videos for our HarvardX course Using Python for Research. If you enjoy these videos and want to learn more, consider taking the course itself, which contains many comprehension checks and coding exercises. For version 2 of the course, we updated the comprehension checks and some of the exercises, and we added a new module on statistical learning (Week 5). Version 2 of the course was launched in January 2018, and the new Week 5 videos released then are also available below. The most current version of the HarvardX course is version 4, which was launched in September 2019. The latest version of the course includes a final project that examines physical activity recognition from smartphone accelerometer data.
Week 0:
0.0: Course Trailer
0.1: Why Program? Why Python?
Week 1:
1.1.1: Python Basics
1.1.2: Objects
1.1.3: Modules and Methods
1.1.4: Numbers and Basic Calculations
1.1.5: Random Choice
1.1.6: Expressions and Booleans
1.2.1: Sequences
1.2.2: Lists
1.2.3: Tuples
1.2.4: Ranges
1.2.5: Strings
1.2.6: Sets
1.2.7: Dictionaries
1.3.1: Dynamic Typing
1.3.2: Copies
1.3.3: Statements
1.3.4: For and While Loops
1.3.5: List Comprehensions
1.3.6: Reading and Writing Files
1.3.7: Introduction to Functions
1.3.8: Writing Simple Functions
1.3.9: Common Mistakes and Errors
Week 2:
2.1.1: Scope Rules
2.1.2: Classes and Object-Oriented Programming
2.2.1: Introduction to NumPy Arrays
2.2.2: Slicing NumPy Arrays
2.2.3: Indexing NumPy Arrays
2.2.4: Building and Examining NumPy Arrays
2.3.1: Introduction to Matplotlib and Pyplot
2.3.2: Customizing Your Plots
2.3.3: Plotting Using Logarithmic Axes
2.3.4: Generating Histograms
2.4.1: Simulating Randomness
2.4.2: Examples Involving Randomness
2.4.3: Using the NumPy Random Module
2.4.4: Measuring Time
2.4.5: Random Walks
Week 3:
3.1.1: Introduction to DNA Translation
3.1.2: Downloading DNA Data
3.1.3: Importing DNA Data Into Python
3.1.4: Translating the DNA Sequence
3.1.5: Comparing Your Translation
3.2.1: Introduction to Language Processing
3.2.2: Counting Words
3.2.3: Reading in a Book
3.2.4: Computing Word Frequency Statistics
3.2.5: Reading Multiple Files
3.2.6: Plotting Book Statistics
3.3.1: Introduction to kNN Classification
3.3.2: Finding the Distance Between Two Points
3.3.3: Majority Vote
3.3.4: Finding Nearest Neighbors
3.3.5: Generating Synthetic Data
3.3.6: Making a Prediction Grid
3.3.7: Plotting the Prediction Grid
3.3.8: Applying the kNN Method
Week 4:
4.1.1: Getting Started With Pandas
4.1.2: Loading and Inspecting Data
4.1.3: Exploring Correlations
4.1.4: Clustering Whiskies by Flavor Profile
4.1.5: Comparing Correlation Matrices
4.2.1: Introduction to GPS Tracking of Birds
4.2.2: Simple Data Visualizations
4.2.3: Examining Flight Speed
4.2.4: Using Datetime
4.2.5: Calculating Daily Mean Speed
4.2.6: Using the Cartopy Library
4.3.1: Introduction to Network Analysis
4.3.2: Basics of NetworkX
4.3.3: Graph Visualization
4.3.4: Random Graphs
4.3.5: Plotting the Degree Distribution
4.3.6: Descriptive Statistics of Empirical Social Networks
4.3.7: Finding the Largest Connected Component
Week 5:
5.1.1: Introduction to Statistical Learning
5.1.2: Generating Example Regression Data
5.1.3: Simple Linear Regression
5.1.4: Least Squares Estimation in Code
5.1.5: Simple Linear Regression in Code
5.1.6: Multiple Linear Regression
5.1.7: Scikit-learn for Linear Regression
5.1.8: Assessing Model Accuracy
5.2.1: Generating Example Classification Data
5.2.2: Logistic Regression
5.2.3: Logistic Regression in Code
5.2.4: Computing Predictive Probability Across the Grid
5.3.1: Tree-Based Methods for Regression and Classification
5.3.2: Random Forest Predictions