All assignments for the class will be listed here.
- There will be approximately 4-5 homework assignments, each with a number of programming problems.
- There may be additional sets of practice problems assigned to help reinforce various topics.
- There will be a midterm “tutorial” assignment, in which you will write a short tutorial on a data science subject.
- There will be a final project, done in groups, on a data science problem of your choosing.
All assignments will be due at 11:59 pm ET (just before midnight) on the due date.
You are expected to know and adhere to the course policies, which govern late days, submissions, and collaboration.
Assignment dates
Due dates are tentative for any assignments that haven’t been released yet.
Assignment | Due date | Files (zipped tarball) | Colab version |
---|---|---|---|
Homework 1 | Feb 8 | hw1_get_started.tar.gz, hw1_scraper.tar.gz, hw1_xml_parser.tar.gz | hw1_get_started, hw1_scraper, hw1_xml_parser |
Homework 2 | Feb 28 | hw2_relational_data.tar.gz, hw2_time_series.tar.gz, hw2_graph_library.tar.gz | hw2_relational_data, hw2_time_series, hw2_graph_library |
Homework 3 | Mar 28 | hw3_linear.tar.gz, hw3_text.tar.gz, hw3_nlp.tar.gz | hw3_linear, hw3_text, hw3_nlp |
Tutorial | Mar 16 (Proposal), Apr 6 (Submission), Apr 13 (Peer evaluation) | | |
Homework 4 | Apr 25 | hw4_bayes.tar.gz, hw4_unsupervised.tar.gz, hw4_cf.tar.gz | hw4_bayes, hw4_unsupervised, hw4_cf |
Project | Apr 15 (Proposal), May 4 (Video), May 9 (Report) | | |
Homework
Homeworks are distributed as Jupyter notebooks (we will also link Colab notebooks shortly) and are submitted for grading using code in the notebook itself (we will post a description of this process along with the first homework). To submit the assignments, sign up for an account (with your Andrew email) on the autograding site https://mugrade.datasciencecourse.org
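The homework files listed in the table are gzipped tarballs, so they need to be unpacked before the notebooks can be opened. The following is a minimal sketch of one way to do that with Python's standard `tarfile` module. The helper name `extract_homework` and the destination directory are illustrative assumptions, not part of the course tooling, and the sketch assumes the archive has already been downloaded.

```python
import tarfile
from pathlib import Path


def extract_homework(archive_path, dest_dir="."):
    """Unpack a .tar.gz archive into dest_dir and return its member names.

    Hypothetical helper for illustration; not provided by the course.
    """
    dest = Path(dest_dir)
    # Create the destination directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()   # paths stored inside the archive
        tar.extractall(dest)     # write all members under dest
    return names
```

For example, `extract_homework("hw1_get_started.tar.gz", "hw1")` would unpack that archive into a local `hw1` directory, assuming the file sits in the current working directory. Running `tar -xzf` from a shell does the same job.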
Tutorial
In lieu of a midterm exam, students will write a tutorial on a data science topic of their choosing. More information may be found here.
No late days are permitted on the tutorial, and failure to submit by the deadline will result in zero points for the proposal component.
Final project
The final project of the course will consist of a large data science project done in teams of 2-3 people (single-person or four-person teams will be considered on an individual basis). The final report for this project will be a Jupyter notebook detailing the data collection, analysis, and results. In addition to the report, each team will prepare a short video to be shown during a final project video session.