Course Syllabus
CS S109A: Introduction to Data Science
Syllabus - Summer 2018
Pavlos Protopapas and Kevin Rader

Lectures: Northwest Science Building B108. Mondays & Wednesdays 12:00 PM - 03:00 PM.
Labs: Northwest Science Building B108. Fridays 12:00 PM - 03:00 PM.

Welcome to S109A, Introduction to Data Science. This course is the first half of a one-year introduction to data science. The course focuses on the analysis of messy, real-life data to make predictions using statistical and machine learning methods. The material of the course is divided into 3 modules. Each module will integrate the five key facets of an investigation using data:
Students who have previously taken CS 109, AC 209, or Stat 121 cannot take CS S109A for credit.

Course Logistics

Prerequisites
You are expected to have programming experience at the level of CS 50 or above, and statistics knowledge at the level of Stat 100 or above (Stat 110 recommended). HW0 is designed to test your knowledge of the prerequisites. Successful completion of this assignment will show that this course is suitable for you. HW0 will not be graded, but you are required to submit it.

Course Components

Lectures
The class consists of two weekly lectures and one lab, which is designed as a class activity. Lectures are held Mon and Wed 12-3pm in Northwest Science Building B108. A live-stream feed and a recorded version will also be available (recordings will be available within 24 hours). There will be a quiz after each lecture is released online, to assess and challenge your understanding of the material and to help us identify gaps.

Labs
Attendance at labs is optional but strongly encouraged. Labs are designed as hands-on in-class activities. The instructor will go over practice problems similar to the homework problems and review difficult material. Labs will be held on Fri 12-3pm in Northwest Science Building B108.

Office Hours
On-campus OH will be in the lobby of the IACS in Maxwell Dworkin, 33 Oxford Street, unless otherwise stated below. Online OH will be via Zoom at: https://harvard-dce.zoom.us/j/7607382317
Pavlos: Mondays 4:30-6:00pm MD G-109 [on-campus and online].
Kevin: Mondays 3-4:30pm (after class) [on-campus and online].
Patrick: Mondays 6-7pm [on-campus and online].
Brandon: Mondays 8-9pm [online].
Richard: Tuesdays 5-6pm [on-campus and online].
Sol: Tuesdays 6-7pm [online].
Nick: Thursdays 9-10am [on-campus and online].
Evan: Thursdays 10-11am [online].
David: Fridays 3-4pm MD G-111 (after lab) [on-campus and online].
Joe: Sundays 10-11am [on-campus and online].
Will: Sundays noon-1pm [on-campus and online].

Assignments
There will be an initial self-assessment homework called HW0 and 6 more graded weekly homework assignments. You will be working in Jupyter Notebooks, which you can run on your own computer. HW0 will be published on June 15.

Quizzes
Quizzes will be taken at the end of class and the material will be based on what was discussed in lecture. 40% of the quizzes will be dropped from your grade.

Final Project
There will be a final group project (2-4 students) due Thurs, Aug 9. See the Project Guidelines for more details.

Recording
Lectures and labs will be live-streamed, and will be recorded and made available 24 hours later via Canvas.
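
As a rough illustration of the quiz-drop rule stated under Quizzes, the sketch below averages a list of quiz scores after dropping 40% of them. The syllabus does not say which quizzes are dropped or how the count is rounded, so this assumes the lowest scores are dropped and the count is rounded down; the function name `quiz_average` is hypothetical.

```python
DROP_PERCENT = 40  # from the syllabus: 40% of the quizzes are dropped


def quiz_average(scores, drop_percent=DROP_PERCENT):
    """Average the quiz scores after dropping some of them.

    Assumptions (not stated in the syllabus): the LOWEST scores are the
    ones dropped, and the number dropped is rounded down.
    """
    if not scores:
        raise ValueError("no quiz scores")
    # Integer arithmetic avoids floating-point surprises in the count.
    n_drop = len(scores) * drop_percent // 100
    kept = sorted(scores)[n_drop:]
    return sum(kept) / len(kept)


# Five quizzes: the two lowest are dropped under the assumptions above.
print(quiz_average([0, 0, 100, 100, 100]))
```

With fewer than three quizzes, nothing is dropped under this rounding choice; the course staff may round differently.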

Recommended Textbook
An Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani. There will be assigned readings from the text leading up to each lecture. The book is available here:
Free electronic version: http://www-bcf.usc.edu/~gareth/ISL/
HOLLIS: http://link.springer.com.ezp-prod1.hul.harvard.edu/book/10.1007%2F978-1-4614-7138-7

Course Policies

Getting Help
For questions about homework, course content, package installation, or JupyterHub, and after you have tried to troubleshoot yourself, the process to get help is:

Questions on Graded Homework and Regrading Policy
We take great care to make sure all homework is graded properly. However, if you feel that your assignment was not fairly graded, you may:

Late Day Policy
You are allowed up to 3 days of late homework submissions, with a maximum of 1 day on any single assignment, no questions asked. No homework will be accepted more than 24 hours past the due date. Solutions will be posted one day after the due date. If you exceed your 3 late days, 1 point (20%) will be deducted for each late day after that. Late minutes count as a whole day, e.g. if you submit 30 minutes late, this counts as 1 day.

Communication from Staff to Students
Class announcements and official communication from staff will be through Canvas. All homework and quizzes will be posted and submitted in Canvas. MAKE SURE your settings allow you to receive emails from Canvas. No official communication or announcements will be made via Piazza.

Submitting an assignment
You are to complete all homework in a Jupyter Notebook. When you are done, convert your notebook to a PDF and submit both the .ipynb file and the .pdf file. You can submit multiple times up to the deadline. You are encouraged but not required to submit in pairs. We will be using the Groups function in Canvas to do this; details will be announced later. One assignment will be completed individually, without any collaboration with peers. All assignments will be due on Tuesdays at 11:59pm in Canvas and will be posted one week in advance.

Collaboration Policy
We encourage you to talk about and discuss the assignments with your fellow students (and on Piazza), but you are not allowed to look at any other student's assignment or code outside of your pair. Discussion is encouraged; copying is not allowed.

Grading Guidelines
Homework will be graded based on 1) how correct your code is (the Notebook cells should run; we are not troubleshooting code), 2) how you have interpreted the results (we want text, not just code; it should be a report), and 3) how well you present the results. The scale is 1-5.
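
The late day arithmetic described under Late Day Policy can be sketched in a few lines of Python. This is one reading of the policy, not the staff's grading code: the function names and the example due date are hypothetical, and it assumes the 1-point deduction applies to each late submission made after the 3 free late days are used up.

```python
from datetime import datetime, timedelta

FREE_LATE_DAYS = 3                  # total free late days, per the syllabus
MAX_LATENESS = timedelta(hours=24)  # nothing is accepted later than this
PENALTY_PER_EXTRA_DAY = 1           # points on the 1-5 homework scale


def late_days_used(due, submitted):
    """Return 0 if on time, 1 if late, or None if the submission is not accepted."""
    if submitted <= due:
        return 0
    if submitted - due > MAX_LATENESS:
        return None  # more than 24 hours late: not accepted
    return 1  # any lateness, even a few minutes, counts as a whole day


def late_penalties(late_days_per_assignment):
    """Return the point deduction for each assignment, in submission order."""
    used = 0
    penalties = []
    for days in late_days_per_assignment:
        used += days
        # Assumption: only late days beyond the free budget are penalized.
        over_budget = days > 0 and used > FREE_LATE_DAYS
        penalties.append(PENALTY_PER_EXTRA_DAY if over_budget else 0)
    return penalties


# Hypothetical due date: a Tuesday at 11:59pm.
due = datetime(2018, 6, 26, 23, 59)
print(late_days_used(due, due + timedelta(minutes=30)))  # 30 minutes late
print(late_penalties([1, 0, 1, 1, 1, 0]))                # fourth late day is penalized
```

Returning `None` for a submission more than 24 hours late keeps "not accepted" distinct from "late but accepted".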
For more details, check out The CS109A Grade

Software
We will be using Jupyter Notebooks, Python 3, and various Python modules. You can access the notebook viewer either on your own machine by installing the Anaconda platform, which includes Jupyter/IPython as well as all packages that will be required for the course, or by using the SEAS Jupyter Hub from Canvas. Details will be given in class.

Grading Score
Your final score for the course will be computed using the following weights:
Paired Homeworks: 40%
Individual Homework: 20%
Quizzes: 15%
Project: 25%
Total: 100%
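
To make the weighting concrete, here is a minimal sketch of the final score as a weighted sum. The weights come from the list above; the dictionary keys, the 0-100 scale for each component, and the example scores are assumptions made for illustration only.

```python
# Weights as stated in the syllabus, expressed as fractions of the final score.
WEIGHTS = {
    "paired_homeworks": 0.40,
    "individual_homework": 0.20,
    "quizzes": 0.15,
    "project": 0.25,
}


def final_score(component_scores):
    """Return the weighted final score on a 0-100 scale.

    component_scores maps each key in WEIGHTS to a percentage (0-100).
    The key names and the 0-100 scale are assumptions, not course policy.
    """
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)


# Hypothetical student: 0.40*90 + 0.20*80 + 0.15*70 + 0.25*85
example = {
    "paired_homeworks": 90,
    "individual_homework": 80,
    "quizzes": 70,
    "project": 85,
}
print(round(final_score(example), 2))
```

How each component percentage is derived (for example, converting 1-5 homework grades to a percentage) is not specified here, so the sketch takes percentages as input.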