BST 272: Computing Environments for Biology

Winter 2020, Jan. 7, 9, 14, 16
Time: TR 1:00-4:00
Location: Kresge 201 / LL6 (Jan. 14 only)

 

Instructor Information

Faculty

Curtis Huttenhower, Professor, Biostatistics and Immunology and Infectious Diseases
SPH1 413, chuttenh@hsph.harvard.edu, 617-432-4912
Office Hours: SPH1 413, 9th 11:30-12:30, 14th 4:00-5:00

Teaching Assistants

Sagun Maharjan, Software Developer, Biostatistics
SPH1 412, smaharjan@hsph.harvard.edu, 617-432-5065
Office hours: SPH1-2 atrium, 13th and 21st 1:00-2:00

Sharifa Sahai, Ph.D. Student, Systems Biology
sharifasahai@g.harvard.edu

 

Credits

1.25 credits

 

Course Description

This course provides a high-level introduction to general computing environments appropriate for biological data analysis, as preparation for more advanced computational biology and bioinformatics courses. It is intended for biologists, clinician-researchers, other bench or translational scientists, or mathematicians with little to no computational or applied quantitative experience. It provides a compressed, highly interactive, hands-on introduction to basic command line, Python, and R environments for biological data analysis and visualization. It covers basic quantitative methods that can be carried out for 'omics data analysis in these environments and ensures that students have access to local and online (i.e. grid, cloud) resources for using these tools in the future. Finally, it thoroughly introduces freely available documentation and strategies for self-learning when using computational methods for biology research.

 

Pre-Requisites

None

 

Learning Objectives

Upon successful completion of this course, you should be able to:

  • Recognize and carry out the essentials of navigating command line environments for biological data handling.
  • Demonstrate basic familiarity with the capabilities of general computing environments (Python, R) for quantitative biology and high-throughput molecular data analysis and visualization.
  • Work comfortably with terminology and concepts for basic quantitative data structures: vectors, lists, matrices, sets, permutations, probabilities and probability distributions, conditional probabilities, hypothesis tests, modeling, and prediction.
  • Discover and understand documentation and resources available for further self-training as needed.

 

Course Structure

This is a small, highly hands-on course in which discussion is encouraged during lectures, and individual and small group participation is necessary during laboratory / tutorial sessions. Each session is organized into roughly 1/3 introduction of material via lecture and discussion, and 2/3 application of the material using interactive computing problem sets (individually and in small groups). Particularly due to the highly compact nature of the course schedule, students are expected to attend all sessions, be active participants in discussions and tutorials, and ask questions of the instructional staff and of each other to build comfort with and understanding of quantitative computing environments for biology.

 

All course materials will be provided on the Canvas site, which will also be used to communicate scheduling and announcements and to submit assignments. Students should bring laptops to class during all sessions if possible, and required software will be set up interactively during the relevant lab sessions.

 

Grading, Progress and Assessment

The final grade for this course will be based on:

  • In-class laboratory submissions (4x, 10% each, 40% total)
  • Homework problem sets (2x, 25% each, 50% total)
  • Participation, based on attendance and engagement in discussion and tutorial activities (10%)

 

In-class laboratory activities are generally carried out collaboratively in groups, with worksheets submitted individually at the end of each session. Homework problem sets are to be completed individually and submitted on time. The maximum score for late work will fall steeply with each day late: 90% if one day late, 75% if two, 40% if three, and all credit lost if four or more days late. Extensions may be granted with reason if requested at least 24 hours in advance of the assignment deadline. Final letter grades will be curved based on the percentiles of total scores received by students in the class.
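As an illustration only, the grading arithmetic described here can be sketched in Python, one of the environments covered in the course. This is a minimal sketch, not an official grade calculator: the function names are hypothetical, scores are assumed to be fractions between 0 and 1, and "maximum score" is assumed to mean a cap on the score rather than a multiplier. How partial days are counted is not specified here, so the sketch takes a whole number of days late. The result is the total score before any curving.

```python
# Caps on the maximum score for late work, as fractions of full credit,
# keyed by whole days late (values taken from the late policy above).
LATE_CAPS = {0: 1.00, 1: 0.90, 2: 0.75, 3: 0.40}


def late_cap(days_late: int) -> float:
    """Return the maximum fraction of credit available for a submission."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # Four or more days late earns no credit.
    return LATE_CAPS.get(days_late, 0.0)


def capped_score(raw_score: float, days_late: int) -> float:
    """Apply the late cap to a raw score in [0, 1].

    Assumption: the policy caps the score; it does not scale it.
    """
    return min(raw_score, late_cap(days_late))


def total_score(labs, homeworks, participation: float) -> float:
    """Combine component scores (each in [0, 1]) using the stated weights:
    4 labs at 10% each, 2 problem sets at 25% each, participation at 10%.
    """
    if len(labs) != 4 or len(homeworks) != 2:
        raise ValueError("expected 4 lab scores and 2 homework scores")
    return 0.10 * sum(labs) + 0.25 * sum(homeworks) + 0.10 * participation


if __name__ == "__main__":
    # Hypothetical student: one problem set submitted two days late.
    homeworks = [capped_score(0.95, 0), capped_score(0.95, 2)]
    print(total_score([1.0, 0.9, 1.0, 0.8], homeworks, 0.9))
```

Under the cap interpretation, a late submission that would have scored below the cap is unaffected, while one that would have scored above it is reduced to the cap.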

 

In-class laboratory submissions (4x, 10% each, 40% total)

Laboratory tutorial activities include Jupyter notebooks, knitr documents, and brief written question-and-answer exercises or descriptions of online activities. The activities themselves are carried out during class, typically in small groups, with the written results submitted individually at the end of each session.

 

Homework problem sets (2x, 25% each, 50% total)

Problem sets will be assigned during the course to be completed individually outside of class. These are a combination of written questions with hands-on analysis activities. Late submissions without prior notice are penalized as described above.

 

Participation (10%)

In a small, intensive, highly collaborative course, participation and discussion are especially important. Class participation will be included in the final grade based on a combination of active participation during lectures (questions posed by students, and answers provided by students to those posed during lecture), attendance, and engagement with small groups and instructional staff during laboratory tutorials. Students should arrive at class prepared to ask and answer questions, share their viewpoints in constructive and respectful ways, and otherwise actively engage with other students and the course instructor. Notification of class absences should be provided by email to the instructional team at least 24 hours in advance. Otherwise, class participation will be graded as follows:

  • 10%: Always contributes to discussions by asking thoughtful questions; relates diverse topics from lecture sessions; is active and inquisitive during laboratory tutorials; discusses challenges with small group and instructors; attends all sessions.
  • 5-9%: Sometimes contributes to discussions as above; interacts during only a subset of topic areas; completes laboratory activities but is not engaged with collaborators; more than one lecture or lab absence without notice.
  • 0-4%: Rarely contributes to discussions or activities as above; more than one lecture absence without notice.

This course content is offered under a CC Attribution license. Content in this course can be considered under this license unless otherwise noted.