STAT 188: Variations, Information and Privacy

Course Description

This course delves into the intriguing realms of variations, information, and privacy, with a keen focus on both their qualitative conceptualizations, such as contextual integrity, and their quantitative specifications, exemplified by differential privacy. Our primary goal is to examine these concepts through a foundational statistical lens, and study statistics from the dual perspectives of creating and limiting information from data. At the heart of our exploration is the concept of variations, serving as a unifying theme that intricately links information (revelatory variations) with uncertainty (obfuscatory variations). This nuanced approach enables us to recognize that the principles governing how we restrict the flow of information mirror those involved in generating information (the traditional focus of statistics).

A considerable portion of the course will focus on an in-depth study of differential privacy (DP). First, we will dissect its mathematical framework through theory and examples, identify five key elements that define a general DP specification, and understand what it guarantees -- and what it does not. Second, we will delve into the intricacies of implementing DP through the case of the 2020 U.S. Census, and the social and legal perspectives on privacy that it raises. Third, we will learn how to apply missing data methodologies to properly analyze differentially privatized data. Throughout the course, we will confront the challenge of meaningfully defining and quantifying individual privacy and information.

By the conclusion of this course, students are expected to have developed a deeper appreciation for the complex interplay among variations, information, and privacy. They will be equipped with foundational analytical tools and statistical insights, empowering them to navigate the theoretical and practical challenges associated with revealing and concealing information in data for statistical inference and learning.

Course Format

The course meets weekly on Wednesdays from 3:00 to 5:45, with a 10-minute break in the middle. It is lecture-style, but questions and discussion are encouraged.

Typical enrollees

This course is intended as an advanced-level undergraduate course for students in statistics, computer science, computational sociology, digital humanities, science and technology studies, and similar data-adjacent fields. Graduate students are also welcome. There is a cap on enrollment to ensure effective classroom discussion. Enrolled students are expected to have foundational and theoretical proficiency at the level of STAT 110 and STAT 111. Some interest in philosophical and legal thinking is desirable. Students who wish to enroll but do not have the full STAT prerequisites should submit (via email) an explanation of at most 400 words of why they are a good fit for the course, or how they will contribute to it without the STAT prerequisites. A justified motivation to study statistical foundations, privacy, or interdisciplinary concepts more broadly can compensate for lacking the full prerequisites (e.g., having taken only one of STAT 110 or STAT 111).

When is the course typically offered?

Fall

What can students expect from you as an instructor?

We encourage active classroom discussion and participation, deep thinking and interdisciplinary connections. The course schedule is semi-flexible. If an interesting idea or connection arises, we will take time to explore it!

Assignments and grading:

There are three components to your grade: 

1. Monthly homework assignments (3 total), worth at most 50% of the grade

2. Active classroom participation and discussion, worth at least 10% of the grade

3. A midterm and a final, in the form of projects/essays and presentations, worth at most 50% of the grade

Enrollment cap, selection process, notification:

20 students. Please submit a petition on my.harvard.edu noting that you have taken the prerequisites or, if you have not, providing the statement (400 words maximum) described above.

Absence and late work policies:

Participation is an important part of both the course and the grade, so regular attendance is expected. Students should email the course TA if they will be absent.

Late work: students have 24 cumulative hours of late time across all written assignments. Additional lateness requires approval and documentation (e.g., medical absence).
