Term: Aug - Dec ’24

Time: Tuesdays & Thursdays (11:30 - 13:00)

Venue: CDS 419

Credits: 3:0

Course Details

Outline: We interact with AI technology on a daily basis—such systems answer the questions we ask (using Google or other search engines), curate the content we read, unlock our phones, allow entry to airports, etc. Further, with recent advances in large language and vision models, the impact of such technology on our lives is only expected to grow. This course introduces students to the ethical implications associated with the design, development and deployment of AI technology spanning NLP, vision and speech applications.

This is a seminar-style course, wherein each class would be a discussion based on the readings assigned for that particular day. Each class would begin with a short quiz, which would be straightforward if you have read the required section of the reading material. The in-class discussion among students would be facilitated by the instructor, who would bring in discussion points based on the reading material.

Prerequisites: The class is intended for graduate students and senior undergraduates. Students should have completed at least a basic machine learning course (at IISc), and any one IISc course related to the discussed applications (computer vision, speech or NLP).

Content: This seminar course would facilitate discussions among students, structured around pre-selected readings on topics related to ethics in AI. Specifically, we plan to read about and discuss the following modules:

  • M1. Overview of ethical theories
  • M2. Data collection and curation
  • M3. Biases and algorithmic fairness; debiasing and mitigating harms
  • M4. Privacy
  • M5. Content Moderation
  • M6. Misinformation, disinformation and hate-speech
  • M7. Algorithm audits and transparency
  • M8. Environmental impact of model training and inference
  • M9. Future of work; economic impact of AI

Many of the modules are intentionally connected to each other. We hope to spend two to three classes discussing each module.

Schedule:

A tentative schedule for the course is below.

Date | Topic | Reading Material
Aug 6 | Course Overview |
Aug 8 | (M1) Major Ethical Theories | Required: (a) An overview of ethical theories
Aug 20 | (M1) Values in NLP/ML research | Required: (a) The Social Impact of NLP; and (b) The Values Encoded in ML Research (pages 1-15)
Aug 22 | (M2) Data Cascades | Required: “Everyone wants to do the model work, not the data work”
Aug 29 | (M2) Data documentation | Required: Datasheets for Datasets (pages 1-10). Recommended: Data Statements for NLP
Sep 3 | (M2) Data collection | Required: IndicVoices: Towards Building an Inclusive Multilingual Speech Dataset for Indian Languages. Recommended: IndicVoices Blog; Ethical Data Pledge; OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic
Sep 5 | (M3) Algorithmic Bias | Required: Moving beyond “algorithmic bias is a data problem”. Recommended: Facial Recognition Is Accurate, if You’re a White Guy; The Woman Worked as a Babysitter: On Biases in Language Generation
Sep 10 | (M3) Algorithmic Bias | Required: Language (Technology) is Power: A Critical Survey of “Bias” in NLP (pages 1-9). Recommended: Gender Bias in Coreference Resolution
Sep 12 | (M3) Debiasing and mitigation | Required: Mitigating Gender Bias in NLP. Recommended: Getting Gender Right in Neural Machine Translation
Sep 17 | (M4) Privacy | Required: What Does it Mean for a Language Model to Preserve Privacy? Recommended: Differential Privacy: A Primer
Sep 19 | Project Discussions |
Oct 1 | (M4) Privacy | Required: (a) Extracting Training Data from Large Language Models; and (b) What does GPT-3 “know” about me?
Oct 3 | (M5) Content Moderation | Required: When Curation Becomes Creation: Algorithms, Microcontent, and the Vanishing Distinction between Platforms and Creators
Oct 8 | (M5) Content Moderation | Required: Content moderation, AI, and the question of scale. Recommended: Do Not Recommend? Reduction as a Form of Content Moderation
Oct 15 | (M5) Content Moderation | Required: Decolonizing Content Moderation. Recommended: AI Content Moderation, Racism and (de)Coloniality
Oct 17 | Peer feedback on project proposal |
Oct 22 | (M6) Misinformation, disinformation | Required: (a) Images and misinformation in political groups: Evidence from WhatsApp in India; and (b) Tiplines to Combat Misinformation on WhatsApp. Recommended: Can WhatsApp Benefit from Debunked Fact-Checked Stories to Reduce Misinformation?
Oct 24 | (M6) Misinformation, disinformation | Required: A survey on automated fact-checking. Recommended: How to search for fact-checked information
Oct 29 | (M7) Algorithm Audits | Required: Auditing Algorithms: Understanding Algorithmic Systems from the Outside, Chapters 2 and 3 (Chapters 1 and 4 also recommended)
Oct 31 | No class (Diwali break) |
Nov 5 | (M7) Algorithm Audits | Required: Unequal Representation and Gender Stereotypes in Image Search Results for Occupations. Recommended: An Image of Society: Gender and Racial Representation and Impact in Image Search Results for Occupations; Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale
Nov 8 | (M8) Environmental Impact | Required: (a) the environmental section of the Foundation Models paper (pages 140-145); and (b) Power Hungry Processing: Watts Driving the Cost of AI Deployment? Recommended: Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model; Systematic Reporting of the Energy and Carbon Footprints of Machine Learning; Green AI: 1, 2
Nov 14 | (M9) Future of work | Required: Thousands of AI Authors on the Future of AI

Course Evaluation

The evaluation comprises three components:

  • Class projects [35%]: we will shortly share more details about the course projects
  • Readings & discussion [35%]: a small group would be responsible for scribing one module (20%); the in-class quizzes account for 10% of the grade (we would drop the two lowest scores); and the remaining 5% is reserved for in-class participation.
  • Final exam [30%].
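For concreteness, here is a minimal sketch (ours, purely illustrative, not official grading code) of how the weights above combine, assuming all component scores are on a 0-100 scale and that "drop the two lowest" applies before averaging the quizzes:

```python
# Illustrative grade computation for the 35/35/30 split described above.
# Function names and the 0-100 scale are our assumptions, not course policy.

def readings_component(scribe, quizzes, participation):
    """Scribe is 20%, quizzes 10% (two lowest dropped), participation 5%."""
    kept = sorted(quizzes)[2:] if len(quizzes) > 2 else quizzes
    quiz_avg = sum(kept) / len(kept)
    return 0.20 * scribe + 0.10 * quiz_avg + 0.05 * participation

def final_grade(project, scribe, quizzes, participation, exam):
    """Projects 35%, readings & discussion 35%, final exam 30%."""
    return (0.35 * project
            + readings_component(scribe, quizzes, participation)
            + 0.30 * exam)

# Example: one missed quiz (the 0) is dropped along with the next lowest.
total = final_grade(project=90, scribe=85,
                    quizzes=[0, 70, 80, 90, 100],
                    participation=100, exam=80)
print(total)
```

Note how a single missed quiz (a zero) does not hurt the total, since the two lowest quiz scores are discarded.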

Scribes

Each course attendee would be responsible for scribing one module of the course. A high-quality scribe (corresponding to an A or A- score) would comprehensively discuss the class interactions in relation to the required and recommended reading material. Less comprehensive and less thoughtful scribes would warrant lower scores (for instance, summarizing just the class discussions or just the reading material, but not both).

Class Projects

The course project constitutes 35% of the overall score and gives students (in teams of no more than two) a chance to apply the acquired knowledge. So long as the broad topic is relevant to the course, projects can take diverse forms, including case studies, qualitative research projects, model and/or data audits, algorithmic inquiries and solutions, etc. The project includes two milestones: (1) an initial proposal, which clearly states the problem, discusses relevant literature and outlines a rough action plan; and (2) a final report. Towards the end of the course, students would get a chance to showcase their research through presentations.

Each team would get two late days for projects; no extensions will be offered (please don’t even ask). After your late days expire, you can still submit your project, but your score would be divided by 2 if you submit one day late, and by 4 if you submit two days late. No submissions would be entertained after that.
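The penalty rule above works out as follows; this is an illustrative sketch (the function name and inputs are our own), where `days_late` counts days past the deadline after the team's two free late days are used up:

```python
# Illustrative late-submission penalty: halve after 1 day, quarter after
# 2 days, no credit beyond that (per the policy described above).

def late_score(raw_score, days_late):
    if days_late <= 0:
        return raw_score       # on time (or within free late days)
    if days_late == 1:
        return raw_score / 2   # one day late: score divided by 2
    if days_late == 2:
        return raw_score / 4   # two days late: score divided by 4
    return 0                   # later submissions are not entertained

print(late_score(80, 1))  # 40.0
```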

Important Dates:

  • Sept 30, 16:59 Project proposals due
  • Nov 21, 16:59 Final reports due

Discussions & (Anonymous) Feedback

We will use Teams for all discussions on course-related matters. Registered students will receive the joining link/passkey.

If you have any feedback, you can share it (anonymously or otherwise) through this link: http://tinyurl.com/feedback-for-danish

Teaching Staff

  1. Kinshuk Vasisht (Teaching Assistant)
  2. Danish Pruthi (Instructor)

Acknowledgements

We are grateful to Yulia Tsvetkov for readily sharing the content for her ethics classes at CMU and UW, and to Navreet Kaur for helping with the initial outline of this course.
