Term: January 2024 – May 2024
Time: Tuesdays & Thursdays (10-11:30)
Venue: CDS 102
Credits: 3:1
Outline: This course is a graduate-level introduction to the field of Natural Language Processing (NLP), which involves building computational systems to handle human languages. We interact with NLP systems on a daily basis—such systems answer the questions we ask (using Google, or other search engines), curate the content we read, autocomplete words we are likely to type, translate text from languages we don’t know, flag content on social media that we might find harmful, etc. Such systems are prominently used in industry as well as academia, especially for analyzing textual data.
Prerequisites: The class is intended for graduate students and senior undergraduates. We do not impose any strict prerequisites in terms of IISc courses that one should have completed to register for this course. However, students are expected to know the basics of linear algebra, probability, calculus, and neural networks. Programming assignments require proficiency in Python.
Feb 27, 2024: Assignment #3 is out now, due on Mar 22, 16:59 IST.
Feb 11, 2024: Included the template for the project.
Feb 6, 2024: A few broad project directions are here.
Feb 6, 2024: Assignment #2 is out now, due on Feb 23, 16:59 IST.
Jan 22, 2024: Assignment #1 is out now, due on Feb 6, 16:59 IST.
Jan 20, 2024: The class on Thursday (Jan 25) will happen at EE B-308.
Jan 9, 2024: The quiz to assess the prerequisites will be held in CDS 102 CDS 202 (and, if required, CDS rooms 419 and 208) during class hours on Thursday (Jan 11). The quiz will be conducted through Google Forms, so don’t forget to carry your laptop or phone.
The course schedule is as follows. It is subject to change based on student feedback and the pace of instruction.
The evaluation comprises three programming assignments (3 x 15% = 45% of the overall grade), two exams (2 x 10% = 20% of the overall grade), and a final group course project (35% of the overall grade).
The two exams aim to evaluate the student’s learning acquired through lectures and assignments. One of these two exams would be administered towards the middle of the semester and the second one towards the end. Each exam is worth 10% of the grade.
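The weighting above can be sketched as a short calculation. This is only an illustration: the function name `overall_score`, the 0-100 scale for each component, and the sample scores are assumptions, not part of the official grading procedure.

```python
# Weights taken from the evaluation breakdown above (fractions of the overall grade).
WEIGHTS = {"assignment": 0.15, "exam": 0.10, "project": 0.35}


def overall_score(assignments, exams, project):
    """Combine component scores into an overall percentage.

    Assumption: every component score is on a 0-100 scale.
    """
    if len(assignments) != 3 or len(exams) != 2:
        raise ValueError("expected 3 assignment scores and 2 exam scores")
    total = (
        WEIGHTS["assignment"] * sum(assignments)  # 3 x 15%
        + WEIGHTS["exam"] * sum(exams)            # 2 x 10%
        + WEIGHTS["project"] * project            # 35%
    )
    return round(total, 2)


# Hypothetical scores, for illustration only.
print(overall_score([80, 90, 70], [60, 75], 85))
```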
The course project constitutes 35% of the overall score, where students—in groups of three—get a chance to apply the acquired knowledge for an application of their choice. Projects would typically involve human languages and deep learning. The project includes three milestones: (1) initial proposal (which will require a rough action plan and associated timelines); (2) a mid-term report and (3) a final report. Towards the end of the course, students would get a chance to showcase their research through poster presentations.
Each team gets three late days for the project. No extensions will be offered (please don’t even ask). After your late days expire, you can still submit your project, but your score will be divided by 2 if you submit 1 day late and by 4 if you submit 2 days late. No submissions will be entertained after that.
Some project directions are available here. Please note that these directions are only suggestions; students should not limit their explorations to just these directions.
New: Included the template for the project that also includes a few guidelines.
Important dates:
The three programming assignments will involve building systems for (1) text classification and learning word representations; (2) language modeling; (3) TBD (possibly machine translation and/or named entity recognition). The assignments will be implemented using interactive Python notebooks intended to run on Google’s Colab infrastructure. This allows students to use GPUs for free and with minimal setup. The notebooks will contain instructions interleaved with code blocks for students to fill in.
These assignments are meant to be solved individually. You get three late days in total across the three assignments. No extensions will be offered (please don’t even ask). There are no restrictions on how the late days can be used; for example, you can use all three late days for one assignment. If you run out of late days, you can still submit your assignment, but your score will be divided by 2 if you submit 1 day late and by 4 if you submit 2 days late. No submissions will be entertained after that.
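As a rough illustration of the late-day arithmetic, here is a minimal sketch. The function name `apply_late_policy` is hypothetical, and the sketch makes three assumptions that the policy does not spell out: lateness is counted in whole days, remaining late days are consumed before any penalty applies, and a submission that is no longer accepted is modeled as a score of zero.

```python
def apply_late_policy(raw_score, days_late, late_days_left):
    """Return (final_score, late_days_left) for one submission.

    Assumptions (not stated in the policy): lateness is measured in whole
    days, and free late days are used up first; only the days beyond them
    are penalized.
    """
    if days_late < 0 or late_days_left < 0:
        raise ValueError("days_late and late_days_left must be non-negative")

    used = min(days_late, late_days_left)   # free late days consumed
    remaining = late_days_left - used
    penalized_days = days_late - used       # days beyond the free budget

    if penalized_days == 0:
        return raw_score, remaining
    if penalized_days == 1:
        return raw_score / 2, remaining     # score divided by 2
    if penalized_days == 2:
        return raw_score / 4, remaining     # score divided by 4
    # Assumption: a submission that is not accepted is modeled as zero.
    return 0.0, remaining


# Hypothetical example: 4 days late with all 3 late days still available.
print(apply_late_policy(80, 4, 3))
```

Under these assumptions, the example uses up all three late days and is penalized for the one remaining day.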
Important dates:
We will use Teams for all discussions on course-related matters. Registered students should have received the joining link/passkey.
If you have any feedback, you can share it (anonymously or otherwise) through this link: http://tinyurl.com/feedback-for-danish