Degree Level Course

Introduction to Natural Language Processing (i-NLP)

Natural language (NL) refers to language spoken or written by humans; it is humans' primary mode of communication. With the growth of the World Wide Web, data in the form of text has grown exponentially. This calls for the development of algorithms and techniques for processing natural language to automate tasks and build intelligent machines: Natural Language Processing (NLP). On completing the course, the participant will learn the following:

1. Why processing language computationally is hard and why specialized techniques need to be developed to process text.
2. Knowledge and in-depth understanding of linguistic techniques and of classical (statistical, pre-deep-learning) approaches to NLP and their limitations.
3. Knowledge and in-depth understanding of deep learning approaches (RNNs and CNNs) to NLP.
4. Knowledge and in-depth understanding of the attention mechanism, Transformers, and Large Language Models (LLMs).
5. Ability to read and understand the latest NLP research papers.
6. Ability to identify the applicable NLP technique for a real-world text-processing problem.
7. Ability to implement NLP models and algorithms for text-processing problems.
8. Ability to develop applications based on generative text models (LLMs).

by Ashutosh Modi

Course ID: BSCS5002

Course Credits:

Course Type:

Pre-requisites: None

Course structure & Assessments

For details of the standard course structure and assessments, visit the Academics page.

Introduction to Natural Language (NL); Why is it hard to process natural language?; Levels of Language Processing; Linguistic Fundamentals for NLP
NLP Pipeline: Tokenization, Lemmatization, Normalization, POS Tagging, Parsing, etc.; Subword Tokenization; Text Prediction: Introduction, Framework, and its Components; Evaluation
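
To make the pipeline stages concrete, here is a minimal pure-Python sketch of tokenization and normalization; the suffix-stripping rules are a crude, illustrative stand-in for real lemmatization (in practice one would use a library such as NLTK or spaCy):

    import re

    def tokenize(text):
        # Words become tokens; punctuation marks become separate tokens.
        return re.findall(r"\w+|[^\w\s]", text)

    def normalize(tokens):
        # Lowercase, then strip a few inflectional suffixes. This is a
        # toy stand-in for lemmatization, not a real morphological analyzer.
        out = []
        for tok in tokens:
            tok = tok.lower()
            for suffix in ("ing", "ed", "s"):
                if tok.endswith(suffix) and len(tok) > len(suffix) + 2:
                    tok = tok[: -len(suffix)]
                    break
            out.append(tok)
        return out

    print(normalize(tokenize("The dogs were barking loudly.")))
    # ['the', 'dog', 'were', 'bark', 'loudly', '.']
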
Feed-Forward Neural Networks for NLP, Regularization, Dropout; Computational Graphs and Backpropagation; Word Representation: Distributed Representations; Language Models: n-gram and Neural; Word2Vec, GloVe
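
As an illustration of the n-gram language models listed above, the sketch below estimates add-one (Laplace) smoothed bigram probabilities from a made-up toy corpus:

    from collections import Counter, defaultdict

    # Toy corpus; a real model would be estimated from millions of tokens.
    corpus = [["<s>", "the", "cat", "sat", "</s>"],
              ["<s>", "the", "dog", "sat", "</s>"],
              ["<s>", "the", "cat", "ran", "</s>"]]

    bigrams = defaultdict(Counter)
    vocab = set()
    for sent in corpus:
        vocab.update(sent)
        for prev, cur in zip(sent, sent[1:]):
            bigrams[prev][cur] += 1

    def prob(prev, cur, alpha=1.0):
        # Add-alpha smoothed estimate of P(cur | prev).
        counts = bigrams[prev]
        return (counts[cur] + alpha) / (sum(counts.values()) + alpha * len(vocab))

    print(prob("the", "cat"))   # 0.3 (seen bigram)
    print(prob("the", "ran"))   # 0.1 (unseen bigram, rescued by smoothing)
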
CNNs for NLP; Neural Sequence Models; Contextualized Word Embeddings; Attention Mechanism; Assessment: Hands-on Assignment
Self-Attention Mechanism; Transformers; Pretrained Language Models (PLMs): BERT, GPT, etc.; Fine-Tuning and Transfer Learning
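
The core of self-attention fits in a few lines of NumPy. The sketch below shows a single unmasked head with toy dimensions, omitting multi-head splitting, masking, and dropout:

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
        return weights @ V                               # one output per token

    rng = np.random.default_rng(0)
    X = rng.normal(size=(4, 8))    # 4 tokens, embedding dimension 8
    Wq = rng.normal(size=(8, 8))
    Wk = rng.normal(size=(8, 8))
    Wv = rng.normal(size=(8, 8))
    print(self_attention(X, Wq, Wk, Wv).shape)   # (4, 8)
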
Large Language Models (LLMs); Parameter-Efficient Fine-Tuning: Prefix-Tuning, LoRA, etc.; Emergent Behavior: In-Context Learning, Instruction Tuning; RLHF
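
To give a feel for parameter-efficient fine-tuning, the NumPy sketch below shows the shape arithmetic behind a LoRA-style low-rank update (no training loop; all sizes are illustrative):

    import numpy as np

    # LoRA idea: freeze a pretrained weight W and learn a low-rank update
    # B @ A, so only r * (d + k) parameters are trained instead of d * k.
    d, k, r = 512, 512, 8
    rng = np.random.default_rng(0)
    W = rng.normal(size=(d, k))          # frozen pretrained weight
    A = rng.normal(size=(r, k)) * 0.01   # trainable
    B = np.zeros((d, r))                 # trainable; zero init keeps W unchanged at start

    def forward(x, alpha=16.0):
        # Effective weight is W + (alpha / r) * B @ A.
        return x @ (W + (alpha / r) * B @ A).T

    x = rng.normal(size=(1, k))
    print(forward(x).shape)                              # (1, 512)
    print(d * k, "frozen vs", r * (d + k), "trainable")  # 262144 vs 8192
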
Naïve Bayes Classifier; Expectation Maximization Algorithm; Logistic Regression; Maximum Entropy Models
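
A Naïve Bayes text classifier can be built from scratch in a few lines; the sketch below uses invented toy data and add-one smoothing on the word likelihoods:

    import math
    from collections import Counter, defaultdict

    # Toy labelled data, invented purely for illustration.
    train = [("great movie loved it", "pos"),
             ("terrible boring movie", "neg"),
             ("loved the acting", "pos"),
             ("boring and terrible plot", "neg")]

    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in train:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}

    def predict(text):
        # argmax over labels of log P(label) + sum_w log P(w | label).
        scores = {}
        for label in label_counts:
            lp = math.log(label_counts[label] / sum(label_counts.values()))
            total = sum(word_counts[label].values())
            for w in text.split():
                lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
            scores[label] = lp
        return max(scores, key=scores.get)

    print(predict("a boring terrible movie"))   # neg
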
Classical Sequence Models: HMMs, MEMMs, CRFs, RNN-CRF
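
For example, Viterbi decoding for an HMM tagger takes only a handful of lines; the probabilities below are invented for illustration and would normally be estimated from a tagged corpus:

    states = ["NOUN", "VERB"]
    start = {"NOUN": 0.6, "VERB": 0.4}
    trans = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
             "VERB": {"NOUN": 0.8, "VERB": 0.2}}
    emit = {"NOUN": {"fish": 0.6, "swim": 0.4},
            "VERB": {"fish": 0.3, "swim": 0.7}}

    def viterbi(words):
        # v[t][s]: probability of the best state path ending in s at time t.
        v = [{s: start[s] * emit[s].get(words[0], 1e-8) for s in states}]
        back = []
        for w in words[1:]:
            col, ptr = {}, {}
            for s in states:
                best_prev = max(states, key=lambda p: v[-1][p] * trans[p][s])
                col[s] = v[-1][best_prev] * trans[best_prev][s] * emit[s].get(w, 1e-8)
                ptr[s] = best_prev
            v.append(col)
            back.append(ptr)
        path = [max(states, key=lambda s: v[-1][s])]   # best final state
        for ptr in reversed(back):                     # follow backpointers
            path.insert(0, ptr[path[0]])
        return path

    print(viterbi(["fish", "swim"]))   # ['NOUN', 'VERB']
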
Information Extraction; Automatic Speech Recognition; Machine Translation
Coreference Resolution; Discourse Parsing
Distributional Semantics; Logical Semantics: Representation and Semantic Parsing; Predicate Argument Semantics: Semantic Role Labeling and Frame Semantics
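
The distributional hypothesis behind this material (words occurring in similar contexts have similar meanings) can be illustrated with cosine similarity over toy co-occurrence vectors; all counts below are invented:

    import numpy as np

    # Rows: target words; columns: invented context counts (drink, eat, road).
    vectors = {
        "coffee": np.array([8.0, 1.0, 0.0]),
        "tea":    np.array([7.0, 2.0, 0.0]),
        "car":    np.array([0.0, 0.0, 9.0]),
    }

    def cosine(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    print(cosine(vectors["coffee"], vectors["tea"]))   # ~0.99, similar contexts
    print(cosine(vectors["coffee"], vectors["car"]))   # 0.0, disjoint contexts
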

Prescribed Books

The following are the suggested books for the course:

Speech and Language Processing, Daniel Jurafsky, James H. Martin

Foundations of Statistical Natural Language Processing, Christopher D. Manning, Hinrich Schütze

Introduction to Natural Language Processing, Jacob Eisenstein

Natural Language Understanding, James Allen

Dive into Deep Learning, Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola

Neural Network Methods for Natural Language Processing, Yoav Goldberg

About the Instructors

Ashutosh Modi
Assistant Professor, CS Department, IIT Kanpur