Applications Open now for May 2024 Batch | Applications Close: May 26, 2024 | Exam: Jul 07, 2024

Degree Level Course

Deep Learning for Computer Vision

- Knowledge of the basics of image processing and computer vision
- Knowledge of the building blocks of deep learning, including feedforward networks, convolutional neural networks, recurrent neural networks, and transformers
- Knowledge of generative AI models in computer vision
- Knowledge of recent trends, including explainability, zero-shot learning, few-shot learning, self-supervised learning, etc.
- Hands-on experience implementing basic image processing tasks
- Hands-on experience implementing deep learning models for computer vision tasks
- Hands-on experience implementing advanced computer vision tasks such as explainability, self-supervised learning, etc.

by Vineeth N B

Course ID: BSCS5003

Course Credits:

Course Type:

Pre-requisites: None

Course structure & Assessments

For details of the standard course structure and assessments, visit the Academics page.

Introduction and Overview: Course Overview and Motivation; Introduction to Image Formation, Capture and Representation; Linear Filtering, Correlation, Convolution
Visual Features and Representations: Edge, Blob, and Corner Detection; Scale Space and Scale Selection; SIFT, SURF; HoG, LBP, etc.
Visual Matching: Bag-of-words, VLAD; RANSAC, Hough transform; Pyramid Matching; Optical Flow
Deep Learning Review: Review of Deep Learning, Multi-layer Perceptrons, Backpropagation
Convolutional Neural Networks (CNNs): Introduction to CNNs; Evolution of CNN Architectures: AlexNet, ZFNet, VGG, InceptionNets, ResNets, DenseNets
Visualization and Understanding CNNs: Visualization of Kernels; Backprop-to-image/Deconvolution Methods; Deep Dream, Hallucination, Neural Style Transfer; CAM, Grad-CAM, Grad-CAM++; Recent Methods (IG, Segment-IG, SmoothGrad)
CNNs for Recognition, Verification, Detection, Segmentation: CNNs for Recognition and Verification (Siamese Networks, Triplet Loss, Contrastive Loss, Ranking Loss); CNNs for Detection: Background of Object Detection, R-CNN, Fast R-CNN, Faster R-CNN, YOLO, SSD, RetinaNet; CNNs for Segmentation: FCN, SegNet, U-Net, Mask R-CNN
Recurrent Neural Networks (RNNs): Review of RNNs; CNN + RNN Models for Video Understanding: Spatio-temporal Models, Action/Activity Recognition
Attention Models: Introduction to Attention Models in Vision; Vision and Language: Image Captioning, Visual QA, Visual Dialog; Spatial Transformers; Transformer Networks
Deep Generative Models: Review of (Popular) Deep Generative Models: GANs, VAEs; Other Generative Models: PixelRNNs, NADE, Normalizing Flows, etc.
Variants and Applications of Generative Models in Vision: Applications: Image Editing, Inpainting, Super-resolution, 3D Object Generation, Security; Variants: CycleGANs, Progressive GANs, StackGANs, Pix2Pix, etc.
Recent Trends: Zero-shot, One-shot, Few-shot Learning; Self-supervised Learning; Reinforcement Learning in Vision; Other Recent Topics and Applications

Prescribed Books

The following are the suggested books for the course:

Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016

Michael Nielsen, Neural Networks and Deep Learning, 2016

Yoshua Bengio, Learning Deep Architectures for AI, 2009

Richard Szeliski, Computer Vision: Algorithms and Applications, 2010

Simon Prince, Computer Vision: Models, Learning, and Inference, 2012

David Forsyth, Jean Ponce, Computer Vision: A Modern Approach, 2002

About the Instructors

Vineeth N B
Professor, Computer Science and Engineering, IIT Hyderabad