BSDA5006 - Deep Learning for Computer Vision

Course Code: BSDA5006
Level: Degree Level Course
Credits: 4
Type: Elective
Pre-requisites: None

📖 Description

- Knowledge of the basics of image processing and computer vision
- Knowledge of the building blocks of deep learning, including feedforward networks, convolutional neural networks, recurrent neural networks and transformers
- Knowledge of generative AI models in computer vision
- Knowledge of recent trends, including explainability, zero-shot learning, few-shot learning, self-supervised learning, etc.
- Hands-on experience implementing basic image processing tasks
- Hands-on experience implementing deep learning models for computer vision tasks
- Hands-on experience implementing advanced computer vision tasks such as explainability, self-supervised learning, etc.

🗓️ Weekly Syllabus

Week 1 - Introduction and Overview: Course Overview and Motivation; Introduction to Image Formation, Capture and Representation; Linear Filtering, Correlation,
Week 2 - Visual Features and Representations: Edge, Blobs, Corner Detection; Scale Space and Scale Selection; SIFT, SURF; HoG, LBP, etc.
Week 3 - Visual Matching: Bag-of-words, VLAD; RANSAC, Hough transform; Pyramid Matching; Optical Flow
Week 4 - Deep Learning Review: Review of Deep Learning, Multi-layer Perceptrons, Backpropagation
Week 5 - Convolutional Neural Networks (CNNs): Introduction to CNNs; Evolution of CNN Architectures: AlexNet, ZFNet, VGG, InceptionNets, ResNets, DenseNets
Week 6 - Visualization and Understanding CNNs: Visualization of Kernels; Backprop-to-image/Deconvolution Methods; Deep Dream, Hallucination, Neural Style Trans
Week 7 - CNNs for Recognition, Verification, Detection, Segmentation: CNNs for Recognition and Verification (Siamese Networks, Triplet Loss, Contrastive Loss,
Week 8 - Recurrent Neural Networks (RNNs): Review of RNNs; CNN + RNN Models for Video Understanding: Spatio-temporal Models, Action/Activity Recognition
Week 9 - Attention Models: Introduction to Attention Models in Vision; Vision and Language: Image Captioning, Visual QA, Visual Dialog; Spatial Transformers; T
Week 10 - Deep Generative Models: Review of (Popular) Deep Generative Models: GANs, VAEs; Other Generative Models: PixelRNNs, NADE, Normalizing Flows, etc.
Week 11 - Variants and Applications of Generative Models in Vision: Applications: Image Editing, Inpainting, Superresolution, 3D Object Generation, Security; Va
Week 12 - Recent Trends: Zero-shot, One-shot, Few-shot Learning; Self-supervised Learning; Reinforcement Learning in Vision; Other Recent Topics and Application
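
As a small illustration of the Week 1 topics (linear filtering and correlation), the sketch below cross-correlates a toy image with a 3x3 box filter using NumPy. It is not taken from the course material: the function name `correlate2d_valid`, the box kernel, and the 5x5 ramp image are all assumptions chosen for illustration, and only the "valid" region (no padding) is computed.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Cross-correlate a 2D image with a 2D kernel, 'valid' region only.

    Hypothetical helper for illustration: slides the kernel over the image
    without flipping it (correlation, not convolution) and without padding.
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the window whose top-left corner is (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


# Assumed toy inputs: a 5x5 intensity ramp and a 3x3 box (mean) filter.
image = np.arange(25, dtype=float).reshape(5, 5)
box = np.full((3, 3), 1.0 / 9.0)

smoothed = correlate2d_valid(image, box)
print(smoothed.shape)
```

Because the box kernel is symmetric, correlation and convolution give the same result here; for an asymmetric kernel, convolution would additionally flip the kernel along both axes before sliding it.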

📚 Books & Resources

Prescribed Books
The following are the suggested books for the course:
- Ian Goodfellow, Yoshua Bengio, Aaron Courville, Deep Learning, 2016.
- Michael Nielsen, Neural Networks and Deep Learning, 2016.
- Yoshua Bengio, Learning Deep Architectures for AI, 2009.
- Richard Szeliski, Computer Vision: Algorithms and Applications, 2010.
- Simon Prince, Computer Vision: Models, Learning, and Inference, 2012.
- David Forsyth, Jean Ponce, Computer Vision: A Modern Approach, 2002.

📝 About the Instructors

Prof. Vineeth N B
Professor, Computer Science and Engineering, IIT Hyderabad
