When | MW 3:30 - 5:00 pm |
Where | EE B303 (Second class onwards) |
Who | Sriram Ganapathy |
Office | C 334 (2nd Floor) |
Email | sriram aT ee doT iisc doT ernet doT in |
Teaching Assistant | Aravind Illa |
Lab | C 326 (2nd Floor) |
Email | aravindece77 aT gmail doT com |
Announcements
- Final Exam will be on 10-12-2017 (2:00 pm - 5:00 pm) in B303 (classroom).
      - Open book, open notes. No laptops/cellphones allowed.
      - Practice questions posted here.
- Project Evaluation on 15-12-2017 (9:00 am).
      - Maximum 8 slides (single person) or 12 slides (2 persons) per project.
      - Maximum 3-4 page report (submit the report and slides by noon on Dec. 14 by email).
      - Evaluation criteria: focus on problem definition and motivation, implementation of the baseline, and your own contribution.
- Feedback Form link
- Fifth assignment
      - Posted here. Due on 24-11-2017.
Syllabus
- Introduction to real world signals - text, speech, image, video.
- Feature extraction and front-end signal processing - information-rich representations, robustness to noise and artifacts, signal enhancement, bio-inspired feature extraction.
- Basics of pattern recognition, Generative modeling - Gaussian and mixture Gaussian models, hidden Markov models, factor analysis.
- Discriminative modeling - support vector machines, neural networks and back propagation.
- Introduction to deep learning - convolutional and recurrent networks, pre-training and practical considerations in deep learning, understanding deep networks.
- Deep generative models - Autoencoders, Boltzmann machines, Adversarial Networks.
- Applications in computer vision and speech recognition.
Grading Details
Assignments | 15% |
Midterm exam. | 20% |
Final exam. | 35% |
Project | 30% |
Pre-requisites
- Random Processes/Probability and Statistics
- Linear Algebra/Matrix Theory
- Basic Digital Signal Processing/Signals and Systems
Textbooks
- “Pattern Recognition and Machine Learning”, C.M. Bishop, 2nd Edition, Springer, 2011.
- “Neural Networks for Pattern Recognition”, C.M. Bishop, Oxford University Press, 1995.
- “Deep Learning”, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016. html
- “Digital Image Processing”, R. C. Gonzalez, R. E. Woods, 3rd Edition, Prentice Hall, 2008.
- “Fundamentals of Speech Recognition”, L. Rabiner and B.-H. Juang, Prentice Hall, 1993.
References
- “Deep Learning: Methods and Applications”, Li Deng, Microsoft Technical Report.
- “Automatic Speech Recognition: A Deep Learning Approach”, D. Yu, L. Deng, Springer, 2014.
- “Machine Learning for Audio, Image and Video Analysis”, F. Camastra, A. Vinciarelli, Springer, 2007. pdf
Slides
14-08-2017 | Introduction to real world signals - text, speech, image, video. Learning as a pattern recognition problem. Examples. Roadmap of the course. |
slides |
||
16-08-2017 | Feature Extraction - Goals and challenges. Introduction to text processing. Bag-of-words model. Term Frequency - Inverse Document Frequency (TF-IDF). N-gram modeling. Feature extraction in audio and speech - Spectrogram. |
slides |
||
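A minimal bag-of-words / TF-IDF sketch in plain Python (the toy corpus and the whitespace tokenization are assumptions made only for illustration):

    import math
    from collections import Counter

    docs = ["the cat sat on the mat",
            "the dog sat on the log",
            "cats and dogs"]
    tokenized = [d.split() for d in docs]

    # Term frequency: word counts normalized by document length.
    tf = [{w: c / len(toks) for w, c in Counter(toks).items()} for toks in tokenized]

    # Inverse document frequency: log(#documents / #documents containing the word).
    vocab = {w for toks in tokenized for w in toks}
    idf = {w: math.log(len(docs) / sum(w in toks for toks in tokenized)) for w in vocab}

    # TF-IDF weight of each word in each document.
    tfidf = [{w: tf_d[w] * idf[w] for w in tf_d} for tf_d in tf]
    print(tfidf[0])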
21-08-2017 | Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction - orthogonality of the prediction error with past samples, optimal linear predictor, stability of the prediction filter, autoregressive (AR) process, linear prediction for an AR process |
slides |
||
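A short sketch of the autocorrelation solution to the optimal linear predictor, run on a synthetic AR(2) signal (the AR coefficients, signal length and prediction order are illustrative assumptions):

    import numpy as np

    np.random.seed(0)
    # Synthetic AR(2) process: x[n] = 1.3*x[n-1] - 0.4*x[n-2] + noise
    n = 2048
    x = np.zeros(n)
    e = np.random.randn(n)
    for t in range(2, n):
        x[t] = 1.3 * x[t - 1] - 0.4 * x[t - 2] + e[t]

    p = 2  # prediction order
    # Autocorrelation values r[0..p]
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    # Normal equations: R a = r[1..p], with R the Toeplitz autocorrelation matrix
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1:])
    print("estimated predictor coefficients:", a)  # close to [1.3, -0.4]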
23-08-2017 | Basics of Digital Image Processing – Filtering, Smoothing, Edge Detection, Scale Invariant Feature Transform (SIFT). |
slides |
||
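A tiny example of image filtering with Sobel kernels in NumPy (the random array stands in for a grayscale image; a full SIFT pipeline is far more involved and is not attempted here):

    import numpy as np

    def convolve2d(img, kernel):
        # Naive 'valid' 2-D convolution (kernel is flipped for true convolution).
        kh, kw = kernel.shape
        h, w = img.shape
        k = kernel[::-1, ::-1]
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
        return out

    # Sobel kernels for horizontal and vertical gradients.
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    sobel_y = sobel_x.T

    img = np.random.rand(32, 32)          # stand-in for a grayscale image
    gx = convolve2d(img, sobel_x)
    gy = convolve2d(img, sobel_y)
    edges = np.hypot(gx, gy)              # gradient magnitude ~ edge strength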
28-08-2017 | Matrix and vector derivatives - definition and properties. Dimensionality reduction - Preserving maximum data variance - principal component analysis (PCA). Minimum error formulation of PCA. Residual error in PCA. Example of PCA application for hand-written digit images. PRML - Bishop (Appendix, Chapter 12) |
slides |
||
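A compact PCA sketch via the eigendecomposition of the sample covariance matrix (the toy data generation is an assumption for the example):

    import numpy as np

    def pca(X, k):
        """Project rows of X onto the top-k principal components."""
        mu = X.mean(axis=0)
        Xc = X - mu                          # center the data
        cov = Xc.T @ Xc / (X.shape[0] - 1)   # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]    # sort by decreasing variance
        W = eigvecs[:, order[:k]]            # top-k principal directions
        return Xc @ W, W, mu

    X = np.random.randn(500, 10) @ np.random.randn(10, 10)  # toy correlated data
    Z, W, mu = pca(X, k=2)
    print(Z.shape)   # (500, 2)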
30-08-2017 | PCA for high dimensional data. Whitening and KL transform. Limitations of PCA. Class dependent dimensionality reduction using linear discriminant analysis (LDA). Fisher discriminant for 2 class case using within-class and between class matrices. Solution of LDA. Multi-class LDA, PCA versus LDA example. PRML - Bishop (Chapter 4.1.4) |
slides |
||
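A minimal two-class Fisher LDA sketch using the within-class scatter matrix (the two Gaussian clouds are invented for illustration):

    import numpy as np

    def fisher_lda_direction(X0, X1):
        """Two-class Fisher discriminant: w proportional to inv(Sw) (m1 - m0)."""
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        S0 = (X0 - m0).T @ (X0 - m0)
        S1 = (X1 - m1).T @ (X1 - m1)
        Sw = S0 + S1                       # within-class scatter matrix
        w = np.linalg.solve(Sw, m1 - m0)   # optimal projection direction
        return w / np.linalg.norm(w)

    rng = np.random.default_rng(0)
    X0 = rng.normal([0, 0], 1.0, size=(200, 2))
    X1 = rng.normal([3, 1], 1.0, size=(200, 2))
    w = fisher_lda_direction(X0, X1)
    print("projection direction:", w)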
01-09-2017 | Basics of Python programming. Installing Python, simple commands and functions. Loading speech and image data. Vectorizing, mean computation and spectrogram. |
slides code |
||
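A bare-bones magnitude spectrogram in NumPy, along the lines of what the lab code covers (the frame length, hop size and synthetic tone are assumed values, not the actual lab parameters):

    import numpy as np

    def spectrogram(x, frame_len=400, hop=160):
        """Magnitude spectrogram via a simple short-time Fourier transform."""
        window = np.hamming(frame_len)
        n_frames = 1 + (len(x) - frame_len) // hop
        frames = np.stack([x[i * hop:i * hop + frame_len] * window
                           for i in range(n_frames)])
        return np.abs(np.fft.rfft(frames, axis=1)).T   # (freq bins, frames)

    # Toy signal: 440 Hz tone at 16 kHz (stand-in for a loaded speech file).
    fs = 16000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 440 * t)
    S = spectrogram(x)
    print(S.shape)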
01-09-2017 | Assignment #1. Due on 11-09-2017. Analytical part submitted in class. Coding part submitted via e9205mlsp2017 aT gmail doT com. |
HW1 | image data | speech data |
04-09-2017 | Decision theory basics. Minimum classification error rule. MAP and ML based approaches. 3 approaches to ML. Generative versus discriminative modeling. Introduction to generative modeling. Multi-variate Gaussian Distribution. PRML - Bishop (Chapter 1.5) |
slides |
||
06-09-2017 | MLE for multi-variate Gaussian. Sample mean and variance. Limitations of Gaussian modeling. Need for mixture modeling. Probability density of Gaussian Mixture Model (GMM). |
slides Future Reading |
||
11-09-2017 | MLE for GMM - Expectation Maximization (EM) algorithm. Proof of EM algorithm. Convergence properties. EM algorithm for GMM parameter estimation. Choice of hidden variable. Ref - Tutorial GMMs Proof of EM algorithm EM algorithm for GMMs |
slides | ||
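An illustrative EM loop for a GMM with full covariances (assumes NumPy/SciPy are available; there is no convergence check or log-likelihood monitoring, which a real implementation would add):

    import numpy as np
    from scipy.stats import multivariate_normal

    def gmm_em(X, K, n_iter=50, seed=0):
        """EM for a Gaussian mixture with full covariances (illustrative only)."""
        rng = np.random.default_rng(seed)
        N, D = X.shape
        means = X[rng.choice(N, K, replace=False)]
        covs = np.stack([np.cov(X.T) + 1e-6 * np.eye(D)] * K)
        weights = np.full(K, 1.0 / K)
        for _ in range(n_iter):
            # E-step: posterior responsibility of each component for each point.
            resp = np.stack([w * multivariate_normal.pdf(X, m, c)
                             for w, m, c in zip(weights, means, covs)], axis=1)
            resp /= resp.sum(axis=1, keepdims=True)
            # M-step: re-estimate weights, means and covariances.
            Nk = resp.sum(axis=0)
            weights = Nk / N
            means = (resp.T @ X) / Nk[:, None]
            for k in range(K):
                d = X - means[k]
                covs[k] = (resp[:, k, None] * d).T @ d / Nk[k] + 1e-6 * np.eye(D)
        return weights, means, covs

    X = np.vstack([np.random.randn(300, 2), np.random.randn(300, 2) + 4])
    w, mu, S = gmm_em(X, K=2)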
13-09-2017 | Summary of GMM modeling. Application of GMM for unsupervised clustering. |
slides | ||
18-09-2017 | Limitations of GMM modeling for sequence data. Markov Chains. Hidden Markov Model (HMM) definition. Three Problems in HMM. "Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6) |
|||
20-09-2017 | Evaluating the likelihood using an HMM (Problem 1), complexity reduction using the forward and backward variables. Finding the best state sequence (Problem 2) - instantaneous probability based, Viterbi algorithm for state sequence segmentation. "Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6) |
Rabiner Tutorial on HMM | ||
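A small Viterbi decoder for a discrete-observation HMM in the log domain (the two-state toy model and observation sequence are invented for the example):

    import numpy as np

    def viterbi(log_A, log_B, log_pi, obs):
        """Most likely state sequence for a discrete-observation HMM (log domain)."""
        T, S = len(obs), len(log_pi)
        delta = np.zeros((T, S))
        psi = np.zeros((T, S), dtype=int)
        delta[0] = log_pi + log_B[:, obs[0]]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + log_A      # (prev state, next state)
            psi[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
        # Backtrack from the best final state.
        path = [delta[-1].argmax()]
        for t in range(T - 1, 0, -1):
            path.append(psi[t, path[-1]])
        return path[::-1]

    A = np.array([[0.9, 0.1], [0.2, 0.8]])      # state transition probabilities
    B = np.array([[0.7, 0.3], [0.1, 0.9]])      # emission probabilities (2 symbols)
    pi = np.array([0.6, 0.4])
    obs = [0, 0, 1, 1, 1]
    print(viterbi(np.log(A), np.log(B), np.log(pi), obs))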
23-09-2017 | Re-estimating the HMM parameters - EM algorithm for HMM (Problem 3). Q function definition and solution. Intuitions about HMM training. "Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6) EM algorithm for HMMs |
Rabiner Tutorial on HMM | ||
25-09-2017 | Non-negative matrix factorization (NMF) - problem definition, cost function and constraints, auxiliary function, proof of convergence, parameter update rules. Application to audio source separation and speech denoising. Refs - Bhiksha Raj-Tutorial     Lee-Paper |
slides | ||
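A sketch of the multiplicative (Lee-Seung) update rules for NMF with the squared-error cost (the matrix sizes, rank and iteration count are arbitrary choices):

    import numpy as np

    def nmf(V, r, n_iter=200, eps=1e-9, seed=0):
        """NMF with Lee-Seung multiplicative updates minimizing squared error."""
        rng = np.random.default_rng(seed)
        n, m = V.shape
        W = rng.random((n, r))
        H = rng.random((r, m))
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
            W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis vectors
        return W, H

    V = np.abs(np.random.randn(64, 100))   # e.g. a magnitude spectrogram
    W, H = nmf(V, r=5)
    print(np.linalg.norm(V - W @ H))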
04-10-2017 | First Mid-term Exam |
|||
09-10-2017 | Application of NMF. Audio separation into individual instruments, speech denoising with known and unknown sources. Linear models for regression - problem definition. Least squares regression. Maximum likelihood and least squares regression. PRML - Bishop (Chapter 2) |
|||
11-10-2017 | Overfitting and Underfitting. Regularized least squares. Linear Models for Classification. Least squares for classification. Sigmoid function and one-of-K encoding. Problems with least squares classification. PRML - Bishop (Chapter 3) |
slides |
||
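A small regularized (ridge) least-squares fit with a polynomial design matrix, showing the closed-form solution w = (Phi^T Phi + lambda*I)^{-1} Phi^T t (the sine-plus-noise data and the degree/lambda values are assumptions for the example):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 50)
    t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

    degree, lam = 9, 1e-3
    Phi = np.vander(x, degree + 1, increasing=True)       # polynomial design matrix
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(degree + 1), Phi.T @ t)
    print("weights:", w)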
16-10-2017 | Logistic regression - two-class problem. Sigmoid function and posterior probability. Logistic regression - K-class problem. Softmax function, cross-entropy error function and maximum likelihood estimation. Linear regression revisited - dual formulation. PRML - Bishop (Chapter 4,6) |
|||
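A gradient-descent sketch of K-class logistic regression with the softmax and cross-entropy loss (the data, learning rate and iteration count are toy choices):

    import numpy as np

    def softmax(Z):
        Z = Z - Z.max(axis=1, keepdims=True)      # numerical stability
        e = np.exp(Z)
        return e / e.sum(axis=1, keepdims=True)

    def train_softmax_regression(X, y, K, lr=0.1, n_iter=500):
        """Multiclass logistic regression by gradient descent on cross-entropy."""
        N, D = X.shape
        Y = np.eye(K)[y]                          # one-of-K target encoding
        W = np.zeros((D, K))
        for _ in range(n_iter):
            P = softmax(X @ W)                    # posterior class probabilities
            grad = X.T @ (P - Y) / N              # gradient of the cross-entropy
            W -= lr * grad
        return W

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(c, 1.0, (100, 2)) for c in ([0, 0], [3, 0], [0, 3])])
    y = np.repeat([0, 1, 2], 100)
    W = train_softmax_regression(np.hstack([X, np.ones((300, 1))]), y, K=3)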
21-10-2017 | Design matrix, kernel function and Gram matrix. Necessary and sufficient conditions for kernel functions (Mercer's theorem). Examples of kernel functions. PRML - Bishop (Chapter 6) |
|||
23-10-2017 | Margin of a linear classifier. Maximum margin classifier formulation. Constraints involved in the optimization. Introduction to support vector machines. PRML - Bishop (Chapter 7) |
|||
25-10-2017 | Introduction to constrained optimization. Primal and dual problems. Weak and strong duality. Necessary and sufficient conditions for strong duality for convex problems with convex constraints. KKT conditions. Introduction to convex optimization - Boyd (Chapter 5) |
Weblink to the book |
||
27-10-2017 | Application of convex optimization to SVMs. KKT conditions and solution to the problem. Definition of support vectors. Support vector machines for overlapping classes. Trade-off between regularization and training loss. PRML - Bishop (Chapter 7) |
slides |
||
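A quick illustration of a soft-margin SVM on overlapping classes, assuming scikit-learn is installed (C trades off margin width against the training-loss penalty, as in the lecture):

    import numpy as np
    from sklearn.svm import SVC

    # Toy two-class data with some overlap.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(+1, 1, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)

    # C controls the margin/loss trade-off; the RBF kernel handles non-linearity.
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(X, y)
    print("number of support vectors per class:", clf.n_support_)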
30-10-2017 | SVM application for classification. Support vector regression - formulation and KKT conditions. Introduction to neural networks. Parameter learning using gradient descent (scalar case). PRML - Bishop (Chapter 7) |
slides |
||
3-11-2017 | Gradient descent - vector case. Types of activation functions. The XOR problem with neural networks. Need for deep neural network architectures. Deep Learning - IY (Chapter 6) |
|||
4-11-2017 | Learning in neural networks. First-order methods - method of steepest descent. Curvature and Hessians. Second-order method - Newton's method. Discussion on the complexity of learning algorithms. Deep Learning - IY (Chapter 4), Neural Networks - Bishop (Chapter 4,7) |
|||
6-11-2017 | Backpropagation algorithm for learning in deep networks. Linear neuron with the MSE criterion. Disadvantages and limitations of the gradient descent algorithm. Neural Networks - Bishop (Chapter 6) |
|||
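A from-scratch backpropagation sketch for a two-layer network on the XOR problem (sigmoid units, MSE loss; the hidden-layer size and learning rate are arbitrary assumptions):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # XOR inputs and targets.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([[0], [1], [1], [0]], dtype=float)

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

    lr = 0.5
    for _ in range(5000):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        y = sigmoid(h @ W2 + b2)
        # Backward pass: propagate error derivatives layer by layer.
        d_y = (y - t) * y * (1 - y)
        d_h = (d_y @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_y)
        b2 -= lr * d_y.sum(axis=0)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)

    print(y.round(3).ravel())   # should move towards [0, 1, 1, 0]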
08-11-2017 | Second Mid-term Exam. |
|||
10-11-2017 | Types of non-linearities used. Cost functions for regression and classification. Output activation functions used in regression and classification. Equivalence between regression with MSE and classification with CE using softmax output activations. Neural Networks - Bishop (Chapter 6) |
|||
13-11-2017 | Learning and generalization issues in neural networks. Decomposing the MSE into bias and variance. Discussion on the bias-variance tradeoff. Improving learning with regularization. Neural Networks - Bishop (Chapter 9) |
slides |
||
14-11-2017 | Assignment #5. Due on 24-11-2017. Analytical part submitted in class. Coding part submitted via e9205mlsp2017 aT gmail doT com. |
HW5 |
Data For HW5 |
|
15-11-2017 | L2 weight regularization, early stopping and training with added noise in the input data. Committees of neural networks. System combination methods and optimization. Neural Networks - Bishop (Chapter 9) |
slides |
||
17-11-2017 | Improving the speed of convergence of gradient descent with momentum. Convolutional neural networks. Kernels, pooling and sub-sampling. Comparison of CNNs and DNNs. Weight sharing and parameter learning. Deep Learning - IY (Chapter 9) |
|||
18-11-2017 | Understanding the learning in deep layers of CNNs. Recurrent networks. Backpropagation in time for RNN parameter learning. Various RNN architectures - teacher forcing, sequence-to-vector and bi-directional RNNs. Deep Learning - IY (Chapter 10) |
slides |
||
20-11-2017 | Long Short-Term Memory (LSTM) networks. Deep unsupervised learning - Restricted Boltzmann Machines (RBMs). Conditional independence in RBMs. Learning in RBMs with maximum likelihood. Positive and negative phases of the log-likelihood gradient. Gibbs sampling and contrastive divergence approximations. Deep Learning - IY (Chapter 18,20) |
slides |
||
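An illustrative CD-1 training loop for a binary (Bernoulli) RBM (the random binary "data", layer sizes and learning rate are assumptions; real training would use actual data and monitor reconstruction error):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_visible, n_hidden, lr = 16, 8, 0.05
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

    V = (rng.random((100, n_visible)) > 0.5).astype(float)   # toy binary "data"

    for _ in range(100):
        # Positive phase: hidden probabilities given the data.
        ph = sigmoid(V @ W + b_h)
        h = (rng.random(ph.shape) < ph).astype(float)
        # Negative phase: one Gibbs step to get a reconstruction.
        pv = sigmoid(h @ W.T + b_v)
        v_neg = (rng.random(pv.shape) < pv).astype(float)
        ph_neg = sigmoid(v_neg @ W + b_h)
        # Approximate log-likelihood gradient and parameter update (CD-1).
        W += lr * (V.T @ ph - v_neg.T @ ph_neg) / len(V)
        b_v += lr * (V - v_neg).mean(axis=0)
        b_h += lr * (ph - ph_neg).mean(axis=0)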