When | MW 3:30 - 5:00 pm |
Where | EE B303 (Second class onwards) |
Who | Sriram Ganapathy |
Office | C 334 (2nd Floor) |
Email | sriram aT ee doT iisc doT ernet doT in |
Teaching Assistant | Aravind Illa |
Lab | C 326 (2nd Floor) |
Email | aravindece77 aT gmail doT com |
Announcements
- Final Exam will be on 10-12-2017 (2:00 pm - 5:00 pm) in B303 (classroom).
      - Open book, open notes. No laptops/cellphones allowed.
      - Practice questions posted here.
- Project Evaluation on 15-12-2017 (9:00 am).
      - Maximum 8 slides (single person) or 12 slides (2 persons) per project.
      - Maximum 3-4 page report (submit the report and slides by noon on Dec. 14 by email).
      - Evaluation criteria: focus on problem definition and motivation, implementation of the baseline, and your own contribution.
- Feedback Form link
- Fifth assignment
      - Posted here. Due on 24-11-2017.
Syllabus
- Introduction to real world signals - text, speech, image, video.
- Feature extraction and front-end signal processing - information-rich representations, robustness to noise and artifacts, signal enhancement, bio-inspired feature extraction.
- Basics of pattern recognition, Generative modeling - Gaussian and mixture Gaussian models, hidden Markov models, factor analysis.
- Discriminative modeling - support vector machines, neural networks and back propagation.
- Introduction to deep learning - convolutional and recurrent networks, pre-training and practical considerations in deep learning, understanding deep networks.
- Deep generative models - Autoencoders, Boltzmann machines, Adversarial Networks.
- Applications in computer vision and speech recognition.
Grading Details
Assignments | 15% |
Midterm exam. | 20% |
Final exam. | 35% |
Project | 30% |
Pre-requisites
- Random Processes/Probability and Statistics
- Linear Algebra/Matrix Theory
- Basic Digital Signal Processing/Signals and Systems
Textbooks
- “Pattern Recognition and Machine Learning”, C.M. Bishop, 2nd Edition, Springer, 2011.
- “Neural Networks for Pattern Recognition”, C.M. Bishop, Oxford University Press, 1995.
- “Deep Learning”, I. Goodfellow, Y. Bengio, A. Courville, MIT Press, 2016. html
- “Digital Image Processing”, R. C. Gonzalez, R. E. Woods, 3rd Edition, Prentice Hall, 2008.
- “Fundamentals of Speech Recognition”, L. Rabiner and B.-H. Juang, Prentice Hall, 1993.
References
- “Deep Learning: Methods and Applications”, Li Deng, Microsoft Technical Report.
- “Automatic Speech Recognition: A Deep Learning Approach”, D. Yu, L. Deng, Springer, 2014.
- “Machine Learning for Audio, Image and Video Analysis”, F. Camastra, A. Vinciarelli, Springer, 2007. pdf
Slides
14-08-2017 | Introduction to real world signals - text, speech, image, video. Learning as a pattern recognition problem. Examples. Roadmap of the course. |
slides |
||
16-08-2017 | Feature Extraction - Goals and challenges. Introduction to text processing. Bag-of-words model. Term Frequency - Inverse Document Frequency (TF-IDF). N-gram modeling. Feature extraction in audio and speech - Spectrogram. |
slides |
||
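A minimal bag-of-words / TF-IDF sketch in plain Python (the toy corpus and the whitespace tokenization are assumptions made only for illustration):

    import math
    from collections import Counter

    docs = ["the cat sat on the mat",
            "the dog sat on the log",
            "cats and dogs"]
    tokenized = [d.split() for d in docs]

    # Term frequency: word counts normalized by document length.
    tf = [{w: c / len(toks) for w, c in Counter(toks).items()} for toks in tokenized]

    # Inverse document frequency: log(#documents / #documents containing the word).
    vocab = {w for toks in tokenized for w in toks}
    idf = {w: math.log(len(docs) / sum(w in toks for toks in tokenized)) for w in vocab}

    # TF-IDF weight of each word in each document.
    tfidf = [{w: tf_d[w] * idf[w] for w in tf_d} for tf_d in tf]
    print(tfidf[0])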
21-08-2017 | Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction - orthogonality of the prediction error with past samples, optimal linear predictor, stability of the prediction filter, autoregressive (AR) process, linear prediction for an AR process |
slides |
||
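A short sketch of the autocorrelation solution to the optimal linear predictor, run on a synthetic AR(2) signal (the AR coefficients, signal length and prediction order are illustrative assumptions):

    import numpy as np

    np.random.seed(0)
    # Synthetic AR(2) process: x[n] = 1.3*x[n-1] - 0.4*x[n-2] + noise
    n = 2048
    x = np.zeros(n)
    e = np.random.randn(n)
    for t in range(2, n):
        x[t] = 1.3 * x[t - 1] - 0.4 * x[t - 2] + e[t]

    p = 2  # prediction order
    # Autocorrelation values r[0..p]
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    # Normal equations: R a = r[1..p], with R the Toeplitz autocorrelation matrix
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, r[1:])
    print("estimated predictor coefficients:", a)  # close to [1.3, -0.4]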
23-08-2017 | Basics of Digital Image Processing – Filtering, Smoothing, Edge Detection, Scale Invariant Feature Transform (SIFT). |
slides |
||
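A tiny example of image filtering with Sobel kernels in NumPy (the random array stands in for a grayscale image; a full SIFT pipeline is far more involved and is not attempted here):

    import numpy as np

    def convolve2d(img, kernel):
        # Naive 'valid' 2-D convolution (kernel is flipped for true convolution).
        kh, kw = kernel.shape
        h, w = img.shape
        k = kernel[::-1, ::-1]
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
        return out

    # Sobel kernels for horizontal and vertical gradients.
    sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    sobel_y = sobel_x.T

    img = np.random.rand(32, 32)          # stand-in for a grayscale image
    gx = convolve2d(img, sobel_x)
    gy = convolve2d(img, sobel_y)
    edges = np.hypot(gx, gy)              # gradient magnitude ~ edge strength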
28-08-2017 | Matrix and vector derivatives - definition and properties. Dimensionality reduction - Preserving maximum data variance - principal component analysis (PCA). Minimum error formulation of PCA. Residual error in PCA. Example of PCA application for hand-written digit images. PRML - Bishop (Appendix, Chapter 12) |
slides |
||
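A compact PCA sketch via the eigendecomposition of the sample covariance matrix (the toy data generation is an assumption for the example):

    import numpy as np

    def pca(X, k):
        """Project rows of X onto the top-k principal components."""
        mu = X.mean(axis=0)
        Xc = X - mu                          # center the data
        cov = Xc.T @ Xc / (X.shape[0] - 1)   # sample covariance matrix
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]    # sort by decreasing variance
        W = eigvecs[:, order[:k]]            # top-k principal directions
        return Xc @ W, W, mu

    X = np.random.randn(500, 10) @ np.random.randn(10, 10)  # toy correlated data
    Z, W, mu = pca(X, k=2)
    print(Z.shape)   # (500, 2)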
30-08-2017 | PCA for high dimensional data. Whitening and KL transform. Limitations of PCA. Class dependent dimensionality reduction using linear discriminant analysis (LDA). Fisher discriminant for 2 class case using within-class and between class matrices. Solution of LDA. Multi-class LDA, PCA versus LDA example. PRML - Bishop (Chapter 4.1.4) |
slides |
||
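A minimal two-class Fisher LDA sketch using the within-class scatter matrix (the two Gaussian clouds are invented for illustration):

    import numpy as np

    def fisher_lda_direction(X0, X1):
        """Two-class Fisher discriminant: w proportional to inv(Sw) (m1 - m0)."""
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        S0 = (X0 - m0).T @ (X0 - m0)
        S1 = (X1 - m1).T @ (X1 - m1)
        Sw = S0 + S1                       # within-class scatter matrix
        w = np.linalg.solve(Sw, m1 - m0)   # optimal projection direction
        return w / np.linalg.norm(w)

    rng = np.random.default_rng(0)
    X0 = rng.normal([0, 0], 1.0, size=(200, 2))
    X1 = rng.normal([3, 1], 1.0, size=(200, 2))
    w = fisher_lda_direction(X0, X1)
    print("projection direction:", w)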
01-09-2017 | Basics of Python programming. Installing Python, simple commands and functions. Loading speech and image data. Vectorizing, mean computation and spectrogram. |
slides code |
||
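A bare-bones magnitude spectrogram in NumPy, along the lines of what the lab code covers (the frame length, hop size and synthetic tone are assumed values, not the actual lab parameters):

    import numpy as np

    def spectrogram(x, frame_len=400, hop=160):
        """Magnitude spectrogram via a simple short-time Fourier transform."""
        window = np.hamming(frame_len)
        n_frames = 1 + (len(x) - frame_len) // hop
        frames = np.stack([x[i * hop:i * hop + frame_len] * window
                           for i in range(n_frames)])
        return np.abs(np.fft.rfft(frames, axis=1)).T   # (freq bins, frames)

    # Toy signal: 440 Hz tone at 16 kHz (stand-in for a loaded speech file).
    fs = 16000
    t = np.arange(fs) / fs
    x = np.sin(2 * np.pi * 440 * t)
    S = spectrogram(x)
    print(S.shape)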
01-09-2017 | Assignment #1. Due on 11-09-2017. Analytical part submitted in class. Coding part submitted via e9205mlsp2017 aT gmail doT com. |
HW1 | image data | speech data |
04-09-2017 | Decision theory basics. Minimum classification error rule. MAP and ML based approaches. 3 approaches to ML. Generative versus discriminative modeling. Introduction to generative modeling. Multi-variate Gaussian Distribution. PRML - Bishop (Chapter 1.5) |
slides |
||
06-09-2017 | MLE for multi-variate Gaussian. Sample mean and variance. Limitations of Gaussian modeling. Need for mixture modeling. Probability density of Gaussian Mixture Model (GMM). |
slides Future Reading |
||
11-09-2017 | MLE for GMM - Expectation Maximization (EM) algorithm. Proof of EM algorithm. Convergence properties. EM algorithm for GMM parameter estimation. Choice of hidden variable. Ref - Tutorial GMMs Proof of EM algorithm EM algorithm for GMMs |
slides | ||
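An illustrative EM loop for a GMM with full covariances (assumes NumPy/SciPy are available; there is no convergence check or log-likelihood monitoring, which a real implementation would add):

    import numpy as np
    from scipy.stats import multivariate_normal

    def gmm_em(X, K, n_iter=50, seed=0):
        """EM for a Gaussian mixture with full covariances (illustrative only)."""
        rng = np.random.default_rng(seed)
        N, D = X.shape
        means = X[rng.choice(N, K, replace=False)]
        covs = np.stack([np.cov(X.T) + 1e-6 * np.eye(D)] * K)
        weights = np.full(K, 1.0 / K)
        for _ in range(n_iter):
            # E-step: posterior responsibility of each component for each point.
            resp = np.stack([w * multivariate_normal.pdf(X, m, c)
                             for w, m, c in zip(weights, means, covs)], axis=1)
            resp /= resp.sum(axis=1, keepdims=True)
            # M-step: re-estimate weights, means and covariances.
            Nk = resp.sum(axis=0)
            weights = Nk / N
            means = (resp.T @ X) / Nk[:, None]
            for k in range(K):
                d = X - means[k]
                covs[k] = (resp[:, k, None] * d).T @ d / Nk[k] + 1e-6 * np.eye(D)
        return weights, means, covs

    X = np.vstack([np.random.randn(300, 2), np.random.randn(300, 2) + 4])
    w, mu, S = gmm_em(X, K=2)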
13-09-2017 | Summary of GMM modeling. Application of GMM for unsupervised clustering. |
slides | ||
18-09-2017 | Limitations of GMM modeling for sequence data. Markov Chains. Hidden Markov Model (HMM) definition. Three Problems in HMM. "Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6) |
|||
20-09-2017 | Evaluating the likelihood using an HMM (Problem 1), complexity reduction using the forward and backward variables. Finding the best state sequence (Problem 2) - instantaneous probability based, Viterbi algorithm for state sequence segmentation. "Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6) |
Rabiner Tutorial on HMM | ||
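A small Viterbi decoder for a discrete-observation HMM in the log domain (the two-state toy model and observation sequence are invented for the example):

    import numpy as np

    def viterbi(log_A, log_B, log_pi, obs):
        """Most likely state sequence for a discrete-observation HMM (log domain)."""
        T, S = len(obs), len(log_pi)
        delta = np.zeros((T, S))
        psi = np.zeros((T, S), dtype=int)
        delta[0] = log_pi + log_B[:, obs[0]]
        for t in range(1, T):
            scores = delta[t - 1][:, None] + log_A      # (prev state, next state)
            psi[t] = scores.argmax(axis=0)
            delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
        # Backtrack from the best final state.
        path = [delta[-1].argmax()]
        for t in range(T - 1, 0, -1):
            path.append(psi[t, path[-1]])
        return path[::-1]

    A = np.array([[0.9, 0.1], [0.2, 0.8]])      # state transition probabilities
    B = np.array([[0.7, 0.3], [0.1, 0.9]])      # emission probabilities (2 symbols)
    pi = np.array([0.6, 0.4])
    obs = [0, 0, 1, 1, 1]
    print(viterbi(np.log(A), np.log(B), np.log(pi), obs))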
23-09-2017 | Re-estimating the HMM parameters - EM algorithm for HMM (Problem 3). Q function definition and solution. Intuitions about HMM training. "Fundamentals of Speech Recog.", Rabiner and Juang (Chapter 6) EM algorithm for HMMs |
Rabiner Tutorial on HMM | ||
25-09-2017 | Non-negative matrix factorization (NMF) - problem definition, cost function and constraints, auxiliary function, proof of convergence, parameter update rules. Application to audio source separation and speech denoising. Refs - Bhiksha Raj-Tutorial     Lee-Paper |
slides | ||
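A sketch of the multiplicative (Lee-Seung) update rules for NMF with the squared-error cost (the matrix sizes, rank and iteration count are arbitrary choices):

    import numpy as np

    def nmf(V, r, n_iter=200, eps=1e-9, seed=0):
        """NMF with Lee-Seung multiplicative updates minimizing squared error."""
        rng = np.random.default_rng(seed)
        n, m = V.shape
        W = rng.random((n, r))
        H = rng.random((r, m))
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)   # update activations
            W *= (V @ H.T) / (W @ H @ H.T + eps)   # update basis vectors
        return W, H

    V = np.abs(np.random.randn(64, 100))   # e.g. a magnitude spectrogram
    W, H = nmf(V, r=5)
    print(np.linalg.norm(V - W @ H))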
04-10-2017 | First Mid-term Exam |
|||
09-10-2017 | Application of NMF. Audio separation into individual instruments, speech denoising with known and unknown sources. Linear models for regression - problem definition. Least squares regression. Maximum likelihood and least squares regression. PRML - Bishop (Chapter 2) |
|||
11-10-2017 | Overfitting and Underfitting. Regularized least squares. Linear Models for Classification. Least squares for classification. Sigmoid function and one-of-K encoding. Problems with least squares classification. PRML - Bishop (Chapter 3) |
slides |
||
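A small regularized (ridge) least-squares fit with a polynomial design matrix, showing the closed-form solution w = (Phi^T Phi + lambda*I)^{-1} Phi^T t (the sine-plus-noise data and the degree/lambda values are assumptions for the example):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 50)
    t = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(50)

    degree, lam = 9, 1e-3
    Phi = np.vander(x, degree + 1, increasing=True)       # polynomial design matrix
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(degree + 1), Phi.T @ t)
    print("weights:", w)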
16-10-2017 | Logistic regression - two-class problem. Sigmoid function and posterior probability. Logistic regression - K-class problem. Softmax function, cross-entropy error function and maximum likelihood estimation. Linear regression revisited - dual formulation. PRML - Bishop (Chapter 4,6) |
|||
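A gradient-descent sketch of K-class logistic regression with the softmax and cross-entropy loss (the data, learning rate and iteration count are toy choices):

    import numpy as np

    def softmax(Z):
        Z = Z - Z.max(axis=1, keepdims=True)      # numerical stability
        e = np.exp(Z)
        return e / e.sum(axis=1, keepdims=True)

    def train_softmax_regression(X, y, K, lr=0.1, n_iter=500):
        """Multiclass logistic regression by gradient descent on cross-entropy."""
        N, D = X.shape
        Y = np.eye(K)[y]                          # one-of-K target encoding
        W = np.zeros((D, K))
        for _ in range(n_iter):
            P = softmax(X @ W)                    # posterior class probabilities
            grad = X.T @ (P - Y) / N              # gradient of the cross-entropy
            W -= lr * grad
        return W

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(c, 1.0, (100, 2)) for c in ([0, 0], [3, 0], [0, 3])])
    y = np.repeat([0, 1, 2], 100)
    W = train_softmax_regression(np.hstack([X, np.ones((300, 1))]), y, K=3)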
21-10-2017 | Design matrix, kernel function and Gram matrix. Necessary and sufficient conditions for kernel functions (Mercer's theorem). Examples of kernel functions. PRML - Bishop (Chapter 6) |
|||
23-10-2017 | Margin of a linear classifier. Maximum margin classifier formulation. Constraints involved in the optimization. Introduction to support vector machines. PRML - Bishop (Chapter 7) |
|||
25-10-2017 | Introduction to constrained optimization. Primal and dual problems. Weak and strong duality. Necessary and sufficient conditions for strong duality for convex problems with convex constraints. KKT conditions. Introduction to convex optimization - Boyd (Chapter 5) |
Weblink to the book |
||
27-10-2017 | Application of convex optimization to SVMs. KKT conditions and solution to the problem. Definition of support vectors. Support vector machines for overlapping classes. Trade-off between regularization and training loss. PRML - Bishop (Chapter 7) |
slides |
||
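A quick illustration of a soft-margin SVM on overlapping classes, assuming scikit-learn is installed (C trades off margin width against the training-loss penalty, as in the lecture):

    import numpy as np
    from sklearn.svm import SVC

    # Toy two-class data with some overlap.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (100, 2)), rng.normal(+1, 1, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)

    # C controls the margin/loss trade-off; the RBF kernel handles non-linearity.
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(X, y)
    print("number of support vectors per class:", clf.n_support_)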
30-10-2017 | SVM application for classification. Support vector regression - formulation and KKT conditions. Introduction to neural networks. Parameter learning using gradient descent (scalar case). PRML - Bishop (Chapter 7) |
slides |
||
3-11-2017 | Gradient descent - vector case. Types of activation functions. The XOR problem with neural networks. Need for deep neural network architectures. Deep Learning - IY (Chapter 6) |
|||
4-11-2017 | Learning in neural networks. First-order methods - method of steepest descent. Curvature and Hessians. Second-order method - Newton's method. Discussion on the complexity of learning algorithms. Deep Learning - IY (Chapter 4), Neural Networks - Bishop (Chapter 4,7) |
|||
6-11-2017 | Backpropagation algorithm for learning in deep networks. Linear neuron with the MSE criterion. Disadvantages and limitations of the gradient descent algorithm. Neural Networks - Bishop (Chapter 6) |
|||
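A from-scratch backpropagation sketch for a two-layer network on the XOR problem (sigmoid units, MSE loss; the hidden-layer size and learning rate are arbitrary assumptions):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # XOR inputs and targets.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t = np.array([[0], [1], [1], [0]], dtype=float)

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)

    lr = 0.5
    for _ in range(5000):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        y = sigmoid(h @ W2 + b2)
        # Backward pass: propagate error derivatives layer by layer.
        d_y = (y - t) * y * (1 - y)
        d_h = (d_y @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_y)
        b2 -= lr * d_y.sum(axis=0)
        W1 -= lr * (X.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)

    print(y.round(3).ravel())   # should move towards [0, 1, 1, 0]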
08-11-2017 | Second Mid-term Exam. |
|||
10-11-2017 | Types of non-linearities used. Cost functions for regression and classification. Output activation functions used in regression and classification. Equivalence between regression with MSE and classification with CE using softmax output activations. Neural Networks - Bishop (Chapter 6) |
|||
13-11-2017 | Learning and generalization issues in neural networks. Decomposing the MSE into bias and variance. Discussion on the bias-variance tradeoff. Improving learning with regularization. Neural Networks - Bishop (Chapter 9) |
slides |
||
14-11-2017 | Assignment #5. Due on 24-11-2017. Analytical part submitted in class. Coding part submitted via e9205mlsp2017 aT gmail doT com. |
HW5 |
Data For HW5 |
|
15-11-2017 | L2 weight regularization, early stopping and training with added noise in the input data. Committees of neural networks. System combination methods and optimization. Neural Networks - Bishop (Chapter 9) |
slides |
||
17-11-2017 | Improving the speed of convergence of gradient descent with momentum. Convolutional neural networks. Kernels, pooling and sub-sampling. Comparison of CNNs and DNNs. Weight sharing and parameter learning. Deep Learning - IY (Chapter 9) |
|||
18-11-2017 | Understanding the learning in deep layers of CNNs. Recurrent networks. Backpropagation in time for RNN parameter learning. Various RNN architectures - teacher forcing, sequence-to-vector and bi-directional RNNs. Deep Learning - IY (Chapter 10) |
slides |
||
20-11-2017 | Long Short-Term Memory (LSTM) networks. Deep unsupervised learning - Restricted Boltzmann Machines (RBMs). Conditional independence in RBMs. Learning in RBMs with maximum likelihood. Positive and negative phases of the log-likelihood gradient. Gibbs sampling and contrastive divergence approximations. Deep Learning - IY (Chapter 18,20) |
slides |
||
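An illustrative CD-1 training loop for a binary (Bernoulli) RBM (the random binary "data", layer sizes and learning rate are assumptions; real training would use actual data and monitor reconstruction error):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    n_visible, n_hidden, lr = 16, 8, 0.05
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)

    V = (rng.random((100, n_visible)) > 0.5).astype(float)   # toy binary "data"

    for _ in range(100):
        # Positive phase: hidden probabilities given the data.
        ph = sigmoid(V @ W + b_h)
        h = (rng.random(ph.shape) < ph).astype(float)
        # Negative phase: one Gibbs step to get a reconstruction.
        pv = sigmoid(h @ W.T + b_v)
        v_neg = (rng.random(pv.shape) < pv).astype(float)
        ph_neg = sigmoid(v_neg @ W + b_h)
        # Approximate log-likelihood gradient and parameter update (CD-1).
        W += lr * (V.T @ ph - v_neg.T @ ph_neg) / len(V)
        b_v += lr * (V - v_neg).mean(axis=0)
        b_h += lr * (ph - ph_neg).mean(axis=0)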