E9:205 Machine Learning for Signal Processing

Timing: MW 3:30 - 5:00 pm
Location: EE C241 (MMCR, 1st Floor)
Instructor: Sriram Ganapathy
Office: C 334 (2nd Floor)
Email: sriram aT ee doT iisc doT ernet doT in
Teaching Assistant: Achuth Rao
Lab: C 326 (2nd Floor)
Email: achuthraomv aT gmail doT com
TA Hours: Thu 3-5 pm

Announcements

  • Final exam: Dec 7, 1:30 pm - 4:30 pm, MMCR
          - Open book, open notes. No laptops/cellphones allowed.
  • Take-home exam 2 posted here -
  • Project evaluation: Dec 19, 9:30 am - 1:00 pm, MMCR
          - Single-person projects: max 10 min presentation, max 5 slides.
          - Multi-person projects: both members presenting; max 15 min presentation, max 8 slides.
          - Project components: implementation of a baseline paper, comparison of results with the baseline paper, and novel directions improving the baseline.
          - Project report (single column, max 5 pages): due Dec 17.
          - Project slides: emailed by Dec 18.
          - Mark distribution (total 30 marks): mid-term evaluation (7), final presentation (5), report (5), baseline implementation (7), novelty (6).

Syllabus

  • Introduction to real world signals - text, speech, image, video.
  • Feature extraction and front-end signal processing - information-rich representations, robustness to noise and artifacts, signal enhancement, bio-inspired feature extraction.
  • Basics of pattern recognition. Generative modeling - Gaussian and Gaussian mixture models, hidden Markov models, factor analysis and latent variable models.
  • Discriminative modeling - support vector machines, neural networks and back propagation.
  • Introduction to deep learning - convolutional and recurrent networks, pre-training and practical considerations in deep learning, understanding deep networks.
  • Clustering methods and decision trees. Feature selection methods.
  • Applications in computer vision and speech recognition.

Grading Details

Assignments: 15%
Midterm exam: 20%
Final exam: 35%
Project: 30%

Pre-requisites

  • Random Processes / Probability and Statistics
  • Linear Algebra/Matrix Theory
  • Basic Digital Signal Processing/Signals and Systems

Textbooks

References

  • “Deep Learning: Methods and Applications”, Li Deng, Microsoft Technical Report.
  • “Automatic Speech Recognition: A Deep Learning Approach”, D. Yu and L. Deng, Springer, 2014.
  • “Machine Learning for Audio, Image and Video Analysis”, F. Camastra and A. Vinciarelli, Springer, 2007. pdf

Slides



03-08-2016 Introduction to real world signals - text, speech, image, video. Learning as a pattern recognition problem. Examples. Roadmap of the course.

slides
08-08-2016 Types of learning methods, feature extraction for speech and audio, short-time Fourier transform (STFT), narrowband and wideband spectrograms, time-frequency resolution.
Refs - Dan Ellis-Tutorial     Ricardo-Tutorial
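
A minimal NumPy sketch of the short-time analysis above; the window length win (chosen here only for illustration) sets the narrowband/wideband trade-off:

    import numpy as np

    def spectrogram(x, win=256, hop=128):
        """Magnitude STFT: short win -> wideband (good time resolution),
        long win -> narrowband (good frequency resolution)."""
        w = np.hanning(win)
        n_frames = 1 + (len(x) - win) // hop
        frames = np.stack([x[i*hop : i*hop + win] * w
                           for i in range(n_frames)])
        return np.abs(np.fft.rfft(frames, axis=1))   # (frames, win//2 + 1)

    # Example: a 1 kHz tone sampled at 8 kHz
    t = np.arange(8000) / 8000.0
    S = spectrogram(np.sin(2 * np.pi * 1000 * t))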

slides
10-08-2016 Uncorrelated noise in speech/audio, non-negative matrix factorization (NMF) - problem definition, cost function and constraints, auxiliary function, proof of convergence, parameter update rules. Application to audio source separation and speech denoising.
Refs - Bhiksha Raj-Tutorial     Lee-Paper
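
A minimal NumPy sketch of the multiplicative update rules for the Frobenius-norm NMF cost (as derived via the auxiliary function in the Lee paper); shapes and names are illustrative:

    import numpy as np

    def nmf(V, r, n_iter=200, eps=1e-9):
        """Factor non-negative V (F x T) as W (F x r) @ H (r x T) by
        minimizing ||V - W H||_F^2 with multiplicative updates."""
        F, T = V.shape
        rng = np.random.default_rng(0)
        W = rng.random((F, r)) + eps
        H = rng.random((r, T)) + eps
        for _ in range(n_iter):
            H *= (W.T @ V) / (W.T @ W @ H + eps)   # update rule for H
            W *= (V @ H.T) / (W @ H @ H.T + eps)   # update rule for W
        return W, H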

slides
17-08-2016 Linear prediction - orthogonality of the prediction error with past samples, optimal linear predictor, Yule-Walker equations, energy of the prediction error, stability of the prediction filter, autoregressive (AR) processes, linear prediction for AR processes.
Ref - Theory of LP - Vaidyanathan [Chap - 2, 5.3, A, B]
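
A small NumPy sketch of the autocorrelation method: form the Yule-Walker normal equations and solve for the optimal predictor (the AR(2) example at the end is illustrative):

    import numpy as np

    def lpc(x, p):
        """Order-p linear predictor via the Yule-Walker equations."""
        x = x - np.mean(x)
        r = np.correlate(x, x, mode='full')[len(x)-1 : len(x)+p]   # lags 0..p
        R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
        a = np.linalg.solve(R, r[1:p+1])   # predictor coefficients
        err = r[0] - a @ r[1:p+1]          # prediction error energy
        return a, err

    # Fit an AR(2) process: a should approach [0.75, -0.5]
    rng = np.random.default_rng(0)
    e, x = rng.standard_normal(10000), np.zeros(10000)
    for n in range(2, 10000):
        x[n] = 0.75 * x[n-1] - 0.5 * x[n-2] + e[n]
    a, err = lpc(x, 2)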
22-08-2016
Normal equations for the autoregressive process. Power spectral density (PSD). Autoregressive modeling of the PSD. Applications of linear prediction.

First assignment - Non-negative matrix factorization, linear prediction, applications to face images and noisy speech.
Due Date - 02-09-2016 (Noon)
slides


HW1.pdf
images.zip speech.zip
24-08-2016
Matrix derivative rules. Dimensionality Reduction I - Principal component analysis (PCA), maximum variance formulation, minimum error formulation. Whitening and standardization, PCA for high dimensional data. Linear discriminant analysis (LDA), Fisher discriminant for two classes.
Ref - PRML - Bishop
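
A compact NumPy sketch of PCA via eigendecomposition of the sample covariance, including the whitening step mentioned above (names are illustrative):

    import numpy as np

    def pca(X, k):
        """Project rows of X (N x D) onto the top-k principal directions."""
        mu = X.mean(axis=0)
        Xc = X - mu
        C = Xc.T @ Xc / (len(X) - 1)       # sample covariance (D x D)
        vals, vecs = np.linalg.eigh(C)     # ascending eigenvalues
        idx = np.argsort(vals)[::-1][:k]   # top-k (maximum variance)
        U, lam = vecs[:, idx], vals[idx]
        Z = Xc @ U                         # PCA projection
        Zw = Z / np.sqrt(lam)              # whitened (unit variance)
        return Z, Zw, U, mu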

29-08-2016 LDA for multiple classes, LDA formulation in lower dimensional subspace. Applications of PCA. Distinction between PCA and LDA. Introduction to feature extraction from image data - Wavelet transform, mother wavelet, scaling and shifting, Continuous and Dyadic Wavelet Transform.
Ref - Introduction to Wavelets and Wavelet Transforms - Burrus et al.

31-08-2016 Dyadic wavelet transform. Scaling and wavelet functions. Approximation and detail. Wavelet decomposition. Application to 1-D signals.
Ref - Selected Pages - Burrus et al. (Chap. 2)
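
A minimal sketch of a dyadic (Haar) decomposition into approximation and detail coefficients; it assumes len(x) is divisible by 2**levels:

    import numpy as np

    def haar_dwt(x, levels=3):
        """Returns [detail_1, ..., detail_L, approximation_L]."""
        coeffs, a = [], np.asarray(x, dtype=float)
        for _ in range(levels):
            a, d = ((a[0::2] + a[1::2]) / np.sqrt(2.0),   # approximation
                    (a[0::2] - a[1::2]) / np.sqrt(2.0))   # detail
            coeffs.append(d)
        coeffs.append(a)
        return coeffs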

handout
05-09-2016 Interpreting wavelet approximation and detail coefficients. Filter bank approach to wavelets. Extension to the 2-D wavelet transform, application to images.
Ref - Tutorial on 2-D Wavelets
Image Denoising

handout
07-09-2016 Decision Theory - Inference and decision rule, mis-classification error, maximum posterior decision rule, expected loss, minimum mean square error decision rule for regression. Three approaches to inference and decision - Generative modeling, Discriminative modeling and Discriminant Functions.
Ref - PRML - Bishop (Sec. 1.5)

12-09-2016 Introduction to generative modeling. Gaussian distribution. Parameter estimation using maximum likelihood (MLE). Sample mean and sample covariance. Limitations of Gaussian modeling. Gaussian mixture model (GMM) density function.

slides
14-09-2016 MLE for GMM - Expectation Maximization (EM) algorithm. Proof of EM algorithm. Convergence properties. EM algorithm for GMM parameter estimation. Choice of hidden variable. Application of GMMs for unsupervised data clustering.
Ref - Tutorial GMMs
Proof of EM algorithm
EM algorithm for GMMs
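
A minimal NumPy sketch of EM for a diagonal-covariance GMM, with the E-step responsibilities and M-step updates outlined above:

    import numpy as np

    def gmm_em(X, K, n_iter=50, seed=0):
        """Fit a K-component diagonal GMM to X (N x D) by EM."""
        N, D = X.shape
        rng = np.random.default_rng(seed)
        mu = X[rng.choice(N, K, replace=False)]   # init means from data
        var = np.ones((K, D)) * X.var(axis=0)
        pi = np.full(K, 1.0 / K)
        for _ in range(n_iter):
            # E-step: log pi_k + log N(x | mu_k, var_k), then normalize
            logp = (np.log(pi) - 0.5 * (((X[:, None, :] - mu) ** 2) / var
                    + np.log(2 * np.pi * var)).sum(-1))        # (N, K)
            logp -= logp.max(axis=1, keepdims=True)
            gamma = np.exp(logp)
            gamma /= gamma.sum(axis=1, keepdims=True)          # responsibilities
            # M-step: weighted counts, means, variances, weights
            Nk = gamma.sum(axis=0)
            mu = (gamma.T @ X) / Nk[:, None]
            var = (gamma.T @ X**2) / Nk[:, None] - mu**2 + 1e-6
            pi = Nk / N
        return pi, mu, var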

slides
19-09-2016 Markov chains - sequence modeling with hidden Markov models (HMM). Definition of HMM parameters. Three problems in HMMs: (i) evaluation, (ii) inference and (iii) training. Direct computation of the likelihood. Forward and backward variable recursions.
Ref - Rabiner, Juang, "Fundamentals of speech recognition", Chap 6
Ref - SP Magazine Article - Rabiner
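
A short sketch of the scaled forward recursion for a discrete-observation HMM (scaling prevents numerical underflow; names are illustrative):

    import numpy as np

    def hmm_forward(A, B, pi, obs):
        """A: (S,S) transitions, B: (S,V) emissions, pi: (S,) initial.
        Returns log p(obs) via the scaled forward variables."""
        alpha = pi * B[:, obs[0]]
        c = alpha.sum(); alpha = alpha / c
        loglik = np.log(c)
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]    # forward recursion
            c = alpha.sum(); alpha = alpha / c
            loglik += np.log(c)
        return loglik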

21-09-2016 Solution to problem (ii) in HMM - Viterbi algorithm. HMM parameter estimation with EM algorithm. Estimation of Q function and iterative model update.
Ref - Tutorial HMMs
Rabiner, Juang, "Fundamentals of speech recognition", Chap 6
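
A minimal log-domain Viterbi sketch for the same discrete-HMM setup (assumes strictly positive parameters):

    import numpy as np

    def viterbi(A, B, pi, obs):
        """Most likely state sequence via dynamic programming."""
        logA, logB = np.log(A), np.log(B)
        T, S = len(obs), len(pi)
        delta = np.log(pi) + logB[:, obs[0]]
        psi = np.zeros((T, S), dtype=int)            # back-pointers
        for t in range(1, T):
            scores = delta[:, None] + logA           # (prev, next)
            psi[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + logB[:, obs[t]]
        path = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):                # backtrack
            path.append(int(psi[t, path[-1]]))
        return path[::-1]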

Second assignment (Part A) - PCA/LDA, ML, Gaussian and GMM, HMM
Due Date - 03-10-2016 (Class)





HW2-a.pdf
26-09-2016 First Mid-term Exam


28-09-2016
Discussion on first mid-term exam. Topics for mini-projects
Project list
03-10-2016 Hidden Markov Models with GMM observation densities. Application of EM algorithm for GMM-HMMs. Parameter estimation. Application of HMMs in video analysis. Dimensionality reduction continued - latent variable models.
Refs - GMM-HMM - "Fundamentals of Speech Recognition", Rabiner, Chap 6.
Slides from N. Ramanathan - Video analysis with HMMs

05-10-2016 Probabilistic PCA (PPCA) - generative model description. Log-likelihood computation, parameter estimation using direct optimization. EM algorithm for PPCA. Extension to factor analysis. Summary of generative modeling. Introduction to discriminative modeling - non-linear regression with kernels.
Ref - PRML - Bishop (Sec. 12.2, 3.1)
Paper - "PPCA", Tipping et al

12-10-2016 Recap of generative versus discriminative modeling. Non-linear regression with regularization. Dual problem definition and solution with kernels. Properties of kernel functions. Constructing kernels from basic blocks. Sparse kernel machines.
Ref - PRML - Bishop (Sec. 3.3, 6)
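
A minimal sketch of the dual solution for regularized (kernel ridge) regression with an RBF kernel; lam and gamma are illustrative hyperparameters:

    import numpy as np

    def kernel_ridge(X, y, X_test, lam=1e-2, gamma=1.0):
        """alpha = (K + lam I)^{-1} y;  f(x) = sum_n alpha_n k(x_n, x)."""
        def rbf(A, B):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)
        K = rbf(X, X)
        alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
        return rbf(X_test, X) @ alpha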

slides
17-10-2016 Classifiers with kernels. Definition of margin. Maximum margin classifiers. Introduction to convex optimization with constraints. Primal and dual problems. Weak and strong duality. Karush-Kuhn-Tucker (KKT) conditions for strong duality. Solving the dual problem for maximum margin classifiers. Definition of support vectors.
Ref - PRML - Bishop (Chap 7.1)
Book (Chap 5) - "Convex Optimization", Boyd and Vandenberghe
Second assignment (Part B) - Implementing PCA/LDA, GMM and HMM
Due Date - 28-10-2016 (Noon)





HW2-b.pdf
19-10-2016 Maximum margin classifiers for overlapping class distributions, concept of slack variables. Lagrangian and dual form. KKT conditions for solving the optimal parameters. Sequential minimal optimization algorithm - analytic solution to two variable constrained optimization problem, heuristics for choosing the two variables. Estimating the bias parameter in SVM.
Ref - PRML - Bishop (Chap 7.1.1)
SMO paper - J. Platt et al.
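
For experimentation, scikit-learn's SVC (built on LIBSVM, whose solver is an SMO-type method) exposes the quantities above; this toy example is only a sketch:

    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-1, 1, (50, 2)),    # class 0
                   rng.normal(+1, 1, (50, 2))])   # class 1
    y = np.array([0] * 50 + [1] * 50)

    clf = SVC(kernel='rbf', C=1.0, gamma=0.5)     # C penalizes slack
    clf.fit(X, y)
    print(clf.n_support_)                         # support vectors per class
    print(clf.intercept_)                         # the bias parameter b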


24-10-2016 Summary of support vector machines - problem definition, primal and dual formulations, kernel space transformation, solutions and implications, applications of SVMs in cancer diagnosis and text categorization. Support vector regression - slack variables and dual formulation.
Ref - PRML - Bishop (Chap 7.1.4)
NYU Bio medicine - Tutorial

slides
31-10-2016 Introduction to neural networks. Illustration with the XOR problem - need for hidden layer(s) with non-linear activations. Optimization methods for neural networks. First-order Taylor series - gradient descent method. Curvature and second derivatives. Jacobian and Hessian matrices. Newton's method. Stochastic gradient descent.
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 6, Chap 4.3)
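
A tiny numerical illustration of the first- versus second-order view: gradient descent iterates on a quadratic, while Newton's method solves it in one step (H and b are made up for the example):

    import numpy as np

    # f(w) = 0.5 w^T H w - b^T w;  gradient = H w - b;  Hessian = H
    H = np.array([[3.0, 0.5], [0.5, 1.0]])
    b = np.array([1.0, -1.0])
    grad = lambda w: H @ w - b

    w = np.zeros(2)
    for _ in range(100):                # gradient descent, fixed step size
        w = w - 0.1 * grad(w)

    w_newton = np.linalg.solve(H, b)    # Newton: exact for a quadratic
    print(w, w_newton)                  # both approach H^{-1} b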

02-11-2016 Neural networks estimate posterior probabilities. Architecture considerations - cost function (mean square error, cross entropy), output units (linear, sigmoidal or softmax), hidden unit activations (ReLU and variants, tanh or sigmoidal).
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 6)


07-11-2016 Universal approximation properties of NNs. Need for multiple hidden layers. Depth versus width. Mechanism of representation learning in deep networks. Parameter learning in deep networks - back propagation. Equivalence in learning DNNs with linear output activation and MSE versus softmax activations with cross entropy error.
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 6)
ASR - DL approach, D. Yu, Li Deng (Chap 4).
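
A minimal NumPy sketch of back propagation for a one-hidden-layer network with softmax outputs and cross-entropy loss, where the output-layer delta reduces to p - y (the XOR usage below is illustrative):

    import numpy as np

    def train_step(X, Y, W1, b1, W2, b2, lr=0.1):
        """One gradient step: tanh hidden layer, softmax + cross entropy."""
        Hid = np.tanh(X @ W1 + b1)                       # forward pass
        logits = Hid @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)      # stable softmax
        P = np.exp(logits); P /= P.sum(axis=1, keepdims=True)
        dlogits = (P - Y) / len(X)                       # softmax+CE gradient
        dW2, db2 = Hid.T @ dlogits, dlogits.sum(0)
        dpre = (dlogits @ W2.T) * (1 - Hid ** 2)         # back through tanh
        dW1, db1 = X.T @ dpre, dpre.sum(0)
        return (W1 - lr * dW1, b1 - lr * db1,
                W2 - lr * dW2, b2 - lr * db2)

    # Learn XOR: 2 inputs, 8 hidden units, 2 classes
    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(0, 0.5, (2, 8)), np.zeros(8)
    W2, b2 = rng.normal(0, 0.5, (8, 2)), np.zeros(2)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    Y = np.eye(2)[[0, 1, 1, 0]]                          # one-hot XOR labels
    for _ in range(2000):
        W1, b1, W2, b2 = train_step(X, Y, W1, b1, W2, b2)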


09-11-2016 Summary of NN learning and architecture. Pseudo-code for back propagation. Other considerations - data preprocessing, model initialization. Underfitting versus overfitting. Improving generalization with regularization. L2 regularization. Quadratic approximation analysis of L2 regularization.
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 6, 7)


slides
14-11-2016 L1 regularization. Multi-task learning. Early stopping. Equivalence between L2 regularization and early stopping. Bagging and ensemble averaging. Dropout.
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 7)
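
A one-function sketch of inverted dropout: zero units with probability 1 - p_keep at training time and rescale, so nothing changes at test time:

    import numpy as np

    def dropout(H, p_keep=0.5, train=True, rng=None):
        """Inverted dropout on a layer of activations H."""
        if not train:
            return H                       # identity at test time
        rng = rng or np.random.default_rng()
        mask = rng.random(H.shape) < p_keep
        return H * mask / p_keep           # rescale kept units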

16-11-2016 Convolutional neural networks. Filtering and hierarchical sparsity. Pooling and striding. Deep convolutional networks.
Discussion on second mid-term exam
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 9)
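
A small sketch of a 'valid' strided 2-D convolution (implemented as cross-correlation, the CNN convention); the vertical-edge filter is illustrative:

    import numpy as np

    def conv2d(img, kernel, stride=1):
        """Slide kernel over img; pooling follows the same window pattern."""
        H, W = img.shape
        kh, kw = kernel.shape
        return np.array([[np.sum(img[i:i+kh, j:j+kw] * kernel)
                          for j in range(0, W - kw + 1, stride)]
                         for i in range(0, H - kh + 1, stride)])

    edge = np.array([[1, 0, -1]] * 3, dtype=float)   # vertical-edge filter
    img = np.zeros((8, 8)); img[:, :4] = 1.0
    print(conv2d(img, edge))                         # responds at the edge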

21-11-2016 Deep Generative Models - Restricted Boltzmann Machine, model definition, conditional independence. Relationship with sigmoidal activation. Parameter learning in RBM - positive and negative phase, approximation with sampling methods, contrastive divergence algorithm. Deep Belief Networks (DBNs).
Ref - DLB (Deep Learning Book) - Goodfellow, Bengio (Chap 18,20)
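
A minimal sketch of one contrastive-divergence (CD-1) update for a binary RBM, with the positive and negative phases marked (shapes and names are illustrative):

    import numpy as np

    def cd1_step(V0, W, b, c, lr=0.1, rng=None):
        """V0: (N, V) binary data; W: (V, H); b: (V,) visible and
        c: (H,) hidden biases. One Gibbs step for the negative phase."""
        rng = rng or np.random.default_rng(0)
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
        ph0 = sigmoid(V0 @ W + c)                   # p(h=1 | v0), positive phase
        h0 = (rng.random(ph0.shape) < ph0) * 1.0    # sample hidden units
        pv1 = sigmoid(h0 @ W.T + b)                 # reconstruction p(v=1 | h0)
        ph1 = sigmoid(pv1 @ W + c)                  # negative phase
        n = len(V0)
        W += lr * (V0.T @ ph0 - pv1.T @ ph1) / n    # positive - negative stats
        b += lr * (V0 - pv1).mean(axis=0)
        c += lr * (ph0 - ph1).mean(axis=0)
        return W, b, c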

23-11-2016 Gaussian Restricted Boltzmann Machine (GRBM). Relationship with GMMs. Summary of Deep learning methods.
Ref - ASR - Deep Learning Approach (Yu and Deng) - (Chap 5)

slides
29-11-2016 Take Home Practice Exam

Q-paper