Publications

Last Updated: November 2025.

An up-to-date list of all publications can be found on the Google Scholar profile.

Thesis

Investigating Neural Mechanisms of Word Learning and Speech Perception

A. Soman

Indian Institute of Science, April 2024.

Graph Clustering Approaches for Speaker Diarization of Conversational Speech

P. Singh

Indian Institute of Science, Feb. 2024.

Dereverberation of Speech Using Autoregressive Models of Sub-band Envelopes

Anurenjan P.R.

Indian Institute of Science, Sep. 2023.

Supervised Learning Approaches for Language and Speaker Recognition

S. Ramoji

Indian Institute of Science, July 2023.

Neural Representation Learning for Speech and Audio Signals

P. Agrawal

Indian Institute of Science, Jan. 2021.

Signal Analysis using Autoregressive Models of Amplitude Modulation

S. Ganapathy

Johns Hopkins University, Jan. 2012.

Tutorials, Keynotes, Defense and Colloquia

Investigating Neural Mechanisms of Word Learning and Speech Perception

A. Soman

Thesis Defense Talk, February 2024.

Graph Clustering Approaches for Speaker Diarization of Conversational Speech

P. Singh

Thesis Defense Talk, February 2024.

Dereverberation of Speech Using Autoregressive Models of Sub-band Envelopes

Anurenjan P.R.

Thesis Defense Talk, September 2023.

Graph Clustering Approaches for Speaker Diarization of Conversational Speech

P. Singh

Thesis Colloquium Talk, July 2023.

Investigating Neural Mechanisms of Word Learning and Speech Perception

A. Soman

Thesis Colloquium Talk, July 2023.

Supervised Approaches for Language and Speaker Recognition

S. Ramoji

Thesis Defense Talk, July 2023.

Neural Representation Learning for Speech and Audio Signals

P. Agrawal

Thesis Defense Talk, January 2021.

Neural Representation Learning of Speech and Audio Signals

P. Agrawal

Thesis Colloquium Talk, July 2020.

Speaker and Language Recognition - From Laboratory Technologies to the Wild

S. Ganapathy

Invited Perspective Keynote Talk, Interspeech 2018.

The Art and Science of Speech Feature Engineering

S. Ganapathy and S. Thomas

Interspeech, Singapore, Sept. 2014.

Journals

Leveraging Content and Acoustic Representations for Speech Emotion Recognition

Soumya Dutta, Sriram Ganapathy

IEEE Transactions on Audio, Speech and Language Processing, 2025.

End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization

Prachi Singh, Sriram Ganapathy

IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024.

Gradient-free Post-hoc Explainability Using Distillation Aided Learnable Approach

Debarpan Bhattacharya, Amir H. Poorjam, Deepak Mittal, Sriram Ganapathy

IEEE Journal of Selected Topics in Signal Processing (JSTSP) - Special Series on AI in Signal & Data Science, 2024.

Summary of the DISPLACE challenge 2023-DIarization of SPeaker and LAnguage in Conversational Environments

Baghel, Shikha, Shreyas Ramoji, Somil Jain, Pratik Roy Chowdhuri, Prachi Singh, Deepu Vijayasenan, and Sriram Ganapathy

Elsevier Speech Communication 161 (2024): 103080.

Representation Learning With Hidden Unit Clustering For Low Resource Speech Applications

Varun Krishna, Tarun Sai, Sriram Ganapathy

IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023.

Speech Dereverberation with Frequency Domain Autoregressive Modeling

Anurenjan Purushothaman, Debottam Dutta, Rohit Kumar, Sriram Ganapathy

IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023.

Multi-Modal Point-of-Care Diagnostics for COVID-19 Based on Acoustics and Symptoms

S. R. Chetupalli, P. Krishnan, N. K. Sharma, A. Muguli, R. Kumar, V. Nanda, L. M. Pinto, P. K. Ghosh, S. Ganapathy

IEEE Journal of Translational Engineering in Health and Medicine, 2023.

Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection

D. Bhattacharya, N. K. Sharma, D. Dutta, S. R. Chetupalli, P. Mote, S. Ganapathy, S. Nori, S. Gonuguntla, M. Alagesan

Nature Scientific Data, 2023.

PLDA inspired Siamese networks for speaker verification

Ramoji, Shreyas, Prashant Krishnan, and Sriram Ganapathy

Computer Speech and Language, 2022.

ERP Evidences of Rapid Semantic Learning In Foreign Language Word Comprehension

Akshara Soman, Prathibha Ramachandran, and Sriram Ganapathy

Frontiers in Neuroscience, 2022.

Towards sound based testing of COVID-19 - Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge

Neeraj Kumar Sharma, Ananya Muguli, Prashant Krishnan, Rohit Kumar, Srikanth Raj Chetupalli, Sriram Ganapathy

Elsevier Journal on Computer, Speech and Language, 2022.

Dereverberation of Autoregressive Envelopes for Far-field Speech Recognition

Purushothaman, A., Sreeram, A., Kumar, R., & Ganapathy, S.

Elsevier Journal on Computer, Speech and Language, 2021.

Deep Correlation Analysis for Audio-EEG Decoding

J. R. Katthi, S. Ganapathy

IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2021.

Self-supervised Representation Learning With Path Integral Clustering For Speaker Diarization

P. Singh, S. Ganapathy

IEEE Transactions on Audio, Speech and Language Processing, 2021.

Acoustic and linguistic features influence talker change detection

N. Sharma, V. Krishnamohan, S. Ganapathy, A. Gangopadhyay, L. Fink

Journal of Acoustical Society of America (JASA) - Express Letters, EL414, Oct. 2020.

Interpretable Representation Learning for Speech and Audio Signals Based on Relevance Weighting

P. Agrawal, S. Ganapathy

IEEE Transactions on Audio, Speech and Language Processing, 2020.

Automatic Speaker Profiling from Short Duration Speech Data

Shareef Babu Kalluri, Deepu Vijayasenan and S. Ganapathy

Elsevier Speech Communication, April 2020.

Towards Relevance and Sequence Modeling in Language Recognition

B. Padi, A. Mohan and S. Ganapathy

IEEE Transactions on Audio, Speech and Language Processing, March, 2020.

Supervised I-vector Modeling for Language and Accent Recognition

S. Ramoji and S. Ganapathy

Elsevier Journal on Computer, Speech and Language, Oct. 2019.

A Study on Pairwise LDA for X-vector based Speaker Recognition

A. Kanagasundaram, S. Sridharan, S. Ganapathy and C. Fookes

IET Electronics Letters, 2019.

Modulation Filter Learning Using Deep Variational Networks for Robust Speech Recognition

P. Agrawal and S. Ganapathy

IEEE Journal of Selected Topics in Signal Processing (J-STSP), Special Issue on Data Science: Machine Learning for Audio Signal Processing, April 2019.

An EEG Study On The Brain Representations in Language Learning

A. Soman, Madhavan C. R., K. Sarkar, and S. Ganapathy

IOP Journal on Biomedical Physics and Engineering Express, 5(2), 25041, (2019).

Talker change detection: A comparison of human and machine performance

N. Sharma, S. Ganesh, S. Ganapathy and L. Holt

Journal of Acoustical Society of America, December 2018.

Convolutional Neural Network based Robust Denoising of Low-Dose Computed Tomography Perfusion Maps

V. S. Kadimesetty, S. Gutta, S. Ganapathy, and P. K. Yalavarthy

IEEE Transactions on Radiation and Plasma Medical Sciences, August 2018.

Deep Neural Network Based Bandwidth Enhancement of Photoacoustic Data

S. Gutta, V.S. Kadimesetty, S. K. Kalva, M. Pramanik, S. Ganapathy and P. K. Yalavarthy

Journal of Biomedical Optics, October 2017.

Increasing the Robustness of CNN Acoustic Models using ARMA Spectrogram Features and Channel Dropout

G. Kovacs, L. Toth, D. V. Compernolle and S. Ganapathy

Elsevier Pattern Recognition Letters, September 2017.

Unsupervised Modulation Filter Learning for Noise-Robust Speech Recognition

P. Agrawal and S. Ganapathy

Journal of Acoustical Society of America, Sept. 2017.

Multi-variate Autoregressive Spectrogram Modeling for Noisy Speech Recognition

S. Ganapathy

IEEE Signal Processing Letters, July 2017.

Auditory Motivated Front-end for Noisy Speech Using Spectro-temporal Modulation Filtering

S. Ganapathy and M. Omar

Journal of Acoustical Society of America, EL343-349, Vol. 136(5), Nov. 2014.

Robust Feature Extraction Using Modulation Filtering of Autoregressive Models

S. Ganapathy, H. Mallidi and H. Hermansky

IEEE Transactions on Audio, Speech and Language Processing, Vol. 22(8), pp. 1285-1295, Aug. 2014.

Enhancing Frequency Shifted Speech Signals in Single Side Band Communication

S. Ganapathy and J. Pelecanos

IEEE Signal Processing Letters, Vol. 20(12), pp. 1231-1234, Oct. 2013.

Temporal Resolution Analysis in Frequency Domain Linear Prediction

S. Ganapathy and H. Hermansky

Journal of Acoustical Society of America, EL436-442, Vol. 132(5), Oct. 2012.

Temporal envelope compensation for robust phoneme recognition using modulation spectrum

S. Ganapathy, S. Thomas and H. Hermansky

Journal of Acoustical Society of America, Vol. 128(6), pp. 3769-3780, Dec. 2010.

Autoregressive Models Of Amplitude Modulations In Audio Compression

S. Ganapathy, P. Motlicek and H. Hermansky

IEEE Transactions on Audio, Speech and Language Processing, Vol. 18(6), pp.1624-1631, Aug. 2010.

Wide-Band Audio Coding based on Frequency Domain Linear Prediction

P. Motlicek, S. Ganapathy, H. Hermansky and H. Garudadri

EURASIP Journal on Audio, Speech, and Music Processing, Vol. 2010 (3), pp. 1-14, Jan. 2010.

Modulation Frequency Features For Phoneme Recognition In Noisy Speech

S. Ganapathy, S. Thomas and H. Hermansky

Journal of Acoustical Society of America, EL8-12, Vol. 125(1), Jan. 2009.

Recognition Of Reverberant Speech Using Frequency Domain Linear Prediction

S. Thomas, S. Ganapathy and H. Hermansky

IEEE Signal Processing Letters, Vol. 15, pp. 681-684, Dec 2008.

Conferences

ULTRAS - Unified Learning of Transformer Representations for Audio and Speech Signals

Ameenudeen P E, Charumathi Narayanan, S. Ganapathy

ASRU 2025, Honolulu, HI, USA.

FESTA: Functionally Equivalent Sampling for Trust Assessment of Multimodal LLMs

D. Bhattacharya, Apoorva Kulkarni, S. Ganapathy

EMNLP 2025, Suzhou, China. [A* Conference]

Towards Unbiased Evaluation of Time-series Anomaly Detector

D. Bhattacharya, Sumanta Mukherjee, Chandramouli Kamanchi, Vijay Ekambaram, Arindam Jati, Pankaj Dayama

ICASSP 2025, Hyderabad, India.

ABHINAYA - A System for Speech Emotion Recognition In Naturalistic Conditions Challenge

S. Dutta, S. Balaji, R. Varada, V. Salinamakki, S. Ganapathy

Interspeech 2025, Netherlands.

Benchmarking and Confidence Evaluation of LALMs For Temporal Reasoning

D. Bhattacharya, A. Kulkarni, S. Ganapathy

Interspeech 2025, Netherlands.

Spoken Language Understanding on Unseen Tasks With In-Context Learning

N. Agarwal, S. Ganapathy

Interspeech 2025, Netherlands.

LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations

Soumya Dutta, Sriram Ganapathy

ICASSP 2025, Hyderabad, India.

Identifying and Mitigating Mismatched Language Code in Multilingual ASR

J. Kim, S. Mavandadi, K. Audhkhasi, S. Bharadwaj, B. Farris, T. Chen, B. Ramabhadran, S. Ganapathy

ICASSP 2025, Hyderabad, India.

Enhancing Customer Service Chatbots with Context-Aware NLU through Selective Attention and Multi-task Learning

Subhadip Nandi, Neeraj Agrawal, Anshika Singh, Priyanka Bhatt

CODS-COMAD 2024, Jodhpur, India.

Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction Tuning a Word-Embedding based Retrieval Augmented Large Language Model

Subhadip Nandi, Neeraj Agrawal

EMNLP 2024, Florida.

Improving Self-supervised Pre-training using Accent-Specific Codebooks

D. Prabhu, A. Gupta, O. Nitsure, P. Jyothi, S. Ganapathy

Interspeech 2024, Kos Island, Greece.

The Second DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environment

S. B. Kalluri, P. Singh, P.R. Chowdhuri, A. Kulkarni, S. Baghel, P. Hegde, S. Sontakke, Deepak K T, S.R.M. Prasanna, D. Vijayasenan, S. Ganapathy

Interspeech 2024, Kos Island, Greece.

LLM Augmented LLMs: Expanding Capabilities through Composition

Bansal, R., B. Samanta, S. Dalmia, N. Gupta, S. Ganapathy, A. Bapna, P. Jain, and P. Talukdar

The Twelfth International Conference on Learning Representations (ICLR) 2024, Vienna, Austria. [A* Conference]

Zero Shot Audio to Audio Emotion Transfer with Speaker Disentanglement

Soumya Dutta and Sriram Ganapathy

ICASSP 2024, Seoul, South Korea.

Multimodal modeling for spoken language identification

S. Bharadwaj, M. Ma, S. Vashishth, A. Bapna, S. Ganapathy, V. Axelrod, S. Dalmia, W. Han, Y. Zhang, D. V. Esch, S. Ritchie, P. Talukdar, J. Riesa

ICASSP 2024, Seoul, South Korea.

Self-Influence Guided Data Reweighting for Language Model Pre-training

M. Thakkar, T. Bolukbasi, S. Ganapathy, S. Vashishth, S. Chandar, P. Talukdar

EMNLP 2023, Singapore. [A* Conference]

Accented Speech Recognition With Accent-specific Codebooks

D. Prabhu, P. Jyothi, S. Ganapathy, V. Unni

EMNLP 2023, Singapore. [A* Conference]

MASR: Multi-Label Aware Speech Representation

A. Raj, S. Bharadwaj, S. Ganapathy, M. Ma, S. Vashishth

IEEE ASRU 2023, Taiwan.

Pseudo-Label Based Supervised Contrastive Loss for Robust Speech Representations

Varun Krishna and Sriram Ganapathy

IEEE ASRU 2023, Taiwan.

Hierarchical Text Classification Using Contrastive Learning Informed Path Guided Hierarchy

Neeraj Agrawal, Saurabh Kumar, Priyanka Bhatt, Tanishka Agarwal

ECAI 2023, Poland.

Building a Few-Shot Cross-Domain Multilingual NLU Model for Customer Care

Saurabh Kumar, Sourav Bansal, Neeraj Agrawal, Priyanka Bhatt

ECAI 2023, Poland.

Label Aware Speech Representation Learning For Language Identification

S. Vashishth, S. Bharadwaj, S. Ganapathy, A. Bapna, M. Ma, W. Han, V. Axelrod, P. Talukdar

Interspeech 2023, Dublin, Ireland.

Enhancing the EEG Speech Match-Mismatch Tasks With Word Boundaries

Akshara Soman, Vidhi Sinha, and Sriram Ganapathy

Interspeech 2023, Dublin, Ireland.

DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments

S. Baghel, S. Ramoji, Sidharth, Ranjana H, P. Singh, S. Jain, P. R. Chowdhuri, K. Kulkarni, S. Padhi, D. Vijayasenan and S. Ganapathy

Interspeech 2023, Dublin, Ireland.

Supervised Hierarchical Clustering Using Graph Neural Networks for Speaker Diarization

Prachi Singh, Amrit Kaul and Sriram Ganapathy

ICASSP 2023, Rhodes Island, Greece.

Interpretable Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection

Debottam Dutta, Debarpan Bhattacharya, Sriram Ganapathy, Amir H. Poorjam, Deepak Mittal, and Maneesh Singh

Interspeech 2022, Incheon, South Korea.

Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals

Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, and Murali Alagesan

Interspeech 2022, Incheon, South Korea.

Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms

Debarpan Bhattacharya, Debottam Dutta, Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Pravin Mote, Sriram Ganapathy, Chandrakiran C, Sahiti Nori, Suhail K K, Sadhana Gonuguntla, and Murali Alagesan

Interspeech 2022, Incheon, South Korea.

Multimodal Transformer with Learnable Frontend and Self Attention for Emotion Recognition

Soumya Dutta and Sriram Ganapathy

ICASSP 2022, Singapore.

The Second DiCOVA Challenge: Dataset and performance analysis for Diagnosis of COVID-19 using acoustics

Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Debarpan Bhattacharya, Debottam Dutta, Pravin Mote, and Sriram Ganapathy

ICASSP 2022, Singapore.

End-to-end speech recognition with joint dereverberation of sub-band autoregressive envelopes

Rohit Kumar, Anurenjan Purushothaman, Anirudh Sreeram, and Sriram Ganapathy

ICASSP 2022, Singapore.

Self Supervised Representation Learning with Deep Clustering for Acoustic Unit Discovery from Raw Speech

Varun Krishna PS and Sriram Ganapathy

ICASSP 2022, Singapore.

Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization

P. Singh and S. Ganapathy

ASRU 2021, Cartagena.

Investigating the Feature Selection and Explainability of COVID-19 Diagnostics from Cough Sounds

A. Flavio, A. Poorjam, D. Mittal, C. Dognin, A. Muguli, R. Kumar, S. R. Chetupalli, S. Ganapathy and M. Singh

Interspeech 2021, Brno, Czech Republic.

LEAP Submission for the Third DIHARD Diarization Challenge

P. Singh, R. Varma, V. Krishnamohan, S. R. Chetupalli and S. Ganapathy

Interspeech 2021, Brno, Czech Republic.

SRIB-LEAP lab submission to Far-field Multi-Channel Speech Enhancement Challenge for Video Conferencing

P. R. Gudepu, R. Kumar, M. K. Jayesh, A. Purushothaman, S. Ganapathy and M. A. Basha

Interspeech 2021, Brno, Czech Republic.

The Third DIHARD Diarization Challenge

N. Ryant, P. Singh, V. Krishnamohan, R. Varma, K. Church, C. Cieri, J. Du, S. Ganapathy and M. Liberman

Interspeech 2021, Brno, Czech Republic.

DiCOVA Challenge: Dataset, task, and baseline system for COVID-19 diagnosis using acoustics

A. Muguli, L. Pinto, R. Nirmala, N. Sharma, P. Krishnan, P. Ghosh, R. Kumar, S. Bhat, S. R. Chetupalli, S. Ganapathy, S. Ramoji and V. Nanda

Interspeech 2021, Brno, Czech Republic.

A Multi-Head Relevance Weighting Framework for Learning Raw Waveform Audio Representations

D. Dutta, P. Agrawal, and S. Ganapathy

WASPAA 2021, New York, USA.

NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling

Shareef Babu Kalluri, Deepu Vijayasenan, S. Ganapathy and P. Krishnan

ICASSP 2021, Toronto.

Deep Multiway Canonical Correlation Analysis For Multi-Subject EEG Normalization

Katthi, J. R., & Ganapathy, S.

ICASSP 2021, Toronto.

End-to-End Lyrics Recognition with Voice to Singing Style Transfer

Basak, S., Agarwal, S., Ganapathy, S., & Takahashi, N.

ICASSP 2021, Toronto.

Representation Learning For Speech Recognition Using Feedback Based Relevance Weighting

P. Agrawal and S. Ganapathy

ICASSP 2021, Toronto.

Coswara - A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis

N. Sharma, P. Krishnan, R. Kumar, S. Ramoji, S. R. Chetupalli, R. Nirmala, P. K. Ghosh and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Neural PLDA Modeling for End-to-End Speaker Verification

S. Ramoji, P. Krishnan and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations

P. Agrawal and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Audiovisual Correspondence Learning in Humans And Machines

V. Krishnamohan, A. Soman, A. Gupta and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition

A. Purushothaman, A. Sreeram, R. Kumar and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Deep Self-Supervised Hierarchical Clustering for Speaker Diarization

P. Singh and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Context Dependent RNNLM for Automatic Transcription of Conversations

S. R. Chetupalli and S. Ganapathy

Interspeech 2020, Beijing, October 2020.

Deep Canonical Correlation Analysis For Decoding The Auditory Brain

J. Reddy and S. Ganapathy

IEEE EMBC, Toronto, Canada, July 2020.

NPLDA: A Deep Neural PLDA Model for Speaker Verification

S. Ramoji, P. Krishnan, and S. Ganapathy

Speaker Odyssey Workshop, November, 2020.

LEAP System for SRE19 Challenge - Improvements and Error Analysis

S. Ramoji, P. Krishnan, B. Mysore, P. Singh and S. Ganapathy

Speaker Odyssey Workshop, November, 2020.

On The Impact of Language Familiarity In Talker Change Detection

N. Sharma, V. Krishnamohan, S. Ganapathy, A. Gangopadhyay and L. Fink

ICASSP 2020.

3-D Feature and Acoustic Modeling for Far-Field Speech Recognition

A. Purushothaman, A. Sreeram and S. Ganapathy

ICASSP 2020.

Unsupervised Neural Mask Estimator For Generalized Eigen-Value Beamforming Based ASR

R. Kumar, A. Sreeram, A. Purushothaman and S. Ganapathy

ICASSP 2020.

Improving Voice Separation by Incorporating End-to-End Speech Recognition

N. Takahashi, M. Singh, S. Basak, P. Sudarsanam, S. Ganapathy, Y. Mitsufuji

ICASSP 2020.

Second Language Transfer Learning in Humans and Machines Using Image Supervision

K. Praveen, A. Gupta, A. Soman and S. Ganapathy

IEEE ASRU, Dec. 2019.

Speaker and Language Aware Training for End-to-End ASR

S. Bansal, K. Malhotra, S. Ganapathy

IEEE ASRU, Dec. 2019.

The Second DIHARD Diarization Challenge: Dataset - task - and baselines

N. Ryant, K. Church, C. Cieri, A. Cristia, J. Du, S. Ganapathy and M. Liberman

INTERSPEECH, Sept. 2019, Austria.

LEAP Diarization System for the Second DIHARD Challenge

P. Singh, Harsha Vardhan M A, S. Ganapathy and A. Kanagasundaram

INTERSPEECH, Sept. 2019, Austria.

Attention based Hybrid I-vector BLSTM Model for Language Recognition

B. Padi, A. Mohan and S. Ganapathy

INTERSPEECH, Sept. 2019, Austria.

Active Learning Methods for Low Resource End-To-End Automatic Speech Recognition

K. Malhotra, S. Bansal and S. Ganapathy

INTERSPEECH, Sept. 2019, Austria.

Unsupervised Raw Waveform Representation Learning for ASR

P. Agrawal and S. Ganapathy

INTERSPEECH, Sept. 2019, Austria.

A Study of X-vector Based Speaker Recognition on Short Utterances

A. Kanagasundaram, S. Sridharan, S. Ganapathy and P. Singh

INTERSPEECH, Sept. 2019, Austria.

The LEAP Speaker Recognition System for NIST SRE 2018 Challenge

S. Ramoji, A. Mohan, B. Mysore, A. Bhatia, P. Singh, Harsha Vardhan M A and S. Ganapathy

ICASSP, 2019.

Analyzing human reaction time for talker change detection

N. Sharma, S. Ganesh, S. Ganapathy and L. Holt

ICASSP, 2019.

Deep variational filter learning models for speech recognition

P. Agrawal and S. Ganapathy

ICASSP, 2019.

A Deep Neural Network Based End-to-End Model for Joint Height And Age Estimation From Short Duration Speech

Shareef Babu Kalluri, Deepu Vijayasenan and S. Ganapathy

ICASSP, 2019.

End-to-end language recognition using attention based hierarchical gated recurrent unit models

B. Padi, A. Mohan and S. Ganapathy

ICASSP, 2019.

Supervised i-vector modeling - Theory and Applications

S. Ramoji and S. Ganapathy

INTERSPEECH, 2018.

Comparison of unsupervised modulation filter learning methods for ASR

P. Agrawal and S. Ganapathy

INTERSPEECH, 2018.

PhaseNet: Discretized phase modeling with deep neural networks for audio source separation

N. Takahashi, P. Agrawal, N. Goswami and Y. Mitsufuji

INTERSPEECH, 2018.

Talker diarization in the wild: The case of child-centered daylong audio-recordings

A. Cristia, S. Ganesh, M. Casillas and S. Ganapathy

INTERSPEECH, 2018.

On Convolutional LSTM Modeling for Joint Wake-Word Detection and Text Dependent Speaker Verification

R. Kumar, V. Yeruva and S. Ganapathy

INTERSPEECH, 2018.

Far-Field Speech Recognition Using Multivariate Autoregressive Models

S. Ganapathy and M. Harish

INTERSPEECH, 2018.

The LEAP Language Recognition System for LRE 2017 Challenge - Improvements and Error Analysis

B. Padi, S. Ramoji, V. Yeruva, S. Kumar and S. Ganapathy

Odyssey: The speaker and language recognition workshop, 2018.

Leveraging LSTM Models for Overlap Detection in Multi-Party Meetings

N. Sajjan, S. Ganesh, N. Sharma, S. Ganapathy and N. Ryant

ICASSP, Calgary, Canada, April 2018.

3-D CNN Models for Far-Field Multi-Channel Speech Recognition

S. Ganapathy and V. Peddinti

ICASSP, Calgary, Canada, April 2018.

Enhancement and Analysis of Conversational Speech: JSALT 2017

N. Ryant et al.

ICASSP, Calgary, Canada, April 2018.

Unsupervised HMM Posteriograms for Language Independent Acoustic Modeling in Zero Resource Conditions

Ansari T, R. Kumar, S. Singh and S. Ganapathy

IEEE ASRU, Dec. 2017.

Deep Learning Methods For Unsupervised Acoustic Modeling - LEAP Submission to ZeroSpeech Challenge 2017

Ansari T, R. Kumar, S. Singh, S. Ganapathy

IEEE ASRU, Dec. 2017.

Leveraging Native Language Speech For Accent Identification Using Deep Siamese Networks

A. Siddhant, P. Jyothi and S. Ganapathy

IEEE ASRU, Dec. 2017.

Speech representation learning using unsupervised data-driven modulation filtering for robust ASR

P. Agrawal and S. Ganapathy

Interspeech, Stockholm, Sweden, Aug. 2017.

IITG-Indigo system for NIST 2016 SRE challenge

N. Kumar, R. K. Das, S. Jelil, Dhanush B K, H. Kashyap, K. S. R. Murthy, S. Ganapathy, R. Sinha and S. R. M. Prasanna

Interspeech, Stockholm, Sweden, Aug. 2017.

Factor Analysis Methods for Joint Speaker Verification and Spoof Detection

Dhanush B, Suparna S., Aarthy R., Likhita C., Shashank D., Harish H. and S. Ganapathy

ICASSP, New Orleans, USA, 2017.

The IBM Speaker Recognition System: Recent Advances and Error Analysis

S. Sadjadi, J. Pelecanos and S. Ganapathy

Interspeech, San Francisco, September, 2016.

An investigation on the use of ivectors for improved ASR robustness

D. Dimitriadis, S. Thomas and S. Ganapathy

Interspeech, San Francisco, Sept. 2016.

The IBM 2016 Speaker Recognition System

S. Sadjadi, S. Ganapathy and J. Pelecanos

Odyssey, Spain, June, 2016.

Speaker Age Estimation On Conversational Telephone Speech Using Senone Posterior Based I-vectors

S. Sadjadi, S. Ganapathy and J. Pelecanos

ICASSP, Shanghai, March, 2016.

Investigating Factor Analysis Features for Deep Neural Networks In Noisy Speech Recognition

S. Ganapathy, S. Thomas, D. Dimitriadis, S. Rennie

Interspeech, Dresden, Germany, Sept. 2015.

Robust Speech Processing Using ARMA Spectrograms

S. Ganapathy

ICASSP, Brisbane, April, 2015.

Nearest Neighbor Discriminant Analysis for Language Recognition

S. Sadjadi, J. Pelecanos and S. Ganapathy

ICASSP, Brisbane, April, 2015.

Robust Language Identification Using Convolutional Neural Networks

S. Ganapathy, K. J. Han, S. Thomas, M. Omar, M. V. Segbroeck and S. Narayanan

Interspeech, Singapore, Sept. 2014.

Shift-Invariant Features for Speech Activity Detection in Adverse Radio-Frequency Channel Conditions

M. Omar and S. Ganapathy

ICASSP, Florence, Italy, May, 2014.

Analyzing Convolutional Neural Networks for Speech Activity Detection in Mismatched Acoustic Conditions

K. J. Han, S. Ganapathy, M. Li, M. Omar and S. Narayanan

ICASSP, Florence, Italy, May, 2014.

The IBM Speech Activity Detection System for the DARPA RATS Program

G. Saon, S. Thomas, H. Soltau, S. Ganapathy and B. Kingsbury

Interspeech, Lyon, Aug. 2013.

TRAP Language Identification System for RATS Phase II Evaluation

K. J. Han, S. Ganapathy, M. Li, M. Omar and S. Narayanan

Interspeech, Lyon, Aug. 2013.

Robust Speaker Recognition Using Spectro-Temporal Autoregressive Models

H. Mallidi, S. Ganapathy and H. Hermansky

Interspeech, Lyon, Aug. 2013.

Unsupervised Channel Adaptation For Language Identification Using Co-training

S. Ganapathy, M. Omar and J. Pelecanos

ICASSP, Vancouver, May, 2013.

Noisy Channel Adaptation in Language Identification

S. Ganapathy, M. Omar and J. Pelecanos

IEEE SLT, Miami, Dec, 2012.

Robust Phoneme Recognition Using High Resolution Temporal Envelopes

S. Ganapathy and H. Hermansky

Interspeech, Portland, Sept. 2012.

Data-driven Posterior Features for Low Resource Speech Recognition Applications

S. Thomas, S. Ganapathy, A. Jansen and H. Hermansky

Interspeech, Portland, Sept. 2012.

Feature Extraction Using 2-D Autoregressive Models For Speaker Recognition

S. Ganapathy, S. Thomas and H. Hermansky

ISCA Speaker Odyssey, June 2012.

Adaptation Transforms of Auto-Associative Neural Networks as Features for Speaker Verification

S. Thomas, H. Mallidi, S. Ganapathy and H. Hermansky

ISCA Speaker Odyssey, June 2012.

The UMD-JHU 2011 Speaker Recognition System

D. Gomero et al.

ICASSP, Japan, Mar. 2012.

Multilingual MLP Features For Low-resource LVCSR Systems

S. Thomas, S. Ganapathy and H. Hermansky

ICASSP, Japan, Mar. 2012.

Multi-layer Perceptron Based Speech Activity Detection for Speaker Verification

S. Ganapathy, P. Rajan and H. Hermansky

IEEE WASPAA, Oct. 2011.

Modulation spectrum analysis for recognition of reverberant speech

H. Mallidi, S. Ganapathy and H. Hermansky

Interspeech, Italy, Aug. 2011.

Feature Normalization for Speaker Verification in Room Reverberation

S. Ganapathy, J. Pelecanos and M. Omar

ICASSP, Prague, May 2011.

Sparse Auto-associative Neural Networks: Theory and Application to Speech Recognition

S. Garimella, S. Ganapathy and H. Hermansky

Interspeech, Japan, Sept. 2010.

Cross-lingual and Multi-stream Posterior Features for Low-resource LVCSR Systems

S. Thomas, S. Ganapathy and H. Hermansky

Proc. of Interspeech, Japan, Sept. 2010.

A Phoneme Recognition Framework based on Auditory Spectro-Temporal Receptive Fields

S. Thomas, K. Patil, S. Ganapathy, N. Mesgarani, H. Hermansky

Proc. of Interspeech, Japan, Sept. 2010.

Robust Spectro-Temporal Features Based on Autoregressive Models of Hilbert Envelopes

S. Ganapathy, S. Thomas and H. Hermansky

ICASSP, Dallas, USA, March 2010.

Comparison of Modulation Features For Phoneme Recognition

S. Ganapathy, S. Thomas and H. Hermansky

ICASSP, Dallas, USA, March 2010.

Temporal Envelope Subtraction for Robust Speech Recognition Using Modulation Spectrum

S. Ganapathy, S. Thomas, and H. Hermansky

IEEE ASRU, 2009.

Applications of Signal Analysis Using Autoregressive Models for Amplitude Modulation

S. Ganapathy, S. Thomas, P. Motlicek and H. Hermansky

IEEE WASPAA 2009.

Static and Dynamic Modulation Spectrum for Speech Recognition

S. Ganapathy, S. Thomas and H. Hermansky

Proc. of Interspeech, Brighton, UK, Sept. 2009.

Tandem Representations of Spectral Envelope and Modulation Frequency Features for ASR

S. Thomas, S. Ganapathy and H. Hermansky

Proc. of Interspeech, Brighton, UK, Sept. 2009.

Phoneme Recognition Using Spectral Envelope and Modulation Frequency Features

S. Thomas, S. Ganapathy and H. Hermansky

ICASSP, Taiwan, April 2009.

Front-end for Far-field Speech Recognition based on Frequency Domain Linear Prediction

S. Ganapathy, S. Thomas and H. Hermansky

Proc. of INTERSPEECH, Brisbane, Australia, Sep 2008.

Hilbert Envelope Based Spectro-Temporal Features for Phoneme Recognition in Telephone Speech

S. Thomas, S. Ganapathy and H. Hermansky

Proc. of INTERSPEECH, Brisbane, Australia, Sep 2008.

Spectral Noise Shaping: Improvements in Speech/Audio Codec Based on Linear Prediction in Spectral Domain

S. Ganapathy, P. Motlicek, H. Hermansky and H. Garudadri

Proc. of INTERSPEECH, Brisbane, Australia, Sep 2008.

Perceptually motivated Sub-band Decomposition for FDLP Audio Coding

P. Motlicek, S. Ganapathy, H. Hermansky, H. Garudadri and Marios Athineos

Lecture Notes In Artificial Intelligence, Springer-Verlag Berlin, Heidelberg, 2008.

Spectro-Temporal Features for Automatic Speech Recognition using Linear Prediction in Spectral Domain

S. Thomas, S. Ganapathy and H. Hermansky

Proc. of EUSIPCO, Lausanne, Switzerland, Aug 2008.

Autoregressive Modelling of Hilbert Envelopes for Wide-band Audio Coding

S. Ganapathy, P. Motlicek, H. Hermansky and H. Garudadri

AES 124th Convention, Audio Engineering Society, May 2008.

Temporal Masking for Bit-rate Reduction in Audio Codec Based on Frequency Domain Linear Prediction

S. Ganapathy, P. Motlicek, H. Hermansky and H. Garudadri

Proc. of ICASSP, April 2008.

Hilbert Envelope Based Features for Far-Field Speech Recognition

S. Thomas, S. Ganapathy and H. Hermansky

Lecture Notes in Computer Science, Springer Berlin, Heidelberg 2008.

Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding

P. Motlicek, H. Hermansky, S. Ganapathy and H. Garudadri

Lecture Notes in Computer Science, Springer Berlin, Heidelberg 2007.

Non-Uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes

P. Motlicek, H. Hermansky, S. Ganapathy and H. Garudadri

Lecture Notes in Computer Science, Springer Berlin, Heidelberg 2007.

Patents

Adaptive System Combination in Language Recognition

Oct. 2018.

Spectral Noise Shaping in Audio Coding Based on Spectral Dynamics in Frequency Sub-bands

Nov. 2011.

Temporal Masking in Audio Coding Based on Spectral Dynamics in Frequency Sub-bands

Aug. 2009.