The Centre for Acoustic Signal Processing Research (CASPR) is a centre at the Section for Artificial Intelligence and Sound (AIS), Department of Electronic Systems, Aalborg University, Denmark.

[28th August, 2024]

The August 2024 newsletter from our Centre for Acoustic Signal Processing Research (CASPR) has now been released.

[August 2024] Papers of CASPR members:

  • Joint Far- and Near-end Speech and Listening Enhancement with Minimum Processing. A. J. Fuglsig, Z.-H. Tan, L. S. Bertelsen, J. Jensen, J. C. Lindof, and J. Østergaard, IEEE Access, 2024.
  • The Effect of Training Dataset Size on Discriminative and Diffusion-Based Speech Enhancement Systems. P. Gonzalez, Z.-H. Tan, J. Østergaard, J. Jensen, T. S. Alstrøm, and T. May, IEEE Signal Processing Letters, 2024.
  • Generating Accurate and Diverse Audio Captions through Variational Autoencoder Framework. Y. Zhang, R. Du, Z.-H. Tan, W. Wang, and Z. Ma, IEEE Signal Processing Letters, 2024.
  • How to train your ears: Auditory-model emulation for large-dynamic-range inputs and mild-to-severe hearing losses. P. A. L. Bysted, J. Jensen, Z.-H. Tan, J. Østergaard, and L. Bramsløw, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 32, pp. 2006-2020, 2024.
  • Near-end Listening Enhancement Using a Noise-Robust Linear Time-Invariant Filter. F. Villani, W.-Y. Chan, Z.-H. Tan, J. Østergaard, and J. Jensen, The 18th International Workshop on Acoustic Signal Enhancement (IWAENC 2024), Aalborg, Denmark, September 9-12, 2024.
  • Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations. S. Yadav and Z.-H. Tan, Interspeech 2024, Kos Island, Greece, September 1-5, 2024.
  • Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder. Y. Xie, M. Kuhlmann, F. Rautenberg, Z.-H. Tan, and R. Haeb-Umbach, The 32nd European Signal Processing Conference (EUSIPCO 2024), Lyon, France, August 26–30, 2024.
  • Envelope Based Deep Source Separation and EEG Auditory Attention Decoding for Speech and Music. A. Tanveer, J. Jensen, Z.-H. Tan, and J. Østergaard, The 32nd European Signal Processing Conference (EUSIPCO 2024), Lyon, France, August 26–30, 2024.

[July 2024] Papers of CASPR members:

  • PAC-Bayesian Error Bound, via Rényi Divergence, for a Class of Linear Time-Invariant State-Space Models. D. Eringis, J. Leth, Z.-H. Tan, R. Wisniewski, and M. Petreczky, The 41st International Conference on Machine Learning (ICML 2024), Vienna, Austria, July 21-27, 2024.

[June 2024]

CASPR at Folkemødet 2024

On Friday, June 14th, PhD student Holger Severin Bovbjerg represented CASPR at Folkemødet, which takes place in Allinge on the Danish island of Bornholm.

Here, 50 researchers from Danish universities, including Danish Nobel Prize winner Morten Meldal, were invited to speak at Folkemødets Forskningsscene (Folkemødet’s Research Scene), with the purpose of educating the public on current research.

Holger was invited to speak about personalized AI speech models, and in his talk he explained in layperson’s terms how they work.

He also presented examples of some of their interesting applications, such as specialized hearing aids and early detection of various illnesses.

Fulbright scholarship granted for PhD student Holger Severin Bovbjerg

The CASPR PhD student, Holger Severin Bovbjerg, has been granted the prestigious Fulbright scholarship to support his coming research stay in the U.S. at Carnegie Mellon University in Pittsburgh.

Fulbright is an international academic exchange program sponsored by the U.S. government to promote scientific and cultural exchange between the U.S. and other countries. Since 1951, under the Fulbright agreement between Denmark and the U.S.A., over 3,500 Danes and Americans have participated in the Fulbright program.

On June 20th, Holger visited “Rydhave”, the residence of the United States’ Ambassador to Denmark, where he was invited to attend a reception for Danish Fulbright grantees and met other Fulbrighters from many different academic fields.

During his stay in the U.S., Holger will serve as a Fulbright ambassador for Denmark.

[June 2024] Papers of CASPR members:

  • Complex Recurrent Variational Autoencoder for Speech Resynthesis and Enhancement. Y. Xie, T. Arildsen, and Z.-H. Tan, IEEE World Congress on Computational Intelligence (IEEE WCCI 2024), Yokohama, Japan, June 30-July 5, 2024.

[May 2024] Papers of CASPR members:

  • Masked Autoencoders with Multi-Window Local-Global Attention Are Better Audio Learners. S. Yadav, S. Theodoridis, L. K. Hansen, and Z.-H. Tan, The Twelfth International Conference on Learning Representations (ICLR 2024), Vienna, Austria, May 7-11, 2024.

[April 2024]

CASPR at ICASSP 2024 in Seoul, Korea

CASPR members and research visitors published the following papers in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024) in Seoul, South Korea:

CASPR members published the following papers in IEEE Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2024) in Seoul, South Korea:

  • Joint Minimum Processing Beamforming and Near-End Listening Enhancement. A. J. Fuglsig, J. Jensen, Z.-H. Tan, L. S. Bertelsen, J. C. Lindof, J. Østergaard, Proc. HSCMA 2024.
  • Deep Low-Latency Joint Speech Transmission and Enhancement over a Gaussian Channel, Mohammad Bokaei, Jesper Jensen, Simon Doclo, Jan Østergaard, Proc. HSCMA 2024.

Vasudha Sathyapriyan, Holger S. Bovbjerg, and Kateřina Žmolíková before their poster sessions at ICASSP 2024.

[20 March, 2024] 

CASPR held a week-long Winter School on Signal Processing for Hearing Assistive Devices. The participants were from industry and university, and it was great to see how they worked together to solve problems related to acoustic beamforming, among others.

We thank the participants for their impressive engagement in the lectures, question rounds, and exercises. We also thank all the lecturers for their great contributions, which made the Winter School really exciting and relevant. In particular, thanks to:

  • Prof. Steven van de Par, Oldenburg University
  • Prof. Jesper Jensen, Aalborg University and Oticon.
  • Prof. Zheng-Hua Tan, Aalborg University
  • Dr. Meng Guo, Oticon.
  • Dr. Robert Rehr, Oticon.
  • Dr. Michael Syskind Pedersen, Oticon.
  • Dr. Dorothea Wendt, Eriksholm Research Center.
  • Postdoc Payam Shahsavari, Aalborg University
  • PhD student Vasudha Sathyapriyan, Aalborg University and Oticon
  • PhD student Andreas Fuglsig, Aalborg University
  • PhD student Peter Leer Bysted, Aalborg University
  • PhD student Philippe Gonzalez, Technical University of Denmark
  • PhD student Asjid Tanveer, Aalborg University
  • PhD student Sangeeth G. Jayaprakash, Aalborg University
  • PhD student Holger S. Bovbjerg, Aalborg University
  • PhD student Mohammad Bokaei, Aalborg University

[15th January, 2024]

The January 2024 newsletter from our Centre for Acoustic Signal Processing Research (CASPR) has now been released.

If you are interested in Acoustic Signal Processing then check out our Winter School.

CASPR is a research centre at the Section for Artificial Intelligence & Sound, Department of Electronic Systems, Aalborg University, Denmark. CASPR is primarily supported by the Demant Foundation, Oticon A/S, and Aalborg University.

[14th November, 2023] 

The Centre for Acoustic Signal Processing Research (CASPR) at Aalborg University is happy to announce the 2024 CASPR Course on Signal Processing for Hearing Assistive Devices

Signal processing for hearing assistive devices – Aalborg University

The course will be in-person (physical face-to-face) and take place at the AAU Campus in Copenhagen, Denmark, during three consecutive days. Online participation will not be possible. The three days will cover teaching, presentations, hands-on practical training, and networking. A diploma will be issued on successful completion of the 3-day program.

As inspiration and for your further development, CASPR offers a “Research talks on emerging technologies” event on the two days following the course, where participation is optional. University researchers will give the talks, and you will have the opportunity to network with other participants and experts in sound, signal processing, and machine learning.

Brief course outline:

Hearing assistive devices (HADs) are ubiquitous. They include, for example, devices such as headsets for speech communication in noisy environments (airplane crews, emergency/rescue teams, combat soldiers, police forces, etc.), headsets for office use, gaming, etc., and hearing care systems, e.g., hearing aids and cochlear implants.

The course consists of lectures and hands-on exercises, allowing the participants to understand in-depth the technical problems related to HADs and their potential solutions. The multi-disciplinary course focuses on applying theoretical results to real-world problems and practical do’s and don’ts.

The first part of the course is a short introductory part, which lays the foundation for the rest of the course, covering fundamental topics such as auditory perception (normal and impaired hearing) and a discussion of the basic principles of HADs.

The second part provides an overview of fundamental signal processing problems encountered in HADs, and an in-depth treatment of state-of-the-art solutions. These include beamforming and noise reduction methods, direction-of-arrival estimation, voice activity detection, feedback control, hearing loss compensation, etc. Furthermore, an overview is given of important methodologies for evaluating HADs related to speech intelligibility and listening effort.

The last two – optional – days are devoted to lectures on emerging technologies for hearing assistive devices. These include deep learning based methods for multi-modal HAD processing, including EEG and sound, methods for listening effort and attention decoding, generative speech enhancement methods, hearing loss compensation methods, self-supervised deep learning methods, HADs involving microphones outside the ear of the user, and methods for low-latency enhancement and communication.

While the course focuses on applications, many of the discussed techniques are general and find use in the much broader field of general sound processing.


  • Prof. Jesper Jensen, Aalborg University and Oticon
  • Prof. Jan Østergaard, Aalborg University.
  • Prof. Zheng-Hua Tan, Aalborg University.

Dates for the mandatory course part (2024 CASPR Course on Signal Processing for Hearing Assistive Devices):
Monday 26th – Wednesday 28th, February 2024.

Dates for the additional optional part (Emerging Topics in Signal Processing for Hearing Assistive Devices):
Thursday 29th February – Friday 1st March 2024.

Place: Copenhagen, AAU Campus. A.C. Meyers Vænge 15, 2450 Copenhagen.

Registration fee (industrial participants): 13.748 DKK / 1845 Euros (including lunch).

Registration is now open:

Max number of participants: 30

Course Teaser (PDF): Signal Processing for Hearing Assistive Devices


  • Prof. Steven van de Par, Oldenburg University
  • Prof. Jesper Jensen, Aalborg University and Oticon
  • Prof. Zheng-Hua Tan, Aalborg University
  • Dr. Meng Guo, Oticon.
  • Dr. Robert Rehr, Oticon.
  • Dr. Michael Syskind Pedersen, Oticon.
  • Dr. Dorothea Wendt, Eriksholm Research Center.
  • Postdoc Payam Shahsavari, Aalborg University
  • PhD student Vasudha Sathyapriyan, Aalborg University and Oticon
  • PhD student Andreas Fuglsig, Aalborg University
  • PhD student Peter Leer Bysted, Aalborg University
  • PhD student Philippe Gonzalez, Technical University of Denmark
  • PhD student Asjid Tanveer, Aalborg University
  • PhD student Sangeeth G. Jayaprakash, Aalborg University
  • PhD student Holger S. Bovbjerg, Aalborg University
  • PhD student Mohammad Bokaei, Aalborg University

Prerequisite for participation:
The course requires a background in signal processing and statistics. It is expected that you have qualifications corresponding to an M.Sc. in, e.g., signal processing.

We expect the participants in the course to consist of R&D engineers from industry together with PhD students from universities.


Before the course starts, you will receive a link to the literature. It is not mandatory to prepare before the course. The course will primarily be in English.


For further information about the course, contact:

  • Jan Østergaard
  • Jesper Jensen

[September 2023] Daniel Michelsanti received the “Best Young Italian Researcher in Denmark” (BIRD) award, in the category of Physical and Engineering Sciences, for his work on audio-visual speech enhancement based on deep learning, conducted at Aalborg University and Oticon. The ceremony was officiated by the Ambassador of Italy to Denmark, Stefania Rosini.

[August 2023] Zheng-Hua Tan, Achintya kr. Sarkar, and Najim Dehak received the International Speech Communication Association (ISCA) Best Research Paper Award for their paper “rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method,” Computer Speech & Language, vol. 59, January 2020. The award ceremony took place at the INTERSPEECH conference in Dublin, Ireland, August 2023. See details.

[June 2023] Morten Kolbæk, Dong Yu, Zheng-Hua Tan, and Jesper Jensen received the 2022 IEEE Signal Processing Society Best Paper Award for their paper “Multitalker Speech Separation with Utterance-Level Permutation Invariant Training of Deep Recurrent Neural Networks,” IEEE/ACM Trans. Audio, Speech, and Language Processing, Oct. 2017. From the award ceremony at the IEEE ICASSP conference in Rhodes Island, June 2023: Zheng-Hua Tan (left) and Jesper Jensen (right) together with the chair for the IEEE Signal Processing Society awards committee, Sergios Theodoridis. See details.

[May 2023] PhD defense of Mathias Bach Pedersen. In the photo, Dr. Mathias Bach Pedersen, assessment committee and supervisors. From left: Prof. Fei Chen, Prof. Steven van de Par, Supervisor Dr. Asger Heidemann Andersen, Supervisor Prof. Jesper Jensen, Dr. Mathias Bach Pedersen, Dr. Christian Sejer Pedersen, Supervisor Prof. Zheng-Hua Tan, Prof. Dorte Hammershøi.

[March 2023] PhD defense of Poul Hoang. In the photo, Dr. Poul Hoang, assessment committee and supervisors. From left: Supervisor Prof. Jesper Jensen, Prof. Søren Bech, Prof. Sharon Gannot, Dr. Poul Hoang, Prof. Reinhold Haeb-Umbach, Supervisor Prof. Zheng-Hua Tan, Supervisor Dr. Jan Mark de Haan.

[March 2023] Payam Shahsavari Baboukani successfully defended his PhD thesis entitled “From Global to Local Functional Connectivity: Application to Listening Effort”. From left: Prof. Maria Chait, Supervisor Dr. Emina Alickovic, Supervisor Assist. Prof. Carina Graversen, Dr. Payam Shahsavari Baboukani, Supervisor Prof. Jan Østergaard, Assoc. Prof. Carles Navarro Manchon, and Prof. Preben Kidmose.

[March 2023] Papers of CASPR members:

  • On the Deficiency of Intelligibility Metrics as Proxies for Subjective Intelligibility. I. López-Espejo, A. Edraki, W.-Y. Chan, Z.-H. Tan, and J. Jensen. Elsevier Speech Communication (accepted), 2023.

[February 2023] Papers of CASPR members:

  • Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting. I. López-Espejo, R. C. M. C. Shekar, Z.-H. Tan, J. Jensen, and J. H. L. Hansen. ICASSP (accepted), 2023.
  • Fearless Steps APOLLO: Challenges in keyword spotting and topic detection for naturalistic audio streams. A. Joglekar, I. López-Espejo, and J. H. L. Hansen. The 184th Meeting of the Acoustical Society of America (accepted), 2023.

[January 2023] We have one or more fully-funded 3-year PhD positions available in CASPR. Apply online using the following link:

[November, 2022] New researcher in CASPR:

  • On November 1st 2022, Holger Severin Bovbjerg started a PhD project entitled “Self-supervised learning for Speech Source Detection”, supervised by CASPR members

[September 2022] Papers of CASPR members:

  • An Experimental Study on Light Speech Features for Small-Footprint Keyword Spotting. I. López-Espejo, Z.-H. Tan, and J. Jensen. IberSPEECH (accepted), 2022.
  • Fusion of Classical Digital Signal Processing and Deep Learning Methods (FTCAPPS). A. Gomez, V. E. Sánchez, A. Peinado, J. M. Martín-Doñas, A. Gómez-Alanís, A. Villegas-Morcillo, E. Rosello, M. Chica, C. Garcia, and I. López-Espejo. IberSPEECH (accepted), 2022.
  • Utilization of acoustic signals with generative Gaussian and autoencoder modeling for condition-based maintenance of injection moulds. G. Ø. Rønsch, I. López-Espejo, D. Michelsanti, Y. Xie, P. Popovski, and Z.-H. Tan. International Journal of Computer Integrated Manufacturing (accepted), 2022.

[August 2022] We have a fully-funded 3-year industrial PhD position offered by Oticon A/S and the Department of Electronic Systems, Aalborg University. You are welcome to apply: Industrial PhD Fellowship – audio signal processing & machine learning for hearing assistive devices

[August 2022] We are happy to announce that the August 2022 Newsletter of CASPR has now been released. You can obtain a copy from our website: August 2022 CASPR Newsletter

[June 2022] Papers of CASPR members:

  • iVAE-GAN: Identifiable VAE-GAN Models for Latent Representation Learning. IEEE Access, vol. 10, pp. 48405-48418, 2022.
  • Adversarial Multi-Task Deep Learning for Noise-Robust Voice Activity Detection with Low Algorithmic Delay. C. M. Larsen, P. Koch and Z.-H. Tan. Interspeech 2022, September 18-22, Incheon, Korea.

[May 2022] Prof. Zheng-Hua Tan gave an invited talk entitled “Self-Supervised Learning: Training Targets and Loss Functions” and co-moderated a panel on “Data Science Education: The Signal Processing Perspective” at ICASSP 2022, Singapore, May 22-27, 2022.

[May 2022] PhD researchers Mohammad Bokaei, Andreas J. Fuglsig, Payam Baboukani, and Professor Zheng-Hua Tan enjoyed physically attending the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), which was held in Singapore, May 22 – 27, 2022.

[May 2022] Papers of CASPR members:

  • Training Data-Driven Speech Intelligibility Predictors on Heterogeneous Listening Test Data. M. B. Pedersen, A. H. Andersen, S. H. Jensen, Z.-H. Tan and J. Jensen. IEEE Access, vol. 10, pp. 66175-66189, 2022.
  • A neural network framework for modelling parameterized auditory models, P. A. L. Bysted, J. Jensen, Z.-H. Tan, J. Østergaard, L. Bramsløw, Proc. Baltic Nordic Acoustic Meeting (BNAM) 2022 – Joint Acoustics Conference, May 2022.

[April 2022] One or more fully funded PhD stipends are available in the Centre for Acoustic Signal Processing Research (CASPR).

The potential topics of the PhD stipends are:

  1. Self-supervised learning for signal processing in hearing assistive devices, such as speech enhancement, acoustic scene analysis, etc.
  2. Energy-reduced DNNs using Bayesian learning and approximate computing, such as finite-precision arithmetic, deterministic and stochastic computing, etc.
  3. Multimodal signal processing for hearing assistive devices, using inputs such as the user’s EEG and eye gaze as well as audio-visual inputs, etc.
  4. Advanced speech enhancement, including voice conversion, person-conditioned DNNs, binaural enhancement, etc.

When applying for the position, you need to indicate your first and second priority of research topics from the above list.

We are looking for highly motivated, independent, and outstanding students who wish to complete a 3-year PhD programme at Aalborg University. The ideal candidate will have an M.Sc. degree in statistical signal processing, audio processing, auditory perception, machine learning, or information theory. Good verbal and written English skills are a must, and excellent undergraduate and graduate grades are desired.

For more information and to apply for the positions use the following link:

[April 2022] Prof. Zheng-Hua Tan is appointed as a member of the IEEE Signal Processing Society Conferences Board for the term 2022-2024. He is also a member of the IEEE Signal Processing Society Technical Directions Board.

[March 2022] Papers of CASPR members:

  • Shouted and Whispered Speech Compensation for Speaker Verification Systems. S. Prieto, A. Ortega, I. López-Espejo, and E. Lleida. Elsevier Digital Signal Processing (accepted), 2022.
  • Joint Far-and Near-End Speech Intelligibility Enhancement based on the Approximated Speech Intelligibility Index. A.J. Fuglsig, J. Østergaard, J. Jensen, L.S. Bertelsen, P. Mariager, and Z.-H. Tan. ICASSP 2022.
  • A Stimuli-Relevant Directed Dependency Index for Time Series. Payam Shahsavari Baboukani, Sergios Theodoridis, Jan Østergaard. ICASSP 2022.

[February 2022] Papers of CASPR members:

  • The Minimum Overlap-Gap Algorithm for Speech Enhancement. P. Hoang, Z.-H. Tan, J.-M. de Haan, J. Jensen, IEEE Access, February 2022.

[January 2022] Papers of CASPR members:

  • Multichannel Speech Enhancement with Own Voice-Based Interfering Speech Suppression for Hearing Assistive Devices. P. Hoang, J. M. de Haan, Z.-H. Tan, and J. Jensen. IEEE Trans. Audio, Speech, Language Processing, January 2022.

[December 2021] Papers of CASPR members:

  • Deep Spoken Keyword Spotting: An Overview. I. López-Espejo, Z.-H. Tan, J. Hansen, and J. Jensen. IEEE Access (accepted), 2021.

[October 2021] To celebrate the continuation of CASPR, we invite you to an Open Lab Tour, poster demonstrations and networking, on Monday November 29th from 13.00 – 16.00. For more details, please refer to CASPR-II_Opening Program.

The event is free of charge – please register with Inge Marie Pedersen no later than November 22, 2021.

[August 2021] Papers of CASPR members:

  • Compression of DNNs Using Magnitude Pruning and Nonlinear Information Bottleneck Training. M. Ø. Nielsen, J. Østergaard, J. Jensen, Z.-H. Tan, Proc. of IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP2021). Accepted.
  • Disentangled Speech Representation Learning Based on Factorized Hierarchical Variational Autoencoder with Self-Supervised Objective. Y. Xie, T. Arildsen, Z.-H. Tan, Proc. of IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP2021). Accepted.

[August 2021] We are happy to announce that the 9th Newsletter of CASPR has now been released.  You can obtain a copy from our website:

[June 2021] CASPR members will attend the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021) presenting the following articles:

  • Joint Maximum Likelihood Estimation of Power Spectral Densities and Relative Acoustic Transfer Functions for Acoustic Beamforming. P. Hoang, Z.-H. Tan, J. M. de Haan, J. Jensen.
  • Audio-Visual Speech Inpainting with Deep Learning. G. Morrone, D. Michelsanti, Z.-H. Tan, J. Jensen.

Daniel Michelsanti, Zheng-Hua Tan, Jesper Jensen, and Dong Yu will also give a tutorial presentation on Audio-Visual Speech Enhancement and Separation Based on Deep Learning.

[June 2021] Daniel Michelsanti and Zheng-Hua Tan have been invited as speakers for a half-day training session on Audio-Visual Speech Enhancement and Separation Based on Deep Learning at the Open Data Science Conference (ODSC) Europe 2021.

[June 2021] Vasudha Sathyapriyan started a PhD project entitled “Vision-Assisted Hearing Aid Systems” supervised by CASPR members. The project is part of the EU-ITN project Service-Oriented Ubiquitous Network-Driven Sound.

[June 2021] Papers of CASPR members:

  • A Novel Loss Function and Training Strategy for Noise-Robust Keyword Spotting. I. López-Espejo, Z.-H. Tan, and J. Jensen. IEEE Transactions on Audio, Speech, and Language Processing (accepted), 2021.
  • Speech Intelligibility Prediction Using Spectro-Temporal Modulation Analysis. A. Edraki, W.-Y. Chan, J. Jensen, and D. Fogerty. IEEE Trans. Audio, Speech, Lang. Process. Vol. 29, pp. 210 – 225, 2021.
  • A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction. A. Edraki, W.-Y. Chan, J. Jensen, and D. Fogerty. Proc. Interspeech, 2021 (accepted).

[May 2021] Prof. Zheng-Hua Tan will participate in The Pioneer Center for Artificial Intelligence, a new large research centre in Denmark focusing on artificial intelligence. The Center is to launch by the end of 2021 with funding from five foundations and participation of five universities. For details refer to the news article (in Danish).

[May 2021] On May 17-21, 2021, CASPR organized the online CASPR Summer School on Signal Processing for Hearing Assistive Devices.

  • Around 38 participants were enrolled in the Summer School, with participant affiliations equally divided between industry (hearing aids, headsets, communication, etc.) and university (Denmark, Czech Republic, Italy, The Netherlands, UK).
  • The Summer School consisted of lectures by invited experts, lectures by CASPR staff, and theoretical/practical exercises. Lectures covered topics such as basic auditory perception, fundamental signal processing problems in hearing assistive devices and state-of-the-art solutions (hearing loss compensation, beamforming, feedback cancellation, etc.) and emerging topics (audio-visual methods, EEG-based methods, personalization, listening effort, etc.).

[May 2021] Papers of CASPR members:

  • Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization. J. Xie, Z. Ma, G. Zhang, J.-H. Xue, Z.-H. Tan and J. Guo. Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence.
  • Self-Segmentation of Pass-Phrase Utterances for Deep Feature Learning in Text-Dependent Speaker Verification. A. k. Sarkar and Z.-H. Tan. Accepted by Computer Speech & Language.
  • Vocal Tract Length Perturbation for Text-Dependent Speaker Verification with Autoregressive Prediction Coding. A. k. Sarkar, Z.-H. Tan. Accepted by IEEE Signal Processing Letters.
  • Online Multichannel Speech Enhancement Based on Recursive EM and DNN-based Speech Presence Estimation. J. M. Martín-Doñas, J. Jensen, Z.-H. Tan, A. M. Gomez, and A. M. Peinado. Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing.

[April 2021] Daniel Michelsanti started an industrial postdoc project entitled “Vision-Assisted Hearing Aid Systems” supervised by CASPR members.

[March 2021] Daniel Michelsanti, Prof. Zheng-Hua Tan and Prof. Jesper Jensen, together with Dr. Dong Yu, are going to give a tutorial talk at IEEE ICASSP 2021 on audio-visual speech enhancement and separation based on deep learning.

[March 2021] Papers of CASPR members:

  • Audio-Visual Speech Inpainting with Deep Learning. G. Morrone, D. Michelsanti, Z.-H. Tan and J. Jensen. Accepted by IEEE ICASSP, 2021.
  • Joint Maximum Likelihood Estimation of Power Spectral Densities and Relative Acoustic Transfer Functions for Acoustic Beamforming. P. Hoang, Z.-H. Tan, J. M. de Haan, J. Jensen. Accepted by IEEE ICASSP, 2021.
  • An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation. D. Michelsanti, Z.-H. Tan, S.-X. Zhang, Y. Xu, M. Yu, D. Yu, and J. Jensen. Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2021.

[March 2021] William Demant Foundation, Oticon A/S, and Aalborg University will support a 5-year continuation of CASPR.

[January 2021] Papers of CASPR members:

  • Dual-channel eKF-RTF Framework for Speech Enhancement with DNN-based Speech Presence Estimation. J. M. Martín-Doñas, A. M. Peinado, I. López-Espejo, A. M. Gomez, Proc. of IberSPEECH 2020.

[December 2020] Papers of CASPR members:

  • UIAI System for Short-Duration Speaker Verification Challenge 2020. M. Sahidullah, A.K. Sarkar, V. Vestman, X. Liu, R. Serizel, T. Kinnunen, Z.-H. Tan, E. Vincent, Proc. of the 8th IEEE Spoken Language Technology Workshop (SLT 2021).
  • CC-loss: Channel Correlation Loss for Image Classification. Z. Song, D. Chang, Z. Ma, X. Li and Z.-H. Tan, Proc. of the 25th International Conference on Pattern Recognition (ICPR 2020).

[December 2020] [08.12.2020] Jesper Jensen participated in the Denmark Radio tech program called “Kortsluttet” (Hard-wired) on digitally enhanced senses, where he talked about hearing aid technology now and in the future. The program can be found here: (The CASPR part starts at 16:35.)

[October 2020] Papers by CASPR members were presented during the 21st Annual Conference of the International Speech Communication Association (INTERSPEECH 2020):

  • Vocoder-Based Speech Synthesis from Silent Videos. D. Michelsanti, O. Slizovskaia, G. Haro, E. Gómez, Z.-H. Tan, J. Jensen. 10.21437/Interspeech.2020-1026
  • End-to-end Speech Intelligibility Prediction Using Time-Domain Fully Convolutional Neural Networks. M. B. Pedersen, M. Kolbæk, A. H. Andersen, S. H. Jensen, J. Jensen. 10.21437/Interspeech.2020-1740

[October 2020] A new demo about audio-visual speech inpainting is available on the demo section.

[October 2020] CASPR Winter School postponed to 2021

Dear participants at the CASPR Winter School

We have decided to postpone the CASPR Winter School.
We hope to be able to give the Winter School in April/May 2021 and that you will still wish to participate at that time. Please regularly check the CASPR Website or the PhD moodle pages for information about the upcoming Winter Schools. 
The unfortunate decision to postpone the CASPR WS has been very difficult, but we have found that the restrictions and uncertainty caused by Covid-19 have made this the least bad choice. 
Thanks again for your understanding, and our apologies for any inconvenience this may have caused you. You will, of course, be fully reimbursed for the registration fee, and we will soon get back to you with further information about this.

[July 2020] Papers of CASPR members:

  • Deep InterBoost Networks for Small-sample Image Classification. X. Li, D. Chang, Z. Ma, Z.-H. Tan, J.-H. Xue, J. Cao and J. Guo. Accepted by Neurocomputing, 2020.
  • Exploring Filterbank Learning for Keyword Spotting. I. L. Espejo, Z.-H. Tan, J. Jensen, Proc. Eusipco 2020, Accepted.
  • Vocoder-Based Speech Synthesis from Silent Videos. D. Michelsanti, O. Slizovskaia, G. Haro, E. Gómez, Z.-H. Tan, J. Jensen. Proc. of Interspeech 2020, Accepted.
  • Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions. S. Prieto, A. Ortega, I. López-Espejo, E. Lleida. Proc. of Interspeech 2020, Accepted.
  • End-to-end Speech Intelligibility Prediction Using Time-Domain Fully Convolutional Neural Networks. M. B. Pedersen, M. Kolbæk, A. H. Andersen, S. H. Jensen, J. Jensen. Proc. of Interspeech 2020, Accepted.

[April 2020] Papers of CASPR members:

  • On the Comparisons of Decorrelation Approaches for non-Gaussian Neutral Vector Variables. Z. Ma, X. Lu, J. Xie, Z. Yang, J.-H. Xue, Z.-H. Tan, B. Xiao, J. Guo. Accepted by IEEE Transactions on Neural Networks and Learning Systems, 2020.
  • Improved External Speaker-Robust Keyword Spotting for Hearing Assistive Devices. I. López-Espejo, Z.-H. Tan and J. Jensen. Accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2020.
  • OSLNet: Deep Small-Sample Classification with an Orthogonal Softmax Layer. X. Li, D. Chang, Z. Ma, Z.-H. Tan, J.-H. Xue, J. Cao, J. Yu, J. Guo. Accepted by IEEE Transactions on Image Processing, 2020.

[April 2020] A new demo about vocoder-based speech synthesis from silent videos is available on the demo section.

[March 2020] Ph.D. student Giovanni Morrone from the University of Modena and Reggio Emilia, Italy, is visiting CASPR for five months in the period March to July 2020. Giovanni is working on new approaches in speech inpainting that will exploit both audio and visual information (e.g. lip-reading) to recover missing parts of corrupted speech.

Giovanni Morrone received the B.Sc. and the M.Sc. degrees in Computer Engineering from the University of Modena and Reggio Emilia, Italy, in 2015 and 2017, respectively. In 2017, he joined the DBGroup research lab in the Department of Engineering, University of Modena and Reggio Emilia. He did two internships: at Expert System, working on chatbots powered by semantic technologies, and at PerVoice, working on speech recognition applications in extremely noisy environments. His research interests include speech enhancement, speech separation and speech recognition in very challenging environments (e.g. cocktail party scenarios, very low SNR) using deep learning.

You can find more information about his research work on his webpage.

[January 2020] We are happy to announce that the 6th Newsletter of CASPR has now been released. You can obtain a copy from our website:

[January 2020] Papers of CASPR members:

  • On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement. M. Kolbæk, Z.-H. Tan, S. H. Jensen and J. Jensen. IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 28, pp. 825-838, January 2020.
  • The Importance of Context When Recommending TV Content: Dataset and Algorithms. M. S. Kristoffersen, S. E. Shepstone, and Z.-H. Tan. Accepted by IEEE Transactions on Multimedia.
  • SketchSegNet+: An End-to-end Learning of RNN for Multi-Class Sketch Semantic Segmentation. Y. Qi and Z.-H. Tan. Accepted by IEEE Access.
  • Rate-Constrained Noise Reduction in Wireless Acoustic Sensor Networks. J. Amini, R. C. Hendriks, R. Heusdens, M. Guo, J. Jensen. IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 28, no. 1, pp. 1-12, January 2020.
  • Adversarial Example Detection by Classification for Deep Speech Recognition. S. Samizade, Z.-H. Tan, C. Shen, X. Guan. Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP). Accepted.
  • A Neural Network for Monaural Intrusive Speech Intelligibility Prediction. M. B. Pedersen, A. H. Andersen, S. H. Jensen, J. Jensen. Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP). Accepted.
  • Maximum Likelihood Estimation of the Interference-plus-noise Cross Power Spectral Density Matrix for Own Voice Retrieval. P. Hoang, Z.-H. Tan, T. Lunner, J. M. de Haan, J. Jensen. Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP). Accepted.
  • A Constrained Maximum Likelihood Estimator of Speech and Noise Spectra with Application to Multi-Microphone Noise Reduction. A. Zahedi, M. S. Pedersen, J. Østergaard, L. Bramsløw, T. U. Christiansen, J. Jensen. Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP). Accepted.
  • Robust Joint Estimation of Multimicrophone Signal Model Parameters. A. Koutrouvelis, R. Hendriks, R. Heusdens, J. Jensen. Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP). Accepted.
  • Rate-Constrained Noise Reduction in Wireless Acoustic Sensor Networks. J. Amini, R. C. Hendriks, R. Heusdens, M. Guo, J. Jensen. Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP). Accepted.
  • The Exponential Distribution in Rate Distortion Theory: The Case of Compression with Independent Encodings. U. Erez, J. Østergaard, and R. Zamir. Proc. IEEE Data Compression Conference, 2020. Accepted.

[December 2019] PhD researcher Daniel Michelsanti attended the Deep Learning Barcelona Symposium (DLBCN 2019) presenting the following article:

[November 2019] Members of CASPR are part of two new industrial PhD projects.

  • Cortical tracking of auditory object-based selective attention to improve perceived sound quality. PhD student: Adele Simon.
    Company: B&O
  • Informed adaptive multi-microphone pre-processing based speech enhancement for wireless speech communication. PhD student: Andreas Jonas Fuglsig.
    Company: RTX A/S

[November 2019] Papers of CASPR members:

  • Zero-delay multiple descriptions of stationary scalar Gauss-Markov sources. A. Fuglsig, J. Østergaard. Entropy, MDPI, 2019. Accepted.
  • Robust Bayesian and Maximum a Posteriori Beamforming for Hearing Assistive Devices. P. Hoang, Z.-H. Tan, J. M. de Haan, T. Lunner and J. Jensen. The 7th IEEE Global Conference on Signal and Information Processing (GlobalSIP 2019), Nov. 11-14, 2019, Shaw Centre, Ottawa, Canada.

[November 2019] Prof. (MSO) Jan Østergaard is appointed Head of the Section on Signal and Information Processing, Department of Electronic Systems, Aalborg University.

[November 2019] Prof. Zheng-Hua Tan is elected as Vice Chair of the Machine Learning for Signal Processing Technical Committee (MLSP TC) of the IEEE Signal Processing Society for 2020 and shall become Chair of the TC for the term of 2021-2022. The MLSP TC promotes activities in machine learning for signal processing, organizes MLSP sessions at ICASSP and runs the annual MLSP workshops. MLSP is a fast-growing community, as exemplified by as many as 26 MLSP sessions at ICASSP 2019, the IEEE flagship conference in signal processing. MLSP 2018 was held in Aalborg with Zheng-Hua Tan as the general chair.

[November 2019] Prof. Zheng-Hua Tan is appointed as an Associate Editor for IEEE/ACM Transactions on Audio, Speech and Language Processing for a three-year term.

[November 2019] CASPR co-organised an Open Lab together with the Section on Signal and Information Processing, Danish Sound Network, and Brains Business. The event was a success, with more than 50 participants.

[October, 2019] Papers of CASPR members:

  • Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect. D. Michelsanti, Z.-H. Tan, S. Sigurdsson, J. Jensen. Speech Communication, Elsevier. Accepted October 2019.
  • Soft Dropout and Its Variational Bayes Approximation. J. Xie, Z. Ma, G. Zhang, J.-H. Xue, Z.-H. Tan and J. Guo. 2019 IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2019), Oct. 13–16, 2019, Pittsburgh, PA, USA.

[October 2019] Save the date. On Thursday November 7th from 9.00 – 12.00 we are organising an Open House in CASPR together with the Section on Signal and Information Processing, where CASPR is anchored.

Experience state-of-the-art acoustic laboratories and setups such as spatial sound analysis and synthesis for making a room sound like a car or a church, and listen to sound zone setups, where the sound field is controlled to allow different music to be played in different parts of the same room. Catch up on the latest research in artificial intelligence and sound for hearing assistive devices from the Centre for Acoustic Signal Processing Research (CASPR).

The day will be spiced up with lab tours, demonstrations, and poster sessions on our broad research areas such as deep learning for speech enhancement, speaker separation, and keyword spotting, audio-visual hearing assistive devices, speech intelligibility and prediction, low frequency chamber and anechoic room demonstrations, reconfigurable hardware, context-aware recommendations, user experience design, better hearing and rehabilitation, sound zones setups and simulators, and a lot more.

The address is: Fredrik Bajers Vej 7A, 9220 Aalborg Øst. All are welcome. Registration, coffee, and networking start at 8.30. For more information, contact Prof. MSO Jan Østergaard.

[October, 2019] CASPR members participated in the Danish Sound Day 2019.

  • Poul Hoang participated in the Research Pitch Battle with his work on User-symbiotic speech enhancement for hearing aids.
  • Adel Zahedi presented a poster on his work on Brain inspired jointly optimal hearing loss compensation and noise reduction for hearing assistive devices.
  • Iván López-Espejo presented a poster on his work on Low-resource keyword spotting for hearing assistive devices.

[September 2019] Postdoctoral researcher Iván López Espejo attended the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019) presenting the following article:

  • Keyword Spotting for Hearing Assistive Devices Robust to External Speakers. I. López-Espejo, Z.-H. Tan and J. Jensen. 10.21437/Interspeech.2019

[September 2019] Papers of CASPR members:

  • Deep Joint Embeddings of Context and Content for Recommendation. M. S. Kristoffersen, J. L. Wieland, S. E. Shepstone, Z.-H. Tan and V. Vinayagamoorthy. CARS 2.0 – Workshop on Context-Aware Recommender Systems, in conjunction with RecSys’ 2019, 20 September 2019, Copenhagen, Denmark.

[September 2019] Ph.D. student Juan Manuel Martín Doñas from the University of Granada, Spain, is visiting CASPR for three months in the period September to November 2019. Juan Manuel will be working on new ideas in multi-channel speech enhancement.

Juan M. Martín Doñas received the B.Sc. and the M.Sc. degrees in Telecommunications Engineering from the University of Granada, Spain, in 2015 and 2017, respectively. Since 2017, he has held an FPU fellowship to work towards the Ph.D. degree with the Department of Signal Theory, Telematics and Communications, University of Granada. He is a member of the research group Signal Processing, Multimedia Transmission and Speech/Audio Technologies (SigMAT). His research interests include speech enhancement, multi-channel signal processing and deep learning applied to speech research.

[September 2019] Morten Kolbæk’s research on intelligent hearing aids has been documented in a podcast and an article (in Danish).

[September 2019] On September 1st 2019, Morten Østergaard Nielsen started a PhD Project entitled “Training Methods for DNNs Under Computational Resource Constraints”, supervised by CASPR members.

[August 2019] The fifth Newsletter from CASPR has been released: CASPR Newsletter_August2019

[June, 2019] Papers of CASPR members:

  • rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method. Z.-H. Tan, A. Sarkar, and N. Dehak, accepted by Computer Speech and Language. Source code:
  • Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification. A. Sarkar, Z.-H. Tan, H. Tang, S. Shon, and J. Glass, IEEE/ACM Transactions on Audio, Speech and Language Processing, vol. 27, no. 8, pp. 1267-1279, August 2019.

[May, 2019] J. Østergaard has received a Project 2 research grant from the Danish Council for Independent Research for the project entitled “Effortless Hearing in Noise by Brain Feedback”. The project is within the interest-sphere of CASPR.

[May, 2019] A new demo about deep-learning-based audio-visual speech enhancement in presence of Lombard effect is available on the demo section.

[May, 2019] PhD researcher Daniel Michelsanti attended the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019) presenting the following articles:

  • On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement. D. Michelsanti, Z.-H. Tan, S. Sigurdsson and J. Jensen. 10.1109/ICASSP.2019.8682790
  • Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems. D. Michelsanti, Z.-H. Tan, S. Sigurdsson and J. Jensen. 10.1109/ICASSP.2019.8682713

[May, 2019] CASPR has a new fully funded PhD position available within “Processing of Sound Signals in Noise for Hearing Assistive Devices”. In this PhD project, model-driven and/or data-driven statistical methods will be exploited in order to process noisy speech and audio signals. The model-driven approach would be based on an interplay between information theoretical and human perceptual models. The data-driven approach could be using deep neural networks.

The successful applicant must have a Master’s degree in Statistical Signal Processing, Information Theory, Estimation Theory, Machine Learning or Auditory Processing, and have extensive knowledge of one or more of these disciplines. Excellent undergraduate and Master’s degree grades are desired. A high level of written and spoken English is also expected.

To apply for this PhD position, please see details on the following page: PhD position

[April, 2019] CASPR PhD student Poul Hoang won the 3-minute pitch competition at the industrial PhD course held by Innovation Fund Denmark and Copenhagen Business School in Spring 2019. The pitch presented the intended value and impact of the industrial PhD project to a wider audience.

[March, 2019] Papers of CASPR members:

  • Sound Quality Improvement for Hearing Aids in the Presence of Multiple Inputs. A. Kar, A. Anand, J. Østergaard, S.H. Jensen, M.N.S. Swamy. Circuits, Systems, and Signal Processing, Springer, Accepted March 2019.
  • Information Loss in the Human Auditory System. M. Z. Jahromi, A. Zahedi, J. Jensen, and J. Østergaard, IEEE Trans. Audio, Speech, Language Process., Vol. 27, No. 3, pp. 472-481, March 2019.
  • Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems. D. Michelsanti, Z.-H. Tan, S. Sigurdsson and J. Jensen, 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019), Brighton, UK, May 12-17, 2019.
  • On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement. D. Michelsanti, Z.-H. Tan, S. Sigurdsson and J. Jensen, 2019 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2019), Brighton, UK, May 12-17, 2019.

[February, 2019] Papers of CASPR members:

  • On the Relationship between Short-Time Objective Intelligibility and Short-Time Spectral-Amplitude Mean-Square Error for Speech Enhancement. M. Kolbæk, Z.-H. Tan and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 27, No. 2, pp. 283-295, February 2019.
  • A Convex Approximation of the Relaxed Binaural Beamforming Optimization Problem. A. I. Koutrouvelis, R. C. Hendriks, R. Heusdens, and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 27, No. 2, pp. 321-331, February 2019.
  • Subjective Annotations for Vision-Based Attention Level Estimation. A. Coifman, P. Rohoska, M.S. Kristoffersen, S.E. Shepstone, and Z.-H. Tan, The 14th International Conference on Computer Vision Theory and Applications (VISAPP 2019), Prague, Czech Republic, 25-27 February 2019.

[February, 2019] New researcher in CASPR:

  • In February 2019, Adel Zahedi started an industrial postdoc project entitled “Brain-Inspired Jointly Optimal Hearing Loss Compensation and Noise Reduction for Hearing Assistive Devices” in a collaboration between CASPR and Oticon.

[January, 2019] Papers of CASPR members:

  • Asymmetric Coding for Rate-Constrained Noise Reduction in Binaural. J. Amini, R. C. Hendriks, R. Heusdens, M. Guo, and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 27, No. 1, pp. 154-167, January 2019.

[January, 2019] New researcher in CASPR:

  • In January 2019, Iván López Espejo started a postdoc project entitled “Low-Resource Keyword Spotting for Hearing Assistive Devices” at CASPR.

[January, 2019] CASPR members in the press:

  • “Kunstige neurale netværk skal gøre livet lettere for høreapparatbrugere (Artificial neural networks will make life easier for hearing aid users)”, covered by a number of news outlets: Dagbladet Ringsted, Sjællandske Næstved, DR P4 Nordjylland Nyheder, DR P4 Radioavisen, Nordvestnyt Holbæk/Odsherred, Ritzaus Bureau, Dagbladet Roskilde, Nordvestnyt Kalundborg, Dagbladet Køge, Sjællandske Slagelse, Flensborg Avis.

[November, 2018] Papers of CASPR members:

  • Mean Square Performance Evaluation in Frequency Domain for an Improved Adaptive Feedback Cancellation in Hearing Aids. A. Kar, A. Anand, J. Østergaard, S.H. Jensen, and M.N.S. Swamy. Accepted for publication in Signal Processing, Elsevier Journal, 2019.
  • Information Loss in the Human Auditory System. M. Z. Jahromi, A. Zahedi, J. Jensen, and J. Østergaard. Accepted for publication in IEEE Trans. Audio, Speech, Language Process., 2018.

[October, 2018] Papers of CASPR members:

  • Public Perception of Android Robots: Indications from an Analysis of YouTube Comments. E. Vlachos and Z.-H. Tan, the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018), Madrid, Spain, 1-5 October 2018.
  • Non-Intrusive Speech Intelligibility Prediction using Convolutional Neural Networks. A. H. Andersen, J. M. de Haan, Z.-H. Tan, and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 26, No. 10, pp. 1925-1939, Oct. 2018.

[September, 2018] Prof. Wai-Yip Geoffrey Chan from Queen’s University, Kingston, Canada is visiting CASPR for four months in the period September to December 2018. Prof. Chan will be working on new ideas in deep learning for speech intelligibility enhancement.

Wai-Yip Geoffrey Chan received the B.Eng. and M.Eng. degrees from Carleton University, Ottawa, and the Ph.D. degree from the University of California at Santa Barbara, CA, USA, all in electrical engineering. He has held positions with the Communications Research Centre, Bell Northern Research (Nortel), McGill University, and the Illinois Institute of Technology. He is currently a Professor with the Department of Electrical and Computer Engineering, Queen’s University, Canada. His research interests are in communications, speech, and multimedia signal processing and coding. He was an Associate Editor of the IEEE/ACM Transactions on Audio, Speech, and Language Processing. He is an Associate Editor of the EURASIP Journal on Audio, Speech, and Music Processing. He has helped organize several IEEE sponsored conferences on communications, speech coding, and image processing. He received the CAREER Award from the U.S. National Science Foundation.

[September, 2018] New researchers in CASPR:

  • In September 2018, Morten Kolbæk started a postdoc project entitled “Postdoc Project: Intelligibility-Aware Hearing Assistive Devices – Intelligibility Enhancement” supervised by CASPR members.
  • In September 2018, Mathias Bach Pedersen started a PhD Project entitled “Intelligibility-Aware Hearing Assistive Devices – Intelligibility Prediction”, supervised by CASPR members.

[September, 2018] Papers of CASPR members:

  • Refinement and validation of the binaural short time objective intelligibility measure for spatially diverse conditions. A. H. Andersen, J. M. de Haan, Z.-H. Tan, and J. Jensen, Speech Communication, Vol. 102, pp. 1-13, Sept. 2018.

[August, 2018] New researcher in CASPR:

  • In August 2018, Poul Hoang started a PhD project entitled “User-Symbiotic Speech Enhancement for Hearing Aid Systems” in a collaboration between CASPR and Oticon.

[August, 2018] Papers of CASPR members:

  • Multi-Task Adversarial Network Bottleneck Features for Noise-Robust Speaker Verification. H. Yu, T. Hu, Z. Ma, Z.-H. Tan and J. Guo, IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC 2018), Guiyang, China, August 22 – 24, 2018.
  • The Sound or Silence: investigating the influence of robot noise on proxemics. G. Trovato, R. Paredes, J. Balvin, F. Cuellar, N.B. Thomsen, S. Bech, and Z.-H. Tan, the 27th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2018, Nanjing and Tai’an, China, 27-31 August 2018.

[July, 2018] The third Newsletter from CASPR has been released:

[July, 2018] Papers of CASPR members:

  • Zero-Delay Rate Distortion via Filtering for Vector-Valued Gaussian Sources. P. A. Stavrou, J. Østergaard, and C. Charalambous. IEEE Journal of Selected Topics in Signal Processing, July, 2018.

[June, 2018] CASPR has a new fully funded PhD stipend available within Signal Quality Estimation for Speech Enhancement using Miniature EEG Devices.

The main objective of this PhD project is to estimate the perceived speech or sound quality from EEG signals recorded by in-ear and around-the-ear EEG devices. Such compact EEG devices may be integrated into various hearing assistive devices (HADs), for example to help guide the signal processing in the HADs.

In this PhD project, a signal processing and information theoretic approach will be pursued, which involves the use of recent results on information losses in the human auditory system, fundamental information flows in the EEG signals, and variants of transfer entropy.

The stipend is open for appointment from 1 September 2018, or as soon as possible thereafter.

To apply for the position please use the following link:

[June, 2018] Papers of CASPR members:

  • Refinement and Validation of the Binaural Short Time Objective Intelligibility Measure for Spatially Diverse Conditions. A.H. Andersen, J.M. de Haan, Z.-H. Tan and J. Jensen, accepted by Speech Communication.
  • Non-Intrusive Speech Intelligibility Prediction using Convolutional Neural Networks. A.H. Andersen, J.M. de Haan, Z.-H. Tan and J. Jensen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing.
  • Effectiveness of Single-Channel BLSTM Enhancement for Language Identification. P.S. Frederiksen, J. Villalba, S. Watanabe, Z.-H. Tan and N. Dehak, accepted by Interspeech 2018, Hyderabad, India, September 2-6, 2018.

[May, 2018] Papers of CASPR members:

  • A Spatial Self-Similarity Based Feature Learning Method for Face Recognition under Varying Poses. X. Duan and Z.-H. Tan, accepted by Pattern Recognition Letters, 2018.

[March, 2018] Papers of CASPR members:

  • Bias-compensated Informed Sound Source Localization Using Relative Transfer Functions. M. Farmani, M. S. Pedersen, Z.-H. Tan, and J. Jensen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2018.

[February, 2018] Papers of CASPR members:

  • Using Closed-set Speaker Identification Score Confidence to Enhance Audio-based Collaborative Filtering for Multiple Users. S.E. Shepstone, Z.-H. Tan and M.S. Kristoffersen, accepted by IEEE Transactions on Consumer Electronics, 2018.
  • Evaluation and Comparison of Late Reverberation Power Spectral Density Estimators. S. Braun, A. Kuklasinski, O. Schwartz, O. Thiergart, E.A.P. Habets, S. Gannot, S. Doclo, and J. Jensen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2018.

[January, 2018] Papers of CASPR members:

  • Monaural Speech Enhancement Using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure. M. Kolbæk, Z.-H. Tan and J. Jensen, The 43rd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), 15-20 April 2018, Calgary, Alberta, Canada.
  • Fixed-Rate Zero-Delay Source Coding for Stationary Vector-Valued Gauss-Markov Sources. P. A. Stavrou and J. Østergaard, IEEE Data Compression Conference (DCC), March 2018.
  • A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments. R. Peng, Z.-H. Tan, X. Li, and C. Zheng, accepted by Speech Communication, 2017.

[December, 2017] The second Newsletter from CASPR has been released: CASPR_Newsletter_December17.pdf

[November, 2017] Prof. Zheng-Hua Tan gave his Inaugural Lecture as Professor of Machine Learning and Speech Processing at Aalborg University. Title of the lecture: “Deep Learning for Signals and Data: Intelligent Machines are Reshaping Our World”.

[November, 2017] Papers of CASPR members:

  • An Upper Bound to Zero-Delay Rate Distortion via Kalman Filtering for Vector Gaussian Sources. P. A. Stavrou, J. Østergaard, C. Charalambous, and M. Derpich. Proceedings of the IEEE Information Theory Workshop, Kaohsiung, Taiwan, 2017.
  • Spoofing Detection in Automatic Speaker Verification Systems Using DNN Classifiers and Dynamic Acoustic Features. H. Yu, Z.-H. Tan, Z. Ma, R. Martin, and J. Guo, accepted by IEEE Transactions on Neural Networks and Learning Systems, 2017.
  • Time-Contrastive Learning Based DNN Bottleneck Features for Text-Dependent Speaker Verification. A.K. Sarkar and Z.-H. Tan, NIPS 2017 Time Series Workshop, Long Beach, CA, USA, Dec. 8, 2017.

[September, 2017] New researcher in CASPR:

  • In September 2017, Daniel Michelsanti started a PhD project entitled “Audio-Visual Speech Enhancement for Hearing Assistive Devices” at CASPR.

[September, 2017] Papers of CASPR members:

  • Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones. M. Sahidullah, D.A.L. Thomsen, R.G. Hautamaki, T. Kinnunen, Z.-H. Tan, R. Parts, M. Pitkanen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2017.
  • iSocioBot – A Multimodal Interactive Social Robot. Z.-H. Tan, N.B. Thomsen, X. Duan, E. Vlachos, S.E. Shepstone, M.H. Rasmussen and J.L. Højvang, accepted by International Journal of Social Robotics, 2017.
  • Weighted Score Based Fast Converging CO-training with Application to Audio-Visual Person Identification. X. Duan, N.B. Thomsen, Z.-H. Tan, B. Lindberg and S.H. Jensen, The 29th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2017), Boston, Massachusetts, USA, Nov. 6-8, 2017.
  • An Upper Bound to Zero-Delay Rate Distortion via Kalman Filtering for Vector Gaussian Sources. P. A. Stavrou, J. Østergaard, C. Charalambous, and M. Derpich. Proceedings of the IEEE Information Theory Workshop, Kaohsiung, Taiwan, 2017.

[August, 2017] Papers of CASPR members:

  • Incorporating Pass-Phrase Dependent Background Models for Text-Dependent Speaker Verification. A. Sarkar and Z.-H. Tan, accepted by Computer Speech & Language, 2017.
  • Latent Dirichlet Mixture Model. J.-T. Chien, C.-H. Lee and Z.-H. Tan, accepted by Neurocomputing, 2017.
  • Visual Detection of Events of Interest from Urban Activity. S. Astaras, A. Pnevmatikakis and Z.-H. Tan, accepted by Wireless Personal Communications, 2017.
  • Joint Separation and Denoising of Noisy Multi-Talker Speech Using Recurrent Neural Networks and Permutation Invariant Training. M. Kolbæk, D. Yu, Z.-H. Tan and J. Jensen, accepted by the IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), Tokyo, Japan, 25-28 September 2017.

[July, 2017] We are happy to announce that registration for the Winter School on Signal Processing for Hearing Assistive Devices is now open. The Winter School takes place at Aalborg University on 6-10 November, 2017.

Registration fee:
PhD students: 1500,- DKK
Industry: 8000,- DKK

To register for the Winter School please use the following link:

Important: The link to payment can be found under the section Course Fee in the above link.

A description of the course can be found on the CASPR website:

[July, 2017] Papers of CASPR members:

  • Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks. M. Kolbæk, D. Yu, Z.-H. Tan and J. Jensen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2017.

[9th June, 2017]
The first Newsletter from CASPR has been released: CASPR_Newsletter_June17

[June, 2017] Papers of CASPR members to appear at Interspeech 2017:

  • Humans do not maximize the probability of correct decision when recognizing DANTALE words in noise. Z. Jahromi, J. Østergaard, and J. Jensen, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  • On the use of Band Importance Weighting in the Short-Time Objective Intelligibility Measure. A.H. Andersen, J.M. de Haan, Z.-H. Tan and J. Jensen, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  • Adversarial Network Bottleneck Features for Noise Robust Speaker Verification. H. Yu, Z.-H. Tan, Z. Ma and J. Guo, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  • Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification. D. Michelsanti and Z.-H. Tan, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  • Improving Speaker Verification Performance in Presence of Spoofing Attacks Using Out-of-Domain Spoofed Data. A. Sarkar, Md Sahidullah, Z.-H. Tan and T. Kinnunen, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.

[May, 2017] J. Jensen of CASPR has received a research grant from the Danish Council for Independent Research for the project entitled “Intelligibility-Aware Hearing Assistive Devices”, which is in the interest-sphere of CASPR. The project involves a 3-year PhD track and a 3-year postdoc track.

Abstract: Hearing assistive devices, such as headsets for speech communication in noisy environments, hearing aid systems, and cochlear implants, aim at improving the speech intelligibility (SI) for the user. To do so, the hearing assistive devices process the acoustic signals before they are presented to the ears of the user. The research project explores deep-learning-based methods for predicting the SI experienced by the user in a given acoustic situation (the PhD track) and enhancing the SI by processing the microphone signals before they are presented to the ears of the user (the postdoc track). The project will take place at the Section for Signal and Information Processing (SIP), Department of Electronic Systems, Aalborg University.

For more information on the open PhD and postdoc positions, please contact Professor Jesper Jensen.

[7th February, 2017] On February 2nd, Professor Patrick Naylor from Imperial College London visited our group to discuss future research collaboration and to give a presentation entitled “Measurement and Exploitation of Reverberation in Speech signals”.

[March, 2017] Papers of CASPR members to appear at IEEE Data Compression Conference 2017:

  • An Asymmetric Difference Multiple Description Gaussian Noise Channel. J. Østergaard, Y. Kochman, and R. Zamir, IEEE Data Compression Conference, April, 2017.

[20th January, 2017] In connection with the official opening of our new research centre within the area of acoustic signal processing, the Centre for Acoustic Signal Processing Research (CASPR), you are invited to an afternoon event with technical presentations, demos, and lab tours at the Section for Signal and Information Processing (SIP), Department of Electronic Systems, Aalborg University.

Date: March 2, 2017.
Place: Aalborg Universitet, Fredrik Bajers Vej 7, Room A4-108.
Free registration. Use the link to sign up: Sign Up
Technical program can be downloaded here: invitation

[4th January, 2017] A paper co-authored by J. Jensen of CASPR received an IEEE Signal Processing Society Best Paper Award.
An Algorithm for Intelligibility Prediction of Time–Frequency Weighted Noisy Speech, C. H. Taal, R. C. Hendriks, R. Heusdens and J. Jensen, IEEE Transactions on Audio, Speech, and Language Processing, Volume 19, No. 7, September 2011.
For this award, papers in a 5-year window spanning 2011 to 2016 are considered. The award honors the author(s) of a paper of exceptional merit dealing with a subject related to the Society’s technical scope, and appearing in one of the Society’s solely owned transactions or the Journal of Selected Topics in Signal Processing, irrespective of the author’s age.

[15th December, 2016] We will be organizing a Winter School on Signal Processing for Hearing Assistive Devices at Aalborg University, November 6 – 10, 2017. For details see:

[December, 2016] Papers of CASPR members to appear at ICASSP 2017:

  • Permutation Invariant Training of Deep Models for Speaker-Independent Multi-Talker Speech Separation. D. Yu, M. Kolbæk, Z.-H. Tan, J. Jensen, Proc. International Conf. Acoustics, Speech, Signal Proc. (ICASSP), 2017.
  • A Non-Intrusive Short-Time Objective Intelligibility Measure. A. H. Andersen, J. M. de Haan, Z.-H. Tan, and J. Jensen, Proc. International Conf. Acoustics, Speech, Signal Proc. (ICASSP), 2017.
  • RedDots Replayed: A New Replay Spoofing Attack Corpus for Text-dependent Speaker Verification Research. T. Kinnunen, M. Sahidullah, M. Falcone, L. Costantini, R. Hautamaki, D. Thomsen, A. Sarkar, Z.-H. Tan, H. Delgado, M. Todisco, N. Evans, V. Hautamaki, and K.A. Lee, Proc. International Conf. Acoustics, Speech, Signal Proc. (ICASSP), 2017.