PUBLICATIONS

Members of CASPR have been involved in the research documented in the following scientific publications:

Journal Papers

  1. M. Z. Jahromi, A. Zahedi, J. Jensen, and J. Østergaard, Information Loss in the Human Auditory System, IEEE Trans. Audio, Speech, Language Process., 2018, Accepted.
  2. Zero-Delay Rate Distortion via Filtering for Vector-Valued Gaussian Sources. P. A. Stavrou, J. Østergaard, and C. Charalambous. IEEE Journal of Selected Topics in Signal Processing, 2018. Accepted.
  3. Asymmetric Coding for Rate-Constrained Noise Reduction in Binaural Hearing Aids. J. Amini, R. C. Hendriks, R. Heusdens, M. Guo, and J. Jensen. IEEE Trans. Audio, Speech, Language Process., 2018. Accepted.
  4. Refinement and Validation of the Binaural Short Time Objective Intelligibility Measure for Spatially Diverse Conditions. A.H. Andersen, J.M. de Haan, Z.-H. Tan and J. Jensen. Elsevier Speech Communication, Vol. 102, pp. 1-13, Sept. 2018.
  5. Non-Intrusive Speech Intelligibility Prediction using Convolutional Neural Networks. A.H. Andersen, J.M. de Haan, Z.-H. Tan and J. Jensen. IEEE/ACM Transactions on Audio, Speech and Language Processing, Vol. 26, No. 10, pp. 1925-1939, Oct. 2018.
  6. A Spatial Self-Similarity Based Feature Learning Method for Face Recognition under Varying Poses. X. Duan and Z.-H. Tan, accepted by Pattern Recognition Letters, 2018.
  7. Bias-compensated Informed Sound Source Localization Using Relative Transfer Functions. M. Farmani, M. S. Pedersen, Z.-H. Tan, and J. Jensen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2018.
  8. Using Closed-set Speaker Identification Score Confidence to Enhance Audio-based Collaborative Filtering for Multiple Users. S.E. Shepstone, Z.-H. Tan and M.S. Kristoffersen, accepted by IEEE Transactions on Consumer Electronics, 2018.
  9. Evaluation and Comparison of Late Reverberation Power Spectral Density Estimators. S. Braun, A. Kuklasinski, O. Schwartz, O. Thiergart, E.A.P. Habets, S. Gannot, S. Doclo, and J. Jensen. Accepted in IEEE/ACM Transactions on Audio, Speech and Language Processing, 2018.
  10. A Perceptually Motivated LP Residual Estimator in Noisy and Reverberant Environments. R. Peng, Z.-H. Tan, X. Li, and C. Zheng, accepted by Speech Communication, 2017.
  11. Spoofing Detection in Automatic Speaker Verification Systems Using DNN Classifiers and Dynamic Acoustic Features. H. Yu, Z.-H. Tan, Z. Ma, R. Martin, and J. Guo, accepted by IEEE Transactions on Neural Networks and Learning Systems, 2017.
  12. Robust Voice Liveness Detection and Speaker Verification Using Throat Microphones. M. Sahidullah, D.A.L. Thomsen, R.G. Hautamaki, T. Kinnunen, Z.-H. Tan, R. Parts, M. Pitkanen, accepted by IEEE/ACM Transactions on Audio, Speech and Language Processing, 2017.
  13. iSocioBot – A Multimodal Interactive Social Robot. Z.-H. Tan, N.B. Thomsen, X. Duan, E. Vlachos, S.E. Shepstone, M.H. Rasmussen and J.L. Højvang, accepted by International Journal of Social Robotics, 2017.
  14. Incorporating Pass-Phrase Dependent Background Models for Text-Dependent Speaker Verification. A. Sarkar and Z.-H. Tan, accepted by Computer Speech & Language, 2017.
  15. Latent Dirichlet Mixture Model. J.-T. Chien, C.-H. Lee and Z.-H. Tan, accepted by Neurocomputing, 2017.
  16. Visual Detection of Events of Interest from Urban Activity. S. Astaras, A. Pnevmatikakis and Z.-H. Tan, accepted by Wireless Personal Communications, 2017.
  17. Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks. M. Kolbæk, D. Yu, Z.-H. Tan and J. Jensen, IEEE Transactions on Audio, Speech and Language Processing, Vol. 25, No. 10, pp. 1901-1913, 2017.
  18. DNN Filter Bank Cepstral Coefficients for Spoofing Detection. H. Yu, Z.-H. Tan, Y. Zhang, Z. Ma, and J. Guo, IEEE Access, to appear, 2017.
  19. Informed Sound Source Localization Using Relative Transfer Functions for Hearing Aid Applications. M. Farmani, M. S. Pedersen, Z.-H. Tan and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 25, No. 3, pp. 611-623, 2017.
  20. Decorrelation of Neutral Vector Variables: Theory and Applications. Z. Ma, J.-H. Xue, A. Leijon, Z.-H. Tan, Z. Yang, and J. Guo, IEEE Transactions on Neural Networks and Learning Systems. To appear.
  21. Audio-based Granularity-adapted Emotion Classification. S.E. Shepstone, Z.-H. Tan, and S.H. Jensen, IEEE Transactions on Affective Computing. To appear.
  22. Text-Independent Speaker Identification Using the Histogram Transform Model. Z. Ma, H. Yu, Z.-H. Tan, and J. Guo, IEEE Access. To appear.
  23. Multi-channel Wiener filters in binaural and bilateral hearing aids – speech intelligibility improvement and robustness to DoA errors. A. Kuklasiński and J. Jensen, Journal of the Audio Engineering Society, Vol. 25, No. 1/2, pp. 8 – 16, 2017.
  24. Relaxed Binaural LCMV Beamforming. A. I. Koutrouvelis, R. C. Hendriks, R. Heusdens and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 25, No. 1, pp. 133 – 148, 2017.
  25. Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems. M. Kolbæk, Z.-H. Tan and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 25, No. 1, pp. 149 – 163, 2017.
  26. Source Coding in Networks with Covariance Distortion Constraints. A. Zahedi, J. Østergaard, S.H. Jensen, P. Naylor, and S. Bech, IEEE Transactions on Signal Processing, Vol. 64, Issue 22, pp. 5943 – 5958, November 2016.
  27. An Algorithm for Predicting the Intelligibility of Speech Masked by Modulated Noise Maskers. J. Jensen and C. H. Taal, IEEE Trans. Audio, Speech, Language Process., Vol. 24, No. 11, pp. 2009 – 2022, 2016. Matlab code.
  28. Predicting the Intelligibility of Noisy and Nonlinearly Processed Binaural Speech. A. H. Andersen, Z.-H. Tan, J. M. de Haan, and J. Jensen, IEEE Trans. Audio, Speech, Language Process., Vol. 24, No. 11, pp. 1908 – 1920, 2016.

Conference Papers

  1. Public Perception of Android Robots: Indications from an Analysis of YouTube Comments. E. Vlachos and Z.-H. Tan, the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2018), Madrid, Spain, 1-5 October 2018.
  2. Multi-Task Adversarial Network Bottleneck Features for Noise-Robust Speaker Verification. H. Yu, T. Hu, Z. Ma, Z.-H. Tan and J. Guo, IEEE International Conference on Network Infrastructure and Digital Content (IC-NIDC 2018), Guiyang, China, August 22 – 24, 2018.
  3. The Sound or Silence: investigating the influence of robot noise on proxemics. G. Trovato, R. Paredes, J. Balvin, F. Cuellar, N.B. Thomsen, S. Bech, and Z.-H. Tan, the 27th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2018, Nanjing and Tai’an, China, 27-31 August 2018.
  4. Effectiveness of Single-Channel BLSTM Enhancement for Language Identification. P.S. Frederiksen, J. Villalba, S. Watanabe, Z.-H. Tan and N. Dehak, accepted by Interspeech 2018, Hyderabad, India, September 2-6, 2018.
  5. M. Farmani, M. S. Pedersen, and J. Jensen, Sound Source Localization for Hearing Aid Applications using Wireless Microphones, Accepted for IEEE Sensor Array and Multichannel Signal Processing Workshop, 2018.
  6. J. Amini, R. C. Hendriks, R. Heusdens, M. Guo and J. Jensen, Operational Rate-Constrained Noise Reduction for Generalized Binaural Hearing Aid Setups, 2018 Symposium on Information Theory and Signal Processing in the Benelux.
  7. A. Koutrouvelis, R.C. Hendriks, R. Heusdens, S. van de Par, J. Jensen, and M. Guo, Evaluation of Binaural Noise Reduction Methods in Terms of Intelligibility and Perceived Localization, Accepted for European Signal Processing Conference, 2018.
  8. J. Amini, R. C. Hendriks, R. Heusdens, M. Guo, and J. Jensen, Operational Rate-Constrained Beamforming in Binaural Hearing Aids, Accepted for European Signal Processing Conference, 2018.
  9. On Zero-Delay Source Coding of LTI Gauss-Markov Systems with Covariance Matrix Distortion Constraints. P. Stavrou, J. Østergaard, M. Skoglund, The European Control Conference (ECC), June 2018.
  10. Monaural Speech Enhancement Using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure. M. Kolbæk, Z.-H. Tan and J. Jensen, The 43rd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018), 15-20 April 2018, Calgary, Alberta, Canada.
  11. Fixed-Rate Zero-Delay Source Coding for Stationary Vector-Valued Gauss-Markov Sources. P. A. Stavrou and J. Østergaard, IEEE Data Compression Conference (DCC), March 2018.
  12. Time-Contrastive Learning Based DNN Bottleneck Features for Text-Dependent Speaker Verification. A.K. Sarkar and Z.-H. Tan, NIPS 2017 Time Series Workshop, Long Beach, CA, USA, Dec. 8, 2017.
  13. Weighted Score Based Fast Converging CO-training with Application to Audio-Visual Person Identification. X. Duan, N.B. Thomsen, Z.-H. Tan, B. Lindberg and S.H. Jensen, The 29th IEEE International Conference on Tools with Artificial Intelligence (ICTAI2017), Boston, Massachusetts, USA, Nov. 6-8, 2017.
  14. An Upper Bound to Zero-Delay Rate Distortion via Kalman Filtering for Vector Gaussian Sources. P. A. Stavrou, J. Østergaard, C. Charalambous, and M. Derpich. Proceedings of the IEEE Information Theory Workshop, Kaohsiung, Taiwan, 2017.
  15. Joint Separation and Denoising of Noisy Multi-Talker Speech Using Recurrent Neural Networks and Permutation Invariant Training, M. Kolbæk, D. Yu, Z.-H. Tan and J. Jensen, accepted by the IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), Tokyo, Japan, 25-28 September 2017. Best student paper award.
  16. A lower bound to causal and zero delay rate distortion for scalar Gaussian autoregressive sources. P. Stavrou and J. Østergaard. Symposium on Information Theory and Signal Processing in the Benelux. Delft, The Netherlands, pp. 207 – 214, 2017.
  17. Humans do not maximize the probability of correct decision when recognizing DANTALE words in noise. M. Z. Jahromi, J. Østergaard, and J. Jensen, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  18. On the use of Band Importance Weighting in the Short-Time Objective Intelligibility Measure. A.H. Andersen, J.M. de Haan, Z.-H. Tan and J. Jensen, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  19. Adversarial Network Bottleneck Features for Noise Robust Speaker Verification. H. Yu, Z.-H. Tan, Z. Ma and J. Guo, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  20. Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification. D. Michelsanti and Z.-H. Tan, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  21. Improving Speaker Verification Performance in Presence of Spoofing Attacks Using Out-of-Domain Spoofed Data. A. Sarkar, Md Sahidullah, Z.-H. Tan and T. Kinnunen, Proc. Interspeech 2017, Stockholm, Sweden, 2017, to appear.
  22. A Lower Bound on the Causal and Zero Delay Rate Distortion Function for Scalar Gaussian Autoregressive Sources, P. A. Stavrou and J. Østergaard. Proceedings of the Symposium on Information Theory and Signal Processing in the Benelux, pp. 207 – 214, Vol. 2017, Delft, The Netherlands, May 2017.
  23. Permutation Invariant Training of Deep Models for Speaker-Independent Multi-Talker Speech Separation. D. Yu, M. Kolbæk, Z.-H. Tan, J. Jensen, Proc. International Conf. Audio, Speech, Signal Proc. (ICASSP), 2017.
  24. A Non-Intrusive Short-Time Objective Intelligibility Measure. A. H. Andersen, J. M. de Haan, Z.-H. Tan, and J. Jensen, Proc. International Conf. Audio, Speech, Signal Proc. (ICASSP), 2017.
  25. RedDots Replayed: A New Replay Spoofing Attack Corpus for Text-dependent Speaker Verification Research. T. Kinnunen, M. Sahidullah, M. Falcone, L. Costantini, R. Hautamaki, D. Thomsen, A. Sarkar, Z.-H. Tan, H. Delgado, M. Todisco, N. Evans, V. Hautamaki, and K.A. Lee, Proc. International Conf. Audio, Speech, Signal Proc. (ICASSP), 2017.
  26. An Asymmetric Difference Multiple Description Gaussian Noise Channel. J. Østergaard, Y. Kochman, and R. Zamir, IEEE Data Compression Conference, April, 2017.
  27. TDOA-based Self-Calibration of Dual-Microphone Arrays. M. Farmani, R. Heusdens, M. S. Pedersen, Z.-H. Tan and J. Jensen, Proc. 19th International Conference on Information Fusion (FUSION), pp. 1931 – 1936, 2016.
  28. Speech Enhancement Using Long Short-Term Memory Based Recurrent Neural Networks for Noise Robust Speaker Verification. M. Kolbæk, Z.-H. Tan, and J. Jensen, Proc. IEEE Spoken Language Technology Workshop, 2016.
  29. Further Optimisations of Constant Q Cepstral Processing for Integrated Utterance and Text-dependent Speaker Verification. H. Delgado, M. Todisco, M. Sahidullah, A. Sarkar, N. Evans, T. Kinnunen, and Z.-H. Tan, Proc. IEEE Spoken Language Technology Workshop, 2016.
  30. Two Asymmetric Descriptions from Many Symmetric Descriptions. A. Mashiach, Y. Kochman, J. Østergaard, and R. Zamir, International Conference on the Science of Electrical Engineering (ICSEE), 2016.
  31. Detection of Spoken Words in Noise: Comparison of Human Performance to Maximum Likelihood Detection. M. Z. Jahromi, J. Østergaard, and J. Jensen, IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2016.