Speech Communication, Volume 53
Volume 53, Number 1, January 2011
- Wooil Kim, Richard M. Stern: Mask classification for missing-feature reconstruction for robust speech recognition in unknown background noise. 1-11
- Monja A. Knoll, Lisa Scharrer, Alan Costall: "Look at the shark": Evaluation of student- and actress-produced standardised sentences of infant- and foreigner-directed speech. 12-22
- Anna Hjalmarsson: The additive effect of turn-taking cues in human and synthetic voice. 23-35
- Hiroki Mori, Tomoyuki Satake, Makoto Nakamura, Hideki Kasuya: Constructing a spoken dialogue corpus for studying paralinguistic information in expressive conversation and analyzing its statistical/acoustic characteristics. 36-50
- Anthony P. Stark, Kuldip K. Paliwal: Use of speech presence uncertainty with MMSE spectral energy estimation for robust automatic speech recognition. 51-61
- Martin Raab, Rainer Gruhn, Elmar Nöth: A scalable architecture for multilingual speech recognition on embedded devices. 62-74
- Linsen Loots, Thomas Niesler: Automatic conversion between pronunciations of different English accents. 75-84
- Alexandros Lazaridis, Iosif Mporas, Todor Ganchev, George K. Kokkinakis, Nikos Fakotakis: Improving phone duration modelling using support vector regression fusion. 85-97
- Prasanta Kumar Ghosh, Shrikanth S. Narayanan: Joint source-filter optimization for robust glottal source estimation in the presence of shimmer and jitter. 98-109
- Vijendra Raj Apsingekar, Phillip L. De Leon: Speaker verification score normalization using speaker model clusters. 110-118
- Man-Wai Mak, Wei Rao: Utterance partitioning with acoustic vector resampling for GMM-SVM speaker verification. 119-130
- Ali Alpan, Youri Maryn, Abdellah Kacha, Francis Grenez, Jean Schoentgen: Multi-band dysperiodicity analyses of disordered connected speech. 131-141
Volume 53, Number 2, February 2011
- Marijn Huijbregts, Franciska de Jong: Robust speech/non-speech classification in heterogeneous multimedia content. 143-153
- P. Krishnamoorthy, S. R. Mahadeva Prasanna: Enhancement of noisy speech by temporal and spectral processing. 154-174
- Ruili Wang, Jingli Lu: Investigation of golden speakers for second language learners from imitation preference perspective by voice modification. 175-184
- Bryce E. Lobdell, Jont B. Allen, Mark Hasegawa-Johnson: Intelligibility predictors and neural representation of speech. 185-194
- Abeer Alwan, Jintao Jiang, Willa S. Chen: Perception of place of articulation for plosives and fricatives in noise. 195-209
- Adam Borowicz, Alexander A. Petrovsky: Signal subspace approach for psychoacoustically motivated speech enhancement. 210-219
- Julia Feld, Mitchell Sommers: There goes the neighborhood: Lipreading and the structure of the mental lexicon. 220-228
- Peng Dai, Ing Yann Soon: A temporal warped 2D psychoacoustic modeling for robust speech recognition system. 229-241
- Geoffrey Stewart Morrison: A comparison of procedures for the calculation of forensic likelihood ratios from acoustic-phonetic data: Multivariate kernel density (MVKD) versus Gaussian mixture model-universal background model (GMM-UBM). 242-256
- Yun Lei, John H. L. Hansen: Mismatch modeling and compensation for robust speaker verification. 257-268
Volume 53, Number 3, March 2011
- Eero Väyrynen, Juhani Toivanen, Tapio Seppänen: Classification of emotion in spoken Finnish using vowel-length segments: Increasing reliability with a fusion technique. 269-282
- Koichi Shinoda, Yasushi Watanabe, Kenji Iwata, Yuan Liang, Ryuta Nakagawa, Sadaoki Furui: Semi-synchronous speech and pen input for mobile user interfaces. 283-291
- Bianca Vieru, Philippe Boula de Mareüil, Martine Adda-Decker: Characterisation and identification of non-native French accents. 292-310
- Catherine Mayo, Robert A. J. Clark, Simon King: Listeners' weighting of acoustic cues to synthetic speech naturalness: A multidimensional scaling analysis. 311-326
- Kuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki: Role of modulation magnitude and phase spectrum towards speech intelligibility. 327-339
- Jianfen Ma, Philipos C. Loizou: SNR loss: A new objective measure for predicting the intelligibility of noise-suppressed speech. 340-354
- Stephen So, Kuldip K. Paliwal: Suppressing the influence of additive noise on the Kalman gain for low residual noise speech enhancement. 355-378
- Jean C. Krause, Katherine A. Pelley-Lopez, Morgan P. Tessler: A method for transcribing the manual components of Cued Speech. 379-389
- Jedrzej Kocinski, Pawel Libiszewski, Aleksander Sek: Spatial efficiency of blind source separation based on decorrelation - subjective and objective assessment. 390-402
- Anthony P. Stark, Kuldip K. Paliwal: MMSE estimation of log-filterbank energies for robust speech recognition. 403-416
- Ingrid Hoonhorst, Victoria Medina, Cécile Colin, E. Markessis, Monique Radeau, Paul Deltenre, Willy Serniclaes: Categorical perception of voicing, colors and facial expressions: A developmental study. 417-430
- Rupal Patel, Catherine McNab: Displaying prosodic text to enhance expressive oral reading. 431-441
- Adriana Stan, Junichi Yamagishi, Simon King, Matthew P. Aylett: The Romanian speech synthesis (RSS) corpus: Building a high quality HMM-based speech synthesis system using a high sampling rate. 442-450
Volume 53, Number 4, April 2011
- Wooil Kim, John H. L. Hansen: Variational noise model composition through model perturbation for robust speech recognition with time-varying background noise. 451-464
- Kuldip K. Paliwal, Kamil K. Wójcicki, Benjamin J. Shannon: The importance of phase in speech enhancement. 465-494
- Ching-Ta Lu: Enhancement of single channel speech using perceptual-decision-directed approach. 495-507
- Jie Gao, Qingwei Zhao, Yonghong Yan: Towards precise and robust automatic synchronization of live speech and its transcripts. 508-523
- Tariqullah Jan, Wenwu Wang, DeLiang Wang: A multistage approach to blind separation of convolutive speech mixtures. 524-539
- Phu Ngoc Le, Eliathamby Ambikairajah, Julien Epps, Vidhyasaharan Sethu, Eric H. C. Choi: Investigation of spectral centroid features for cognitive load classification. 540-551
- Milan Legát, Jindrich Matousek, Daniel Tihelka: On the detection of pitch marks using a robust multi-phase algorithm. 552-566
- Gopal Ananthakrishnan, Olov Engwall: Mapping between acoustic and articulatory gestures. 567-589
Volume 53, Number 5, May - June 2011
- Martin Heckmann, Bhiksha Raj, Paris Smaragdis: Preface. 591
- Mathias Dietz, Stephan Dieter Ewert, Volker Hohmann: Auditory model based direction estimation of concurrent speakers from binaural signals. 592-605
- Ron J. Weiss, Michael I. Mandel, Daniel P. W. Ellis: Combining localization cues and source model constraints for binaural source separation. 606-621
- Yan-Chen Lu, Martin Cooke: Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners. 622-642
- Ramin Pichevar, Hossein Najaf-Zadeh, Louis Thibault, Hassan Lahdili: Auditory-inspired sparse representation of audio signals. 643-657
- Jonathan Le Roux, Hirokazu Kameoka, Nobutaka Ono, Alain de Cheveigné, Shigeki Sagayama: Computational auditory induction as a missing-data model-fitting problem with Bregman divergence. 658-676
- Junfeng Li, Shuichi Sakamoto, Satoshi Hongo, Masato Akagi, Yôiti Suzuki: Two-stage binaural speech enhancement with Wiener filter for high-quality speech communication. 677-689
- Jörg-Hendrik Bach, Jörn Anemüller, Birger Kollmeier: Robust speech detection in real acoustic backgrounds with perceptually motivated features. 690-706
- Hui Yin, Volker Hohmann, Climent Nadeu: Acoustic features for speech recognition based on Gammatone filterbank and instantaneous frequency. 707-715
- Yotaro Kubo, Shigeki Okawa, Akira Kurematsu, Katsuhiko Shirai: Temporal AM-FM combination for robust speech recognition. 716-725
- Maria E. Markaki, Yannis Stylianou: Discrimination of speech from nonspeeech in broadcast news based on modulation frequency features. 726-735
- Martin Heckmann, Xavier Domont, Frank Joublin, Christian Goerick: A hierarchical framework for spectro-temporal feature extraction. 736-752
- Bernd T. Meyer, Birger Kollmeier: Robustness of spectro-temporal features against intrinsic and extrinsic variations in automatic speech recognition. 753-767
- Siqing Wu, Tiago H. Falk, Wai-Yip Chan: Automatic speech emotion recognition using modulation spectral features. 768-785
- Francesc Alías, Lluís Formiga, Xavier Llorà: Efficient and reliable perceptual weight tuning for unit-selection text-to-speech synthesis based on active interactive genetic algorithms: A proof-of-concept. 786-800
Volume 53, Number 6, July 2011
- David R. Beukelman, Jana Childes, Tom Carrell, Trisha Funk, Laura J. Ball, Gary L. Pattee: Perceived attention allocation of listeners who transcribe the speech of speakers with amyotrophic lateral sclerosis. 801-806
- Amy Irwin, Michael Pilling, Sharon M. Thomas: An analysis of British regional accent and contextual cue effects on speechreading performance. 807-817
- Stephen So, Kuldip K. Paliwal: Modulation-domain Kalman filtering for single-channel speech enhancement. 818-829
- Florian Müller, Alfred Mertins: Contextual invariant-integration features for improved speaker-independent speech recognition. 830-841
- Yongqiang Feng, Grace J. Hao, Steve A. Xue, Ludo Max: Detecting anticipatory effects in speech articulation by means of spectral coefficient analyses. 842-854
- Thomas Drugman, Baris Bozkurt, Thierry Dutoit: Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation. 855-866
- Alexander M. Goberman, Stephanie Hughes, Todd Haydock: Acoustic characteristics of public speaking: Anxiety and practice effects. 867-876
- Joseph D. W. Stephens, Lori L. Holt: A standard set of American-English voiced stop-consonant stimuli from morphed natural speech. 877-888
- Eren Akdemir, Tolga Çiloglu: Bimodal automatic speech segmentation based on audio and visual information fusion. 889-902
- Garreth Prendergast, Sam R. Johnson, Gary G. R. Green: Extracting amplitude modulations from speech in the time domain. 903-913
- Kai Yu, Heiga Zen, François Mairesse, Steve J. Young: Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis. 914-923
- Kalle J. Palomäki, Guy J. Brown: A computational model of binaural speech recognition: Role of across-frequency vs. within-frequency processing and internal noise. 924-940
- Frank Zimmerer, Mathias Scharinger, Henning Reetz: When BEAT becomes HOUSE: Factors of word final /t/-deletion in German. 941-954
Volume 53, Number 7, September 2011
- Trevor H. Chen, Dominic W. Massaro: Evaluation of synthetic and natural Mandarin visual speech: Initial consonants, single vowels, and syllables. 955-972
- Takashi Nose, Takao Kobayashi: Speaker-independent HMM-based voice conversion using adaptive quantization of the fundamental frequency. 973-985
- Jianxin Peng, Chengxun Bei, Haitao Sun: Relationship between Chinese speech intelligibility and speech transmission index in rooms based on auralization. 986-990
Volume 53, Number 8, October 2011
- Philip N. Garner: Cepstral normalisation and the signal to noise ratio spectrum in automatic speech recognition. 991-1001
- Luis Fernando D'Haro, Ricardo de Córdoba, Rubén San Segundo, Javier Ferreiros, José Manuel Pardo: Design and evaluation of acceleration strategies for speeding up the development of dialog applications. 1002-1025
- Tatyana Polyakova, Antonio Bonafonte: Introducing nativization to Spanish TTS systems. 1026-1041
- Lisa Davidson: Characteristics of stop releases in American English spontaneous speech. 1042-1058
Volume 53, Numbers 9-10, November - December 2011
- Björn W. Schuller, Anton Batliner, Stefan Steidl: Introduction to the special issue on sensing emotion and affect - Facing realism in speech processing. 1059-1061
- Björn W. Schuller, Anton Batliner, Stefan Steidl, Dino Seppi: Recognising realistic emotions and affect in speech: State of the art and lessons learnt from the first challenge. 1062-1087
- Raul Fernandez, Rosalind W. Picard: Recognizing affect from speech prosody using hierarchical graphical models. 1088-1103
- Simon Worgan, Roger K. Moore: Towards the detection of social dominance in dialogue. 1104-1114
- Katherine Forbes-Riley, Diane J. Litman: Benefits and challenges of real-time uncertainty detection and adaptation in a spoken dialogue computer tutor. 1115-1136
- Jaime C. Acosta, Nigel G. Ward: Achieving rapport with turn-by-turn, user-responsive emotional coloring. 1137-1148
- Ammar Mahdhaoui, Mohamed Chetouani: Supervised and semi-supervised infant-directed speech classification for parent-infant interaction analysis. 1149-1161
- Chi-Chun Lee, Emily Mower, Carlos Busso, Sungbok Lee, Shrikanth S. Narayanan: Emotion recognition using a hierarchical binary decision tree approach. 1162-1171
- Marcel Kockmann, Lukás Burget, Jan Cernocký: Application of speaker- and language identification state-of-the-art techniques for emotion recognition. 1172-1185
- Elif Bozkurt, Engin Erzin, Çigdem Eroglu Erdem, A. Tanju Erdem: Formant position based weighted spectral features for emotion recognition. 1186-1197
- Tim Polzehl, Alexander Schmitt, Florian Metze, Michael Wagner: Anger recognition in speech using acoustic and linguistic cues. 1198-1209
- Ramón López-Cózar, Jan Silovský, Martin Kroul: Enhancement of emotion detection in spoken dialogue systems by combining several information sources. 1210-1228
