Database of speech recordings for comparative analysis of multi-language phonemes

Mariusz Mąsior, Magdalena Igras, Mariusz Ziółko, Stanisław Kacprzak


The paper presents a system for collecting and analyzing multi-language speech samples, intended for research on the characteristics of phonemes in several hundred of the world's languages. We describe its implementation, which consists of a database and a web page. We also present the content and structure of the database and its applications in developing new methods of speech analysis.


Keywords: multi-language speech recordings; speech analysis

