Speech annotator

AGH Annotator is a program for preparing collections of speech recordings for use in speech corpora. Such corpora must include not only audio files but also the corresponding transcripts, together with the times at which units such as sentences, words or phonemes occur. For this reason, speech corpora are costly and time-consuming to prepare.


The program allows corpora to be generated quickly and inexpensively. This can be achieved, for example, by using existing audiobooks and their text versions, documents, or recordings of lectures, speeches, etc. Of course, nothing prevents users from making their own recordings and annotating them with the program. However, the creators' intention was to avoid the costs of paying speakers, purchasing recording equipment, etc. In addition, the program is designed so that the user can do the annotation with as few clicks as possible, and therefore as quickly as possible. In the work carried out, annotating 1 minute of the corpus took 17 minutes on average. This is a very good result, since the literature estimates this time at 20 to 40 minutes. The program exports data as standard MLF files, which are used in many speech technology systems, including the best known one, HTK.
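
To illustrate what a time-aligned transcript in MLF form looks like, here is a minimal Python sketch. It is not the annotator's own export code. It assumes the usual HTK Master Label File layout: a `#!MLF!#` header, a quoted label-file pattern per recording, one `start end label` line per unit with times in 100 ns units, and a closing `.` line. The function names and the sample recording name and words are hypothetical.

```python
def seconds_to_htk(t):
    """Convert seconds to HTK time units (100 ns each)."""
    return int(round(t * 10_000_000))


def format_mlf(annotations):
    """Build MLF text from {recording_name: [(start_s, end_s, label), ...]}.

    The layout follows the common HTK convention; the exact files written
    by the annotator may differ in detail (an assumption of this sketch).
    """
    lines = ["#!MLF!#"]
    for name, segments in annotations.items():
        # Pattern matching the label file for this recording
        lines.append(f'"*/{name}.lab"')
        for start, end, label in segments:
            lines.append(f"{seconds_to_htk(start)} {seconds_to_htk(end)} {label}")
        # A single dot terminates each recording's entry
        lines.append(".")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical word-level annotation of one short recording
    sample = {"rec01": [(0.0, 0.45, "ala"), (0.45, 1.2, "ma")]}
    print(format_mlf(sample), end="")
```

The same structure works for sentence or phoneme level annotations; only the labels and segment boundaries change.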

A paper and an engineering thesis (in Polish) describe the annotator.

The authors invite people interested in audio processing technologies to contact the spin-off company techmo.pl.

Copyright © Zespół Przetwarzania Sygnałów AGH 2011-2014