Browsing by Author "Matějů Lukáš"
Now showing 1 - 10 of 10
- An Approach to Online Speaker Change Point Detection Using DNNs and WFSTs (ISCA, 2019-01-01) Matějů Lukáš; Červa Petr; Žďánský Jindřich
- Automatic Development of ASR System for an Under-Resourced Language (IEEE, 2018-01-01) Šafařík Radek; Matějů Lukáš
- Automatic Syllabification and Syllable Timing of Automatically Recognized Speech - for Czech (Springer International Publishing, 2016-01-01) Boháč Marek; Matějů Lukáš; Rott Michal; Šafařík Radek
- The Impact of Inaccurate Phonetic Annotations on Speech Recognition Performance (Springer Verlag, 2017-01-01) Šafařík Radek; Matějů Lukáš
- Impact of phonetic annotation precision on automatic speech recognition systems (Institute of Electrical and Electronics Engineers Inc., 2016-01-01) Šafařík Radek; Matějů Lukáš
- The Influence of Errors in Phonetic Annotations on Performance of Speech Recognition System (Springer Verlag, 2018-01-01) Šafařík Radek; Matějů Lukáš; Weingartová Lenka
- Investigation into the use of deep neural networks for LVCSR of Czech (IEEE, 2015-01-01) Matějů Lukáš; Červa Petr; Žďánský Jindřich
- Investigation into the Use of WFSTs and DNNs for Speech Activity Detection in Broadcast Data Transcription (Springer Verlag, 2017-01-01) Matějů Lukáš; Červa Petr; Žďánský Jindřich
- Speech Activity Detection in Online Broadcast Transcription Using Deep Neural Networks and Weighted Finite State Transducers (Institute of Electrical and Electronics Engineers Inc., 2017-01-01) Matějů Lukáš; Červa Petr; Žďánský Jindřich; Málek Jiří
  In this paper, a new approach to online Speech Activity Detection (SAD) is proposed. This approach is designed for use in a system that carries out 24/7 transcription of radio/TV broadcasts containing a large amount of non-speech segments, such as advertisements or music. To improve the robustness of detection, we adopt Deep Neural Networks (DNNs) trained on artificially created mixtures of speech and non-speech signals at desired levels of signal-to-noise ratio (SNR). An integral part of our approach is an online decoder based on Weighted Finite State Transducers (WFSTs); this decoder smooths the output from the DNN. The employed transduction model is context-based, i.e., both speech and non-speech events are modeled using sequences of states. The presented experimental results show that our approach yields state-of-the-art results on the standardized QUT-NOISE-TIMIT data set for SAD and, at the same time, is capable of a) operating with low latency and b) reducing the computational demands and error rate of the target transcription system. (A minimal illustrative sketch of the SNR-controlled mixing step appears after this list.)
- Study on the use of deep neural networks for speech activity detection in broadcast recordings (SciTePress, 2016-01-01) Matějů Lukáš; Červa Petr; Žďánský Jindřich
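
The abstract above mentions DNNs trained on artificially created mixtures of speech and non-speech signals at desired SNR levels. Below is a minimal sketch of how such a mixture can be produced, assuming both signals are 1-D NumPy arrays of equal length; the function name and interface are illustrative and not taken from the cited paper.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Illustrative helper (not from the cited paper): scale `noise` so that
    the speech-to-noise power ratio equals `snr_db`, then add it to `speech`.
    Both inputs are 1-D float arrays of equal length."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that raises/lowers the noise to the target level relative to the speech:
    # 10 * log10(speech_power / (gain**2 * noise_power)) == snr_db
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

# Hypothetical usage: mix a speech signal with music at 5 dB SNR.
# mixture = mix_at_snr(speech, music, snr_db=5.0)
```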