Browsing by Author "Málek, Jiří"
- Automatic classifiers for medical data from doppler unit (Czech Technical University, 2007-01-01) Málek, Jiří; Nouza, Jan; Klimovič, Tomáš
  Nowadays, hand-held ultrasonic Doppler units are often used for noninvasive screening of atherosclerosis in arteries of the lower limbs. The mean velocity of blood flow over time and blood pressures are measured at several positions on each lower limb. This project presents software that is able to analyze such data and classify it in real time into selected diagnostic classes. It is also capable of flagging some errors encountered during measuring. At the Department of Functional Diagnostics in the Regional Hospital of Liberec, a database of several hundred signals was collected. In cooperation with a specialist, the signals were manually classified into four classes. Subsequently, selected signal features were extracted and used for training a distance classifier and a Bayesian classifier. Another set of signals was used for evaluating and optimizing the parameters of the classifiers. This paper compares the results of the software with those provided by a human expert. They agreed in 89 % of cases.
- Blind audio source separation via independent component analysis (Technická Univerzita v Liberci, 2010-01-01) Málek, Jiří
- Block-online multi-channel speech enhancement using deep neural network-supported relative transfer function estimates (Institution of Engineering and Technology, 2020-05-01) Málek, Jiří; Koldovský, Zbyněk; Boháč, Marek
  This work addresses the problem of block-online processing for multi-channel speech enhancement. Such processing is vital in scenarios with moving speakers and/or when short utterances are processed, e.g. in voice assistant applications. We consider several variants of a system that performs beamforming supported by deep neural network-based voice activity detection, followed by post-filtering. The speaker is targeted by estimating relative transfer functions between microphones. Each block of the input signals is processed independently to make the method applicable in highly dynamic environments. Because the processed blocks are short, the statistics required by the beamformer are estimated less precisely. The influence of this inaccuracy is studied and compared to the batch processing regime, in which each recording is treated as one block. The experimental evaluation is performed on the large CHiME-4 datasets and on another dataset featuring a moving target speaker. The experiments are evaluated in terms of objective and perceptual criteria. Moreover, the word error rate (WER) of a speech recognition system, for which the method serves as a front-end, is evaluated. The results indicate that the proposed method is robust to short lengths of the processed block. Significant improvements in terms of the criteria and WER are observed even for a block length of 250 ms.
- Compensation of real-world distortions in speech signals (2023-01-01) Málek, Jiří
  This habilitation thesis focuses on the compensation of various distortions encountered in real-world speech recordings. The thesis is organized as a collection of articles on this problem, published by the author between 2013 and 2022. The manuscripts were created as the output of several consecutive research projects funded by the GAČR and TAČR funding agencies. The articles follow three research topics. The main topic is the extraction of a target speaker from a mixture of several sound sources. The second topic is robust automatic speech recognition, where transcription can be complicated by unwanted sounds in the speech recording or by an insufficient amount of suitable training data. Finally, the compensation of nonlinear distortions in acoustic echo cancellation is addressed.
- EXTRACTION OF INDEPENDENT VECTOR COMPONENT FROM UNDERDETERMINED MIXTURES THROUGH BLOCK-WISE DETERMINED MODELING (2019-05-01) Koldovský, Zbyněk; Málek, Jiří; Janský, Jakub
  We propose a new model for blind source extraction where the source of interest is assumed to be static while the background noise is dynamic. The model is determined within short blocks (the number of sources equals the number of sensors); however, the noise subspace can change from block to block. We propose a gradient-based algorithm, based on the maximum quasi-likelihood principle, that jointly extracts an independent vector component from a set of mixtures obeying the model. Simulations confirm the validity of the approach, and experiments with real-world recordings show promising results.
- Měření průběhu řezných sil a teplot soustružení přerušovaným řezem [Measurement of cutting forces and temperatures in interrupted-cut turning] (1973-01-01) Málek, Jiří
- On Practical Aspects of Multi-condition Training Based on Augmentation for Reverberation-/Noise-Robust Speech Recognition (Springer Verlag, 2019-01-01) Málek, Jiří; Ždánský, Jindřich
  Multi-condition training achieved through data augmentation is among the most successful techniques for noise/reverberation-robust automatic speech recognition (ASR). Its basic principle, i.e., generation of artificially distorted speech signals, is well documented in the literature. However, the specific choice of hyper-parameters for the generation process and its influence on the results of the subsequent ASR is usually not discussed in detail. Often, it is simply assumed that the augmentation should include as many acoustic conditions as possible. When designed in this broad manner, the computational/storage demands of the augmentation process grow rapidly. In this paper, we instead aim for careful selection of a limited number of acoustic conditions that are highly relevant with respect to the target environment. In this manner, we keep the computational requirements feasible, while retaining the improved accuracy of the augmented models. We experimentally analyze two augmentation scenarios and draw conclusions regarding suitable setup choices. The first case concerns augmentation for reverberation-robust ASR. We propose to exploit Clarity C50 as a feature for selecting the Room Impulse Responses (RIRs) crucial for the augmentation. We show that mismatches in other RIR-related parameters, such as Reverberation Time T60 or the room dimension, have little influence on ASR accuracy, as long as the training-test conditions are matched from the C50 perspective. Subsequently, we investigate the augmentation for noise-reverberation-robust ASR. We discuss the selection of Signal-to-Noise Ratio (SNR), the type of noise, and the reverberation level of speech. We observe the influence of mismatches in these parameters on the ASR accuracy. © 2019, Springer Nature Switzerland AG.
- Robustní odhad odstupu řeči od šumu pomocí hlubokých neuronových sítí [Robust estimation of the speech-to-noise ratio using deep neural networks] (Technická Univerzita v Liberci, 2016-01-01) Mužíček, Michal; Málek, Jiří
  The thesis deals with building a neural network that is able to estimate where speech occurs in a speech recording despite the presence of diverse noise. A database of additive mixtures of noise and clean speech recordings serves as the input data for training the neural network. The data processed by the neural network are then passed to an algorithm that computes an estimate of the speech-to-noise ratio. The correctness of the proposed algorithm's output is evaluated by comparison with the competing WADA method. The resulting values indicate that using neural networks for speech presence detection and subsequent SNR estimation is a realistic alternative to existing methods.
- Zlepšování řečových nahrávek pořízených v reálném prostředí [Enhancement of speech recordings acquired in real-world environments] (Technická Univerzita v Liberci, 2016-01-01) Kounovský, Tomáš; Málek, Jiří
  This thesis deals with enhancing real-world speech recordings by removing unwanted noise in the log-power spectral domain using deep neural networks. A large database of clean speech recordings is additively degraded by several types of noise at various SNR levels and serves as the training set for a feed-forward deep neural network. Two networks are trained on differently composed training sets so that they learn the relationship between noisy and clean speech. Their ability to remove noise from recordings is tested in both known and unknown noise conditions and compared with several conventional methods. The networks are also tested in unknown speech conditions using a small test database of Czech speech. The results show that the network trained on the database containing more types of noise is more robust when enhancing speech degraded by an unknown type of noise. At the same time, this network is able to compete with conventional methods in removing stationary noise and to outperform them in removing strongly non-stationary noise.