Publication - Efficient combination of parametric spaces, models and metrics for speaker diarization
RECORD DETAILS

Efficient combination of parametric spaces, models and metrics for speaker diarization

Research area:
    
Type:
Article in proceedings

 

Year: 2007
Authors: Θέμος Σταφυλάκης; Βασίλης Κατσούρος; Γεώργιος Καραγιάννης
Book title: Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2007
Pages: 256-261
Address: Kyoto, Japan
Organization: IEEE
DOI: 10.1109/ASRU.2007.4430120
Abstract:
In this paper we present a method for combining several acoustic parametric spaces, statistical models and distance metrics in the speaker diarization task. Focusing on the post-segmentation part of the problem, we adopt an incremental feature selection and fusion algorithm, based on the Maximum Entropy Principle and the Iterative Scaling Algorithm, that combines several statistical distance measures on speech-chunk pairs. With this approach, we place the chunk-merging clustering process into a probabilistic framework. We also propose a decomposition of the input space according to gender, recording conditions and chunk lengths. The algorithm produced highly competitive results compared to state-of-the-art GMM-UBM methods.