A new Pitch Tracking Smoother based on Deep Neural Networks
p. 193-198
Abstracts
This paper presents a new pitch tracking smoother based on deep neural networks (DNN). The proposed system was extensively tested on two reference benchmarks for English and performed very well at correcting the output of pitch detection algorithms.
This contribution presents a tool for smoothing the intonation contour based on deep neural networks. The system was evaluated on two reference corpora, and it performs decidedly well at correcting the errors of several pitch detection algorithms.
Authors
FICLIT, University of Bologna, Italy – lele.ferro4[at]gmail.com
FICLIT, University of Bologna, Italy – fabio.tamburini[at]unibo.it
The text alone may be used under the OpenEdition Books License. All other elements (illustrations, imported attachments) are "All rights reserved" unless otherwise stated.