Audio samples from Team Lab Phonetics 2021, TTS with Prosody Control

Thomas Bott and Sebastian Sammet, University of Stuttgart, Supervisor: Florian Lux

Baseline Model trained on ~19h of Audio from the Blizzard Challenge 2013

"Can I help you with your project?"

[No prosody control]	-

Prosody Control Model trained on ~19h of Audio from the Blizzard Challenge 2013. The Model is conditioned on 7 Prosodic Parameters: Duration, Average Pitch, Minimum Pitch, Maximum Pitch, Average Energy, Minimum Energy, and Maximum Energy.
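To make the seven conditioning parameters concrete, here is a minimal sketch of how they could be computed for one utterance from frame-level pitch and energy contours. It is an illustration only, not the project's actual feature extraction: the names `ProsodyFeatures` and `extract_prosody`, the frame shift of 12.5 ms, and the convention that unvoiced frames carry a pitch of 0 Hz are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ProsodyFeatures:
    """The seven utterance-level parameters named above (hypothetical container)."""
    duration: float
    avg_pitch: float
    min_pitch: float
    max_pitch: float
    avg_energy: float
    min_energy: float
    max_energy: float

    def as_vector(self) -> List[float]:
        # Fixed order, so the vector could be fed to a model as conditioning input.
        return [
            self.duration,
            self.avg_pitch, self.min_pitch, self.max_pitch,
            self.avg_energy, self.min_energy, self.max_energy,
        ]


def extract_prosody(
    pitch_hz: Sequence[float],
    energy: Sequence[float],
    frame_shift_s: float = 0.0125,  # assumed frame shift, not taken from the source
) -> ProsodyFeatures:
    """Summarise per-frame pitch and energy contours into seven scalars."""
    if len(pitch_hz) == 0 or len(pitch_hz) != len(energy):
        raise ValueError("pitch and energy must be non-empty and equally long")

    # Assumption: unvoiced frames are marked with 0 Hz and are ignored for pitch.
    voiced = [p for p in pitch_hz if p > 0.0]
    if not voiced:
        voiced = [0.0]

    return ProsodyFeatures(
        duration=len(pitch_hz) * frame_shift_s,
        avg_pitch=sum(voiced) / len(voiced),
        min_pitch=min(voiced),
        max_pitch=max(voiced),
        avg_energy=sum(energy) / len(energy),
        min_energy=min(energy),
        max_energy=max(energy),
    )
```

The samples on this page use control values between -1.0 and 1.0, so raw statistics like these would presumably be normalised before being passed to the model; how that normalisation is done is not stated here.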

"Can I help you with your project?"

Prosodic Parameter	-1.0	-0.8	-0.6	-0.4	-0.2	0.0	0.2	0.4	0.6	0.8	1.0
Duration
Pitch
Pitch Range (narrows as the value increases)
Energy
Energy Range (narrows as the value increases)