Text-to-speech System With Variable Frame Rate


Abstract

A neural text-to-speech (TTS) system is trained to generate key acoustic frames at a variable rate while omitting the remaining frames. Which frames are skipped depends on the acoustic features to be generated for the input text. The TTS system can then interpolate frames between the key frames at a target rate so that a vocoder can synthesize audio samples.
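The interpolation step can be illustrated with a minimal sketch. The abstract does not say which interpolation method is used, so linear interpolation is assumed here purely for illustration; the function name `interpolate_key_frames`, its parameters, and the representation of key frames as timestamped feature vectors are likewise hypothetical and not taken from the patent.

```python
import numpy as np


def interpolate_key_frames(key_times, key_frames, target_rate_hz):
    """Resample sparse key frames onto a uniform grid at target_rate_hz.

    ASSUMPTION: linear interpolation per feature dimension. The source
    abstract only says frames are interpolated "at a target rate".

    key_times:  shape (K,), strictly increasing timestamps in seconds.
    key_frames: shape (K, D), one acoustic feature vector per key frame.
    Returns (grid, dense) with shapes (N,) and (N, D).
    """
    key_times = np.asarray(key_times, dtype=float)
    key_frames = np.asarray(key_frames, dtype=float)
    if key_frames.ndim != 2 or key_frames.shape[0] != key_times.shape[0]:
        raise ValueError("key_frames must have shape (len(key_times), D)")
    if np.any(np.diff(key_times) <= 0):
        raise ValueError("key_times must be strictly increasing")

    step = 1.0 / target_rate_hz
    # Number of uniform frames that fit between the first and last key frame
    # (small epsilon guards against floating-point round-off in the division).
    n_frames = int(np.floor((key_times[-1] - key_times[0]) / step + 1e-9)) + 1
    grid = key_times[0] + step * np.arange(n_frames)

    # np.interp works on 1-D data, so interpolate each feature dimension.
    dense = np.stack(
        [np.interp(grid, key_times, key_frames[:, d])
         for d in range(key_frames.shape[1])],
        axis=1,
    )
    return grid, dense


# Three key frames at irregular times, resampled to 100 frames per second.
grid, dense = interpolate_key_frames(
    key_times=[0.0, 0.03, 0.05],
    key_frames=[[0.0, 10.0], [3.0, 7.0], [5.0, 5.0]],
    target_rate_hz=100,
)
```

In this example the three irregularly spaced key frames become six uniformly spaced frames, which is the kind of fixed-rate input a vocoder would typically expect. A real system might use a learned or higher-order interpolator instead.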

Full Text

What is claimed is:

Timeline

Filed

02/19/2026

Published

06/25/2026

Granted

Not Available

IPC Codes (2)

G10L 13/047: Architecture of speech synthesisers

G10L 13/06: Elementary speech units used in speech synthesisers; Concatenation rules