Python Mel Spectrogram

WavTTS: Towards High-Quality Zero-Shot TTS via Direct Raw Waveform Modeling

WavTTS is an end-to-end zero-shot TTS framework that generates speech directly in the raw waveform space, without relying on intermediate acoustic representations such as mel-spectrograms, VAE latents ...

IEEE

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition

Abstract: Discrete audio representation, also known as audio tokenization, has seen renewed interest driven by its potential to facilitate the application of text language modeling approaches in the audio domain.

IEEE

EDL-Det: A Robust TTS Synthesis Detector Using VGG19-Based YAMNet and Ensemble Learning Block

Abstract: Various audio deepfake synthesis algorithms exist, such as Deep Voice, Tacotron, FastSpeech, and imitation techniques. Despite the existence of various spoofing speech detectors, they are ...

GitHub

WhaleNet (Wavelet Highly Adaptive Learning Ensemble Network)

