Kaushik Ram Sadagopan 71886b2e43
Seamless November release. (#221)
5 months ago

README.md

Convert raw audio into units (unit_extraction)

Raw audio must be converted to units before it can be used to train UnitY models and vocoders. Units act as supervision for UnitY models and are the input to the vocoders, which synthesize speech from them.

The unit extraction pipeline comprises the following steps:

  • Compute features from layer 35 (determined empirically) of the pretrained XLSR v2 model (paper), which is a wav2vec2 model at its core.
  • Assign the features for each timestep to a collection of precomputed K-Means centroids to produce a sequence of units, similar to extracting HuBERT units as described in this paper.
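
The second step above amounts to a nearest-centroid lookup per timestep. The following is a minimal, illustrative sketch in plain Python, not the repository's implementation: the function names `squared_distance` and `assign_units`, the toy 2-dimensional features, and the three centroids are all assumptions made for the example. The actual pipeline operates on XLSR feature tensors and its own precomputed centroids.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_units(
    features: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
) -> List[int]:
    """Map each timestep's feature vector to the index of its nearest centroid.

    The returned indices play the role of the discrete "units".
    (Hypothetical helper for illustration only.)
    """
    units = []
    for frame in features:
        distances = [squared_distance(frame, c) for c in centroids]
        units.append(distances.index(min(distances)))
    return units


# Toy data: 3 centroids and 4 timesteps of 2-dimensional features.
centroids = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
features = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.1], [0.8, 0.7]]

units = assign_units(features, centroids)
print(units)  # [0, 1, 2, 1]
```

Each audio frame thus becomes a single integer, so an utterance is reduced to a sequence of centroid indices.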

Quick start:

audio_to_units is run from the command line as m4t_audio_to_units, from the root directory of the repository.

m4t_audio_to_units <path_to_input_audio>

audio_to_units uses UnitExtractor, which provides a predict method to convert audio to units.

The convenience method resynthesize_audio of UnitExtractor can be used to resynthesize audio waveforms from units.
