scorers

71886b2e43

Seamless November release. (#221)

5 months ago

README.md

d64d1b738a

Update README.md (#262)

5 months ago

__init__.py

71886b2e43

Seamless November release. (#221)

5 months ago

evaluate.py

6ab3787931

Introduce expressivity_predict, and change pretssel_inference to expressivity_evaluate. (#251)

5 months ago

You have to be logged in to leave a comment.

Evaluating SeamlessStreaming and Seamless models

SeamlessStreaming is the streaming only model and Seamless is the expressive streaming model.

Quick start:

Evaluation can be run with the streaming_evaluate CLI.

We use the seamless_streaming_unity for loading the speech encoder and T2U models, and seamless_streaming_monotonic_decoder for loading the text decoder for streaming evaluation. This is already set as defaults for the streaming_evaluate CLI, but can be overridden using the --unity-model-name and --monotonic-decoder-model-name args if required.

Note that the numbers in our paper use single precision floating point format (fp32) for evaluation by setting --dtype fp32. Also note that the results from running these evaluations might be slightly different from the results reported in our paper (which will be updated soon with the new results).

S2TT:

Set the task to s2tt for evaluating the speech-to-text translation part of the SeamlessStreaming model.

streaming_evaluate --task s2tt --data-file <path_to_data_tsv_file> --audio-root-dir <path_to_audio_root_directory> --output <path_to_evaluation_output_directory> --tgt-lang <3_letter_lang_code>

Note: The --ref-field can be used to specify the name of the reference column in the dataset.

ASR:

Set the task to asr for evaluating the automatic speech recognition part of the SeamlessStreaming model. Make sure to pass the source language as the --tgt-lang arg.

streaming_evaluate --task asr --data-file <path_to_data_tsv_file> --audio-root-dir <path_to_audio_root_directory> --output <path_to_evaluation_output_directory> --tgt-lang <3_letter_source_lang_code>

S2ST:

SeamlessStreaming:

Set the task to s2st for evaluating the speech-to-speech translation part of the SeamlessStreaming model.

streaming_evaluate --task s2st --data-file <path_to_data_tsv_file> --audio-root-dir <path_to_audio_root_directory> --output <path_to_evaluation_output_directory> --tgt-lang <3_letter_lang_code>

Seamless:

The Seamless model is a unified model for streaming expressive speech-to-speech translation. Use the --expressive arg for running evaluation of this unified model.

streaming_evaluate --task s2st --data-file <path_to_data_tsv_file> --audio-root-dir <path_to_audio_root_directory> --output <path_to_evaluation_output_directory> --tgt-lang <3_letter_lang_code> --expressive --gated-model-dir <path_to_vocoder_checkpoints_dir>

The Seamless model uses vocoder_pretssel which is a 24KHz version (vocoder_pretssel) by default. In the current version of our paper, we use 16KHz version (vocoder_pretssel_16khz) for the evaluation, so in order to reproduce those results please add this arg to the above command: --vocoder-name vocoder_pretssel_16khz.

vocoder_pretssel or vocoder_pretssel_16khz checkpoints are gated, please check out this section to acquire these checkpoints. Also, make sure to add --gated-model-dir <path_to_vocoder_checkpoints_dir>

Tip!

Press p or to see the previous file or, n or to see the next file

README.md

Evaluating SeamlessStreaming and Seamless models

Quick start:

S2TT:

ASR:

S2ST:

SeamlessStreaming:

Seamless:

Comments

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

DagsHub-Science / seamless_communication mirror of https://github.com/facebookresearch/seamless_communication

README.md

Evaluating SeamlessStreaming and Seamless models

Quick start:

S2TT:

ASR:

S2ST:

SeamlessStreaming:

Seamless:

Comments

Use Google Cloud Storage!

Specify your Google Storage bucket

Service Account Key

Congratulations!

Use AWS S3 as storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use any S3 compatible storage!

Specify your S3 bucket

Access key (If needed)

Congratulations!

Use Azure Cloud Storage!

Specify your Azure Storage bucket

Access key (If needed)

Congratulations!

DagsHub-Science
/
seamless_communication
mirror of https://github.com/facebookresearch/seamless_communication