Speech To Text - ModelsLab

Request

Make a POST request to the endpoint below and pass the required parameters in the request body.

curl

curl --request POST 'https://modelslab.com/api/v1/enterprise/voice/speech_to_text'

Body

json

{
    "key": "enterprise_api_key",
    "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
    "language": "en",
    "timestamp_level": null,
    "webhook": null,
    "track_id": null
}
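
As a rough illustration of the request above, the following Python sketch builds the same JSON body and sends it with the standard library. The helper names `build_payload` and `transcribe` are hypothetical, the `Content-Type: application/json` header is an assumption (this section does not state the required headers), and the response is returned unparsed beyond JSON decoding because its schema is not described here.

```python
import json
import urllib.request

ENDPOINT = "https://modelslab.com/api/v1/enterprise/voice/speech_to_text"


def build_payload(key, init_audio, language="en", timestamp_level=None,
                  webhook=None, track_id=None):
    """Assemble the request body using the field names from the example body."""
    return {
        "key": key,
        "init_audio": init_audio,
        "language": language,
        "timestamp_level": timestamp_level,
        "webhook": webhook,
        "track_id": track_id,
    }


def transcribe(payload, timeout=60):
    """POST the payload as JSON and return the decoded JSON response.

    Assumption: the endpoint accepts a JSON body with a Content-Type header
    and answers with JSON; neither detail is confirmed by this section.
    """
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call might look like `transcribe(build_payload("enterprise_api_key", "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav"))`, with a real API key substituted for the placeholder.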

Body Attributes

key

string

required

The API key required to authorize the request.

init_audio

string

required

The URL of the audio file to be transcribed.
Supported formats: WAV, MP3, FLAC, OPUS.
Duration limits: minimum 5 seconds, maximum 1 hour.
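
The format and duration constraints above can be checked on the client before a request is sent. This is only a sketch: `audio_problems` is a hypothetical helper, it assumes the file extension in the URL reflects the actual audio format, it assumes the caller already knows the duration in seconds, and it treats both limits as inclusive, which this section does not specify.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".opus"}
MIN_SECONDS = 5          # documented minimum duration
MAX_SECONDS = 60 * 60    # documented maximum duration (1 hour)


def audio_problems(url, duration_seconds):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Assumption: the URL path ends in an extension that names the format.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported or missing extension: {suffix!r}")
    if duration_seconds < MIN_SECONDS:
        problems.append("audio is shorter than 5 seconds")
    elif duration_seconds > MAX_SECONDS:
        problems.append("audio is longer than 1 hour")
    return problems
```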

language

string

default:"en"

The language code of the audio content in ISO 639-1 format. Examples: en (English), es (Spanish), fr (French).

timestamp_level

string

The level of detail for timestamps in the transcription. Options: word, sentence, or null (no timestamps). Default: null.

webhook

string

A URL to receive a POST request once the transcription is complete.

track_id

integer

An ID included in the webhook response to identify the request.
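
To show how `webhook` and `track_id` could fit together, here is a minimal receiver sketch built on Python's standard `http.server`. The shape of the webhook payload is not documented in this section, so the code assumes a JSON object with a top-level `track_id` field; `parse_webhook_body`, `WebhookHandler`, and `serve` are hypothetical names, and the port is arbitrary.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_webhook_body(raw):
    """Decode a webhook POST body and pull out track_id, if present.

    Assumption: the body is a JSON object carrying a top-level "track_id".
    """
    payload = json.loads(raw.decode("utf-8"))
    return payload.get("track_id"), payload


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            track_id, payload = parse_webhook_body(self.rfile.read(length))
        except (ValueError, AttributeError):
            # Body was not valid UTF-8 JSON, or was not a JSON object.
            self.send_response(400)
            self.end_headers()
            return
        print(f"transcription finished for track_id={track_id}: {payload}")
        self.send_response(200)
        self.end_headers()


def serve(port=8000):
    """Block forever, handling webhook POSTs on the given port."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` and passing the server's public URL as `webhook` would let each callback be matched to its originating request through `track_id`.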

Languages Supported

Whisper supports the languages listed below, but transcription accuracy may vary with factors such as limited training data, script complexity, and regional dialects.

"Afrikaans": "af",
"Arabic": "ar",
"Belarusian": "be",
"Bengali": "bn",
"Bulgarian": "bg",
"Chinese": "zh",
"Czech": "cs",
"Danish": "da",
"Dutch": "nl",
"English": "en",
"Finnish": "fi",
"French": "fr",
"German": "de",
"Greek": "el",
"Hebrew": "he",
"Hindi": "hi",
"Hungarian": "hu",
"Indonesian": "id",
"Italian": "it",
"Japanese": "ja",
"Kannada": "kn",
"Korean": "ko",
"Malayalam": "ml",
"Marathi": "mr",
"Nepali": "ne",
"Panjabi": "pa",
"Persian": "fa",
"Polish": "pl",
"Portuguese": "pt",
"Romanian": "ro",
"Russian": "ru",
"Serbian": "sr",
"Spanish": "es",
"Swedish": "sv",
"Tagalog": "tl",
"Tamil": "ta",
"Telugu": "te",
"Thai": "th",
"Turkish": "tr",
"Ukrainian": "uk",
"Urdu": "ur",
"Vietnamese": "vi",
"Welsh": "cy"
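
The list above can be turned into a client-side check on the `language` parameter. This sketch copies the codes from the list; `normalize_language` is a hypothetical helper, and whether the API itself rejects unlisted codes is not stated in this section.

```python
# Codes copied from the supported-languages list, keyed by ISO 639-1 code.
SUPPORTED_LANGUAGES = {
    "af": "Afrikaans", "ar": "Arabic", "be": "Belarusian", "bn": "Bengali",
    "bg": "Bulgarian", "zh": "Chinese", "cs": "Czech", "da": "Danish",
    "nl": "Dutch", "en": "English", "fi": "Finnish", "fr": "French",
    "de": "German", "el": "Greek", "he": "Hebrew", "hi": "Hindi",
    "hu": "Hungarian", "id": "Indonesian", "it": "Italian", "ja": "Japanese",
    "kn": "Kannada", "ko": "Korean", "ml": "Malayalam", "mr": "Marathi",
    "ne": "Nepali", "pa": "Panjabi", "fa": "Persian", "pl": "Polish",
    "pt": "Portuguese", "ro": "Romanian", "ru": "Russian", "sr": "Serbian",
    "es": "Spanish", "sv": "Swedish", "tl": "Tagalog", "ta": "Tamil",
    "te": "Telugu", "th": "Thai", "tr": "Turkish", "uk": "Ukrainian",
    "ur": "Urdu", "vi": "Vietnamese", "cy": "Welsh",
}


def normalize_language(code):
    """Return the lowercase code if it is in the list, else raise ValueError."""
    normalized = code.strip().lower()
    if normalized not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return normalized
```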

