
Text to Audio Endpoint

Overview

The Text-to-Audio endpoint generates audio from a text input combined with either a valid audio URL or the voice_id of a pre-created voice. The output is an audio file that mimics the voice in the provided audio or the selected voice.


Sample Generation


Example 1

Prompt

In the ancient land of Eldoria, where the skies were painted with shades of mystic hues and the forests whispered secrets of old, there existed a dragon named Zephyros. Unlike the fearsome tales of dragons that plagued human hearts with terror, Zephyros was a creature of wonder and wisdom, revered by all who knew of his existence.




Request

curl --request POST 'https://modelslab.com/api/v6/voice/text_to_audio' \
--header 'Content-Type: application/json'

Watch the Text to Audio API demo video to see it in action in Postman.

Make a POST request to the https://modelslab.com/api/v6/voice/text_to_audio endpoint and pass the required parameters in the request body.

Body Attributes

| Parameter | Description | Values |
| --- | --- | --- |
| key | The API key used for authenticating your request. | String |
| prompt | The text prompt that describes the audio to be generated. | Text |
| init_audio | A valid URL pointing to the audio file for voice cloning. The file should be 4 to 30 seconds long. | MP3/WAV URL |
| voice_id | (Optional) The ID of a voice from the available list. If provided, the audio will be generated using this voice. | See list of voices |
| language | The language for the voice. Defaults to English if not specified. | english, arabic, spanish, brazilian portugues, german, czech, chinese, dutch, french, hindi, hungarian, italian, japanese, korean, polish, russian, turkish |
| speed | Playback speed of the generated audio. Defaults to 1.0. | Integral value |
| base64 | Indicates whether the input audio file is provided in base64 format. Defaults to "false". | TRUE or FALSE |
| temp | Specifies whether to use temporary links, which are valid for 24 hours. This can help if access to certain storage sites is blocked. Defaults to "false". | TRUE or FALSE |
| webhook | A URL where the API will send a POST request once the audio generation is complete. | URL |
| track_id | An ID returned in the API response, used to identify webhook requests. | Integral value |
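The parameters above can be assembled and checked on the client before the request is sent. The following is a minimal sketch, not part of the API: buildTextToAudioBody is a hypothetical helper, and the client-side checks (non-empty key and prompt, a positive numeric speed) are assumptions about what is worth catching early. The language list and the defaults are copied from the table.

```javascript
// Languages listed in the Body Attributes table.
const SUPPORTED_LANGUAGES = [
  "english", "arabic", "spanish", "brazilian portugues", "german", "czech",
  "chinese", "dutch", "french", "hindi", "hungarian", "italian", "japanese",
  "korean", "polish", "russian", "turkish",
];

// Hypothetical helper (not provided by the API): builds a request body
// and fails early on problems that can be detected client-side.
function buildTextToAudioBody({
  key,
  prompt,
  init_audio,
  voice_id,
  language = "english", // documented default
  speed = 1.0,          // documented default
  webhook = null,
  track_id = null,
} = {}) {
  if (!key) throw new Error("key is required");
  if (!prompt) throw new Error("prompt is required");
  if (!SUPPORTED_LANGUAGES.includes(language)) {
    throw new Error(`unsupported language: ${language}`);
  }
  // Assumption: speed must be a positive finite number.
  if (!Number.isFinite(speed) || speed <= 0) {
    throw new Error(`invalid speed: ${speed}`);
  }

  const body = { key, prompt, language, speed, webhook, track_id };
  // Only include the voice source fields the caller actually supplied.
  if (init_audio) body.init_audio = init_audio;
  if (voice_id) body.voice_id = voice_id;
  return body;
}
```

The returned object can be passed to JSON.stringify and used as the body of the POST request.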

Note: Pass either init_audio or voice_id. If both are passed in the same request, init_audio takes precedence.
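The rule in the note above can be mirrored on the client, for example to log which voice source a request will end up using. This is an illustrative sketch only: resolveVoiceSource is a hypothetical helper, and the "none" result for a request with neither field is an assumption, since the behavior in that case is not described here.

```javascript
// Hypothetical helper: reports which voice source a request body would use.
// init_audio wins when both fields are present, as stated in the note.
function resolveVoiceSource({ init_audio, voice_id } = {}) {
  if (init_audio) return { source: "init_audio", value: init_audio };
  if (voice_id) return { source: "voice_id", value: voice_id };
  // Assumption: neither field supplied; server behavior is not documented here.
  return { source: "none", value: null };
}

const chosen = resolveVoiceSource({
  init_audio: "https://example.com/sample.wav", // placeholder URL
  voice_id: "some_voice",                       // placeholder ID
});
console.log(chosen.source); // prints "init_audio"
```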

Example

Body

{
  "key": "",
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "webhook": null,
  "track_id": null
}

Request

// The request body is sent as JSON.
const myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");

const raw = JSON.stringify({
  "key": "", // your API key
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "webhook": null,
  "track_id": null
});

const requestOptions = {
  method: "POST",
  headers: myHeaders,
  body: raw,
  redirect: "follow"
};

// Log the raw response text, or the error if the request fails.
fetch("https://modelslab.com/api/v6/voice/text_to_audio", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log("error", error));

Response

{
  "status": "success",
  "generationTime": 1.904285192489624,
  "id": 334166,
  "output": [
    "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
  ],
  "proxy_links": [
    "https://cdn2.stablediffusionapi.com/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
  ],
  "meta": {
    "base64": "no",
    "emotion": "Neutral",
    "filename": "b2dff60e-4636-4178-9a72-04a10a309185.wav",
    "input_sound_clip": [
      "tmp/0-b2dff60e-4636-4178-9a72-04a10a309185.wav"
    ],
    "input_text": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
    "language": "english",
    "speed": 1,
    "temp": "no"
  }
}
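A response shaped like the one above can be reduced to a single audio URL. The sketch below is one possible approach, assuming the field names shown in the sample (status, output, proxy_links). extractAudioUrl is a hypothetical helper, and falling back to proxy_links when output is empty is an assumption, not documented behavior. Status values other than "success" are not covered here, so they are all treated as failures.

```javascript
// Hypothetical helper: returns the first usable audio URL from a parsed response.
function extractAudioUrl(response) {
  if (!response || response.status !== "success") {
    const status = response ? response.status : undefined;
    throw new Error(`generation not successful (status: ${status})`);
  }
  // Prefer output; fall back to proxy_links (assumption, see above).
  const candidates = [
    ...(response.output || []),
    ...(response.proxy_links || []),
  ];
  if (candidates.length === 0) {
    throw new Error("no audio URL in response");
  }
  return candidates[0];
}

// Trimmed copy of the sample response above.
const sampleResponse = {
  status: "success",
  id: 334166,
  output: [
    "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav",
  ],
  proxy_links: [
    "https://cdn2.stablediffusionapi.com/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav",
  ],
};

console.log(extractAudioUrl(sampleResponse));
```

To use it with the request example, parse the body with response.json() instead of response.text() and pass the result to extractAudioUrl.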