Text to Audio Endpoint

Overview

The Text to Audio endpoint generates speech from text. Pass the text together with either a valid audio URL (init_audio) or the ID of a pre-created voice (voice_id). The generated audio sounds like the voice in the supplied clip or the selected voice.

Request

--request POST 'https://modelslab.com/api/v6/voice/text_to_audio' \

Watch the Text to Audio API demo video to see it in action in Postman.

Make a POST request to the https://modelslab.com/api/v6/voice/text_to_audio endpoint and pass the required parameters in the request body.

Body Attributes

| Parameter | Description | Values |
| --- | --- | --- |
| key | Your API Key used for request authorization | string |
| prompt | Text prompt describing the audio you want to generate | text |
| init_audio | A valid URL of the audio you want to use for voice cloning. | MP3/WAV URL (max 30 seconds) |
| voice_id | Optional. A valid ID from the list of available voices. | See list of voices |
| language | The language of the voice. Defaults to English. | english, arabic, spanish, german, czech, chinese, dutch, french, hindi, hungarian, italian, japanese, korean, polish, russian, turkish |
| emotion | The desired emotion for the voice. Defaults to neutral. | neutral, happy, sad, angry, dull |
| base64 | Whether the input sound clip is in base64 format. Defaults to false. | TRUE or FALSE |
| temp | Whether you want temporary links, useful if your country blocks access to certain storage sites. Defaults to false. | TRUE or FALSE |
| webhook | URL to receive a POST API call once the audio generation is complete. | URL |
| track_id | ID returned in the response for the webhook API call, used to identify the webhook request. | integral value |

Note: You can pass either init_audio or voice_id. If both are passed in the same request, init_audio takes precedence.
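The rule in the note can be applied on the client before sending the request. The following is a minimal sketch, not part of the ModelsLab API: buildTextToAudioBody and its camelCase option names are hypothetical, and the check that at least one voice source is present is an assumption based on the note above rather than documented server behavior.

```javascript
// Hypothetical helper (not part of the ModelsLab API): builds a request body
// that carries exactly one voice source.
function buildTextToAudioBody({ key, prompt, initAudio, voiceId, language = "english" }) {
  if (!key || !prompt) {
    throw new Error("key and prompt are required");
  }
  if (!initAudio && !voiceId) {
    // Assumption: the endpoint needs at least one of init_audio or voice_id.
    throw new Error("pass either initAudio or voiceId");
  }

  const body = { key, prompt, language };

  // init_audio takes precedence when both are given, so only one field is sent.
  if (initAudio) {
    body.init_audio = initAudio;
  } else {
    body.voice_id = voiceId;
  }
  return body;
}
```

The returned object can be passed to JSON.stringify and used as the body of the POST request. Dropping voice_id when init_audio is present makes the precedence explicit in the payload instead of relying on the server to resolve it.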

Example

Body

{
  "key": "",
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "webhook": null,
  "track_id": null
}

Request

// Send the body as JSON.
const myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");

// Request body: set "key" to your API key before sending.
const raw = JSON.stringify({
  "key": "",
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "webhook": null,
  "track_id": null
});

const requestOptions = {
  method: 'POST',
  headers: myHeaders,
  body: raw,
  redirect: 'follow'
};

// POST to the endpoint and log the raw response text.
fetch("https://modelslab.com/api/v6/voice/text_to_audio", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));

Response

{
  "status": "success",
  "generationTime": 1.904285192489624,
  "id": 334166,
  "output": [
    "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
  ],
  "proxy_links": [
    "https://cdn2.stablediffusionapi.com/generations/b2dff60e-4636-4178-9a72-04a10a309185.wav"
  ],
  "meta": {
    "base64": "no",
    "emotion": "Neutral",
    "filename": "b2dff60e-4636-4178-9a72-04a10a309185.wav",
    "input_sound_clip": [
      "tmp/0-b2dff60e-4636-4178-9a72-04a10a309185.wav"
    ],
    "input_text": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
    "language": "english",
    "speed": 1,
    "temp": "no"
  }
}
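The response above lists the generated file under both output and proxy_links. As a rough sketch of how a client might read it, the hypothetical helper below (pickAudioUrl is not part of the API) takes an already parsed response object and returns one URL. It assumes that any status other than "success" should be treated as a failure, since only the success shape is shown here.

```javascript
// Hypothetical helper: choose a download URL from a parsed response object.
function pickAudioUrl(response, { preferProxy = false } = {}) {
  // Assumption: anything other than "success" is handled as an error.
  if (!response || response.status !== "success") {
    throw new Error(`unexpected status: ${response && response.status}`);
  }

  // Tolerate either list being absent from the response.
  const output = Array.isArray(response.output) ? response.output : [];
  const proxy = Array.isArray(response.proxy_links) ? response.proxy_links : [];

  // Try the preferred list first, then fall back to the other one.
  const [first, second] = preferProxy ? [proxy, output] : [output, proxy];
  const url = first[0] ?? second[0];
  if (!url) {
    throw new Error("response contains no audio URL");
  }
  return url;
}
```

With the fetch example, this would mean parsing the body with response.json() instead of response.text() and passing the result to pickAudioUrl.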