Make sure you add your S3 details for the voice_cloning server so that generated audio is delivered to your bucket. Audio generated without S3 details will be deleted after 24 hours.
voice_id can be found here.

Request

Make a POST request to the endpoint below and pass the required parameters in the request body.
curl --request POST 'https://modelslab.com/api/v1/enterprise/voice/text_to_audio'
You can pass either init_audio or voice_id. If both are passed at the same time, init_audio takes precedence.
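
The precedence rule above can be sketched on the client side in Python. The helper name `resolve_voice_source` is hypothetical and not part of the API; it only mirrors the stated rule so you can see which field is expected to take effect. Returning `None` when neither field is given is an assumption, since this section does not say how the server handles that case.

```python
from typing import Optional, Tuple


def resolve_voice_source(
    init_audio: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> Optional[Tuple[str, str]]:
    """Return (field_name, value) for the voice source that takes effect.

    Hypothetical helper: mirrors the documented rule that init_audio
    wins when both init_audio and voice_id are supplied.
    """
    if init_audio:
        return ("init_audio", init_audio)
    if voice_id:
        return ("voice_id", voice_id)
    # Neither was supplied; server behaviour for this case is not documented.
    return None
```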

Body

json
{
  "key": "enterprise_api_key",
  "prompt": "Narrative voices capable of pronouncing terminologies & acronyms in training and ai learning materials.",
  "init_audio": "https://pub-f3505056e06f40d6990886c8e14102b2.r2.dev/audio/tom_hanks_1.wav",
  "language": "english",
  "webhook": null,
  "track_id": null
}
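
For readers calling the endpoint from Python rather than curl, here is a minimal sketch that sends a body like the one above using only the standard library. The function names `build_body` and `text_to_audio` are illustrative, the `Content-Type: application/json` header is an assumption (this section does not list required headers), and the response is assumed to be JSON. Its shape is not documented here, so it is returned unchanged.

```python
import json
import urllib.request

ENDPOINT = "https://modelslab.com/api/v1/enterprise/voice/text_to_audio"


def build_body(key, prompt, init_audio=None, voice_id=None,
               language="english", webhook=None, track_id=None):
    """Assemble the request body; init_audio/voice_id are added only if given."""
    body = {
        "key": key,
        "prompt": prompt,
        "language": language,
        "webhook": webhook,
        "track_id": track_id,
    }
    if init_audio is not None:
        body["init_audio"] = init_audio
    if voice_id is not None:
        body["voice_id"] = voice_id
    return body


def text_to_audio(body, timeout=60):
    """POST the body as JSON and return the decoded response (assumed JSON)."""
    request = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `text_to_audio(build_body("your_api_key", "Text to speak", voice_id="..."))` performs the network request; `build_body` on its own has no side effects, which makes it easy to inspect the payload first.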

Body Attributes

key
string
required
Your API Key used for request authorization.
prompt
string
required
Text prompt describing the audio you want to generate.
init_audio
string
A valid audio URL to be voice-cloned. Minimum length: 4 seconds, Maximum length: 30 seconds.
voice_id
string
Optional. ID of a voice from the available list. Find Voice IDs Here.
language
string
default:"english"
The language of the voice. Default: english.
emotion
string
default:"neutral"
The emotional tone of the generated voice. Options: neutral, happy, sad, angry, dull. Default: neutral.
base64
boolean
default:"false"
Whether the input sound clip is in base64 format. Default: false.
temp
boolean
default:"false"
Whether you want temporary links (useful if your country blocks access to storage sites). Default: false.
webhook
string
Provide a URL to receive a POST API call once the audio generation is complete.
track_id
string
This ID is returned in the response to the webhook API call and will be used to identify the request.
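
A client can check a body against the attributes above before sending it. The following is a minimal sketch, not part of the API: `validate_body` is a hypothetical helper, the checks cover only what this section states (required fields, the listed emotion options, boolean flags, and the 4 to 30 second clip length), and the clip duration is assumed to be measured by the caller and passed in separately.

```python
from typing import Any, Dict, List, Optional

EMOTIONS = {"neutral", "happy", "sad", "angry", "dull"}
MIN_CLIP_SECONDS = 4.0
MAX_CLIP_SECONDS = 30.0


def validate_body(body: Dict[str, Any],
                  clip_seconds: Optional[float] = None) -> List[str]:
    """Return a list of problems found; an empty list means no issue was detected."""
    problems: List[str] = []

    # key and prompt are the only attributes marked as required.
    for field in ("key", "prompt"):
        value = body.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field} is required and must be a non-empty string")

    # emotion defaults to "neutral" when omitted.
    emotion = body.get("emotion", "neutral")
    if not isinstance(emotion, str) or emotion not in EMOTIONS:
        problems.append(f"emotion must be one of {sorted(EMOTIONS)}")

    # base64 and temp are boolean flags that default to false.
    for flag in ("base64", "temp"):
        if not isinstance(body.get(flag, False), bool):
            problems.append(f"{flag} must be a boolean")

    # Clip length applies only when init_audio is used and a duration is known.
    if body.get("init_audio") and clip_seconds is not None:
        if not MIN_CLIP_SECONDS <= clip_seconds <= MAX_CLIP_SECONDS:
            problems.append("init_audio must be between 4 and 30 seconds long")

    return problems
```

Returning a list of messages rather than raising on the first failure lets the caller report every problem in one pass.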