Make sure you add your S3 details for the voice_cloning server so that generated audio is delivered to your bucket. Audio generated without S3 details will be deleted after 24 hours.

Request

Make a POST request to the endpoint below and pass the required parameters in the request body.
curl
curl --request POST 'https://modelslab.com/api/v1/enterprise/voice/music_gen'

Body

json
{
  "key": "enterprise_api_key",
  "prompt": "rock music from the 90s",
  "init_audio": "https://pub-3626123a908346a7a8be8d9295f44e26.r2.dev/generations/2e4c9960-7425-4720-82a1-a373063bf635.wav",
  "sampling_rate": 32000,
  "base64": false,
  "temp": false,
  "webhook": null,
  "track_id": null
}
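As a rough illustration, the same request could be assembled in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_payload`, `build_request`, and `send` are hypothetical, the JSON content type header is an assumption, and the response is assumed to be JSON. `init_audio` is left out when no melody is supplied, on the assumption that the field is optional.

```python
import json
import urllib.request

# Endpoint taken from the request example on this page.
ENDPOINT = "https://modelslab.com/api/v1/enterprise/voice/music_gen"


def build_payload(api_key, prompt, init_audio=None, sampling_rate=32000):
    """Assemble the request body using the field names documented here."""
    payload = {
        "key": api_key,
        "prompt": prompt,
        "sampling_rate": sampling_rate,
        "base64": False,
        "temp": False,
        "webhook": None,
        "track_id": None,
    }
    # Assumption: init_audio is optional, so it is only sent when provided.
    if init_audio is not None:
        payload["init_audio"] = init_audio
    return payload


def build_request(payload):
    """Wrap the payload in a POST request with a JSON content type (assumed)."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(payload, timeout=60):
    """Send the request and decode the reply, assuming the reply is JSON."""
    with urllib.request.urlopen(build_request(payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send(build_payload("enterprise_api_key", "rock music from the 90s"))` would perform the actual POST; building the request separately makes it possible to inspect the body before anything is sent.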

Body Attributes

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| key | string | yes | - | Your API key, used for request authorization. |
| prompt | string | yes | - | The input text for audio generation. |
| init_audio | string | no | - | The conditioning melody for audio generation. |
| sampling_rate | integer | no | 32000 | The sampling rate of the generated audio. Lower bound: 10000. No strict upper bound. |
| max_new_token | integer | no | 512 | The maximum number of new tokens for audio generation. Range: 256 to 1024. |
| base64 | boolean | no | false | Whether the input sound clip is in base64 format. |
| temp | boolean | no | false | Whether you want temporary links (useful if your country blocks access to storage sites). |
| output_format | string | no | wav | The output format of the generated audio. Options: wav, mp3, flac. |
| bitrate | string | no | 128k | The bitrate of the generated audio file. Higher bitrates improve quality but increase file size. |
| webhook | string | no | - | A URL that receives a POST API call once the audio generation is complete. |
| track_id | string | no | - | This ID is returned in the response to the webhook API call and is used to identify the webhook request. |
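The limits listed for the body attributes can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions: `validate_options` is a hypothetical helper, the 256 to 1024 range is treated as inclusive, and the server may apply further checks that are not documented here.

```python
# Output formats listed in the attribute descriptions.
ALLOWED_FORMATS = ("wav", "mp3", "flac")


def validate_options(sampling_rate=32000, max_new_token=512, output_format="wav"):
    """Return a list of problems; an empty list means the values fit the documented limits."""
    problems = []
    # Documented lower bound of 10000, with no strict upper bound.
    if sampling_rate < 10000:
        problems.append("sampling_rate must be at least 10000")
    # Documented range of 256 to 1024, assumed inclusive at both ends.
    if not 256 <= max_new_token <= 1024:
        problems.append("max_new_token must be between 256 and 1024")
    if output_format not in ALLOWED_FORMATS:
        problems.append("output_format must be one of wav, mp3, flac")
    return problems
```

With the documented defaults, `validate_options()` returns an empty list; a value such as `sampling_rate=8000` yields one problem message.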