POST /voice/song_generator
Generate song from lyrics and reference audio
curl --request POST \
  --url https://modelslab.com/api/v6/voice/song_generator \
  --header 'Content-Type: application/json' \
  --data '
{
  "key": "<string>",
  "init_audio": "<string>",
  "lyrics_generation": true,
  "lyrics": "<string>",
  "prompt": "<string>",
  "duration": 123,
  "language": "ar",
  "instrumental": true,
  "caption": "<string>",
  "webhook": "<string>",
  "track_id": 123
}
'
{
  "status": "success",
  "generationTime": 123,
  "id": 123,
  "output": [
    "<string>"
  ],
  "proxy_links": [
    "<string>"
  ],
  "future_links": [
    "<string>"
  ],
  "links": [
    "<string>"
  ],
  "meta": {},
  "eta": 123,
  "message": "<string>",
  "tip": "<string>",
  "fetch_result": "<string>",
  "audio_time": 123
}
This API uses the ACE-Step v1.5 model for high-quality song generation with vocal synthesis.

Request

Make a POST request to the endpoint below and pass the required parameters in the request body.
curl --request POST 'https://modelslab.com/api/v6/voice/song_generator'

Body

  • Song duration can be between 30 seconds and 480 seconds (0.5 - 8 minutes)
  • If lyrics are not provided, set lyrics_generation to true and provide a prompt and caption instead
  • Set instrumental to true to generate an instrumental version without vocals
  • 50+ languages are supported (see table below)
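The rules above can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the helper name `build_song_payload` and its keyword arguments are hypothetical, and requiring both `prompt` and `caption` when no lyrics are given is an assumption drawn from the second bullet.

```python
from typing import Any, Dict, Optional

MIN_DURATION_S = 30   # lower bound stated in the notes above
MAX_DURATION_S = 480  # upper bound stated in the notes above


def build_song_payload(
    key: str,
    *,
    init_audio: Optional[str] = None,
    lyrics: Optional[str] = None,
    prompt: Optional[str] = None,
    caption: Optional[str] = None,
    duration: Optional[int] = None,
    instrumental: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble a request body (hypothetical helper, not part of the API)."""
    if duration is not None and not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        raise ValueError(
            f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S} seconds, got {duration}"
        )

    payload: Dict[str, Any] = {"key": key}
    if lyrics:
        payload["lyrics_generation"] = False
        payload["lyrics"] = lyrics
    else:
        # No lyrics supplied: ask the API to write them from prompt + caption.
        if not prompt or not caption:
            raise ValueError("without lyrics, both prompt and caption are needed")
        payload["lyrics_generation"] = True
        payload["prompt"] = prompt

    # Optional fields are only sent when the caller set them.
    optional = {
        "init_audio": init_audio,
        "caption": caption,
        "duration": duration,
        "instrumental": instrumental,
    }
    payload.update({name: value for name, value in optional.items() if value is not None})
    return payload
```

For example, `build_song_payload("your_api_key", prompt="a summer road trip", caption="upbeat pop, male vocal", duration=120)` returns a body with `lyrics_generation` set to true, while a `duration` of 10 raises `ValueError` before any network call is made.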
Sample request when lyrics_generation is false
json
{       
    "key":"your_api_key",   
    "lyrics_generation":false,    
    "lyrics":"[Intro: Sampled Vocal Loop]
    (Oh-oh-oh-oh-oh-oh-oh-oh)
    (Oh-oh-oh-oh-oh-oh-oh-oh)

    [Chorus]
    Esta noche todo te lo daré
    Es libre ya no me amarraré
    Grita mi nombre, dime que me quieres
    Me pierdo en tus ojos como si fuera nieve
    Esta noche todo te lo daré
    Entre tus brazos me quedaré
    Grita mi nombre, dime que me quieres
    Me pierdo en tus ojos como si fuera nieve

    [Verse 1]
    Tus ojos me hipnotizan, me hacen suspirar
    Tus labios me llaman, no puedo escapar
    Tus manos me tocan, siento la pasión
    Cada latido es una explosión
    Acércate más, no me dejes caer
    Bailamos juntos hasta el amanecer

    [Pre-Chorus]
    Tus labios me tocan, tus ojos me miran
    No puedo escapar, mi alma se inspira

    [Chorus]
    Esta noche todo te lo daré
    Esta noche todo te lo daré
    Es libre ya no me amarraré
    Grita mi nombre, dime que me quieres
    Me pierdo en tus ojos como si fuera nieve
    Esta noche todo te lo daré
    Entre tus brazos me quedaré
    Grita mi nombre, dime que me quieres
    Me pierdo en tus ojos como si fuera nieve

    [Verse 2]
    Tus besos me alzan, me hacen volar
    No hay dudas, no hay miedo, solo quiero amar
    Tu voz me llama, tu cuerpo me arde
    Con cada susurro todo se comparte
    Juntos en el fuego, no hay que temer
    Esta noche nunca lo vas a olvidar

    [Bridge]
    Solo un instante
    Deja que te acerque, ven a mí
    Solo un instante
    Deja que te acerque, ven a mí
    Solo un instante
    Deja que te acerque, ven a mí
    Solo un instante
    Deja que te acerque, ven a mí
    Solo un instante
    Deja que te acerque, ven a mí
    [Chorus]
    Esta noche todo te lo daré
    Es libre ya no me amarraré
    Grita mi nombre, dime que me quieres
    Me pierdo en tus ojos como si fuera nieve
    Esta noche todo te lo daré
    Entre tus brazos me quedaré
    Grita mi nombre, dime que me quieres
    Me pierdo en tus ojos como si fuera nieve
    [Outro]
    Solo una noche más
    Hasta que me desvuelva
    Y de que me quede sin nada",
    "caption":"A modern reggaeton track with a strong flamenco influence, opening with a pitched and chopped vocal sample melody over a classic dembow beat. A clear, confident female vocal enters, singing in Spanish with a touch of reverb. The arrangement is built on a deep sub-bass, crisp drum machine percussion, and a distinctive plucked synth guitar riff that carries the main melodic hook. The chorus elevates the energy with layered vocals and a more intense delivery. The track breaks down into a more atmospheric bridge with filtered, introspective vocals before returning to the full chorus and concluding with a sparse, whispered outro that fades to silence.",
    "duration": 199,
    "webhook":null,
    "track_id": null
}
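The sample above spreads the lyrics over several lines for readability. A JSON string cannot contain raw line breaks, so in an actual request they must be escaped as `\n`. A short sketch of letting Python's standard `json` module do the escaping (the lyrics here are shortened, and the variable names are illustrative):

```python
import json

# Keep the lyrics as ordinary multi-line text in code...
lyrics = "\n".join([
    "[Chorus]",
    "Esta noche todo te lo daré",
    "Es libre ya no me amarraré",
])

# ...and let json.dumps escape the line breaks when building the request body.
body = json.dumps(
    {
        "key": "your_api_key",
        "lyrics_generation": False,
        "lyrics": lyrics,
        "duration": 199,
        "webhook": None,   # serialized as null
        "track_id": None,  # serialized as null
    },
    ensure_ascii=False,    # keep accented characters readable instead of \uXXXX
)
```

The resulting `body` is a single-line JSON document that can be sent as UTF-8 with the `Content-Type: application/json` header.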
Sample request when lyrics_generation is true
json
{       
    "key":"your_api_key",   
    "lyrics_generation":true,    
    "prompt":"A polished Cantopop track built on an ethereal foundation of layered female vocal harmonies that create atmospheric pads with wordless 'oohs' and 'aahs'. A crisp electronic drum machine provides a steady mid-tempo beat, complemented by clean synth keys and a subtle bassline. The lead female vocal is clear and melodic, sung entirely in Cantonese, guiding the listener through verses and into an uplifting chorus where her voice becomes more powerful and layered. The arrangement includes brief instrumental interludes featuring chime-like sound effects before concluding with a fade-out of the signature airy vocal textures.",    
    "caption":"female vokal,folkmetal, Electro ,Violin,Piano,extinct,ad’ baseline,jane breakbeat; guitar,Harmonium,bass),child’s voice.",
    "duration":199,
    "webhook":null,     
    "track_id": null
}

Supported Languages

Language          Code
Arabic            ar
Azerbaijani       az
Bulgarian         bg
Bengali           bn
Catalan           ca
Czech             cs
Danish            da
German            de
Greek             el
English           en
Spanish           es
Persian           fa
Finnish           fi
French            fr
Hebrew            he
Hindi             hi
Croatian          hr
Haitian Creole    ht
Hungarian         hu
Indonesian        id
Icelandic         is
Italian           it
Japanese          ja
Korean            ko
Latin             la
Lithuanian        lt
Malay             ms
Nepali            ne
Dutch             nl
Norwegian         no
Punjabi           pa
Polish            pl
Portuguese        pt
Romanian          ro
Russian           ru
Sanskrit          sa
Slovak            sk
Serbian           sr
Swedish           sv
Swahili           sw
Tamil             ta
Telugu            te
Thai              th
Tagalog           tl
Turkish           tr
Ukrainian         uk
Urdu              ur
Vietnamese        vi
Cantonese         yue
Chinese           zh
Unknown           unknown
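A client may want to accept either a language name or a code and reject anything outside the table before calling the API. The lookup below is a sketch under that assumption: `resolve_language` is a hypothetical helper, and the `unknown` entry is left out because it does not appear among the options listed for the `language` parameter.

```python
# Codes and names copied from the Supported Languages table.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic", "az": "Azerbaijani", "bg": "Bulgarian", "bn": "Bengali",
    "ca": "Catalan", "cs": "Czech", "da": "Danish", "de": "German",
    "el": "Greek", "en": "English", "es": "Spanish", "fa": "Persian",
    "fi": "Finnish", "fr": "French", "he": "Hebrew", "hi": "Hindi",
    "hr": "Croatian", "ht": "Haitian Creole", "hu": "Hungarian",
    "id": "Indonesian", "is": "Icelandic", "it": "Italian", "ja": "Japanese",
    "ko": "Korean", "la": "Latin", "lt": "Lithuanian", "ms": "Malay",
    "ne": "Nepali", "nl": "Dutch", "no": "Norwegian", "pa": "Punjabi",
    "pl": "Polish", "pt": "Portuguese", "ro": "Romanian", "ru": "Russian",
    "sa": "Sanskrit", "sk": "Slovak", "sr": "Serbian", "sv": "Swedish",
    "sw": "Swahili", "ta": "Tamil", "te": "Telugu", "th": "Thai",
    "tl": "Tagalog", "tr": "Turkish", "uk": "Ukrainian", "ur": "Urdu",
    "vi": "Vietnamese", "yue": "Cantonese", "zh": "Chinese",
}

# Reverse index so that "Spanish" or "spanish" also resolves to "es".
_CODE_BY_NAME = {name.lower(): code for code, name in SUPPORTED_LANGUAGES.items()}


def resolve_language(name_or_code: str) -> str:
    """Return the language code for a code or English name, or raise ValueError."""
    candidate = name_or_code.strip().lower()
    if candidate in SUPPORTED_LANGUAGES:
        return candidate
    if candidate in _CODE_BY_NAME:
        return _CODE_BY_NAME[candidate]
    raise ValueError(f"unsupported language: {name_or_code!r}")
```

The returned code is what would go into the `language` field of the request body.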

Body

application/json
key
string
required

API key for authentication

init_audio
string<uri>
required

URL to reference audio file to influence style

lyrics_generation
boolean

Pass true to generate lyrics automatically

lyrics
string

Lyrics in LRC format (timestamp + lyrics). Required if lyrics_generation is false

prompt
string

Topic for lyrics generation. Required if lyrics_generation is true

duration
integer | null

Duration of the generated song in seconds (30-480 seconds / 0.5-8 minutes)

language
enum<string> | null

Language for the generated song

Available options: ar, az, bg, bn, ca, cs, da, de, el, en, es, fa, fi, fr, he, hi, hr, ht, hu, id, is, it, ja, ko, la, lt, ms, ne, nl, no, pa, pl, pt, ro, ru, sa, sk, sr, sv, sw, ta, te, th, tl, tr, uk, ur, vi, yue, zh

instrumental
boolean | null

Whether to generate an instrumental version without vocals

caption
string | null

Caption for the song, describing its style: genre, female or male voice, loops, and more.

webhook
string<uri>

URL to receive POST notification upon completion

track_id
integer

ID for webhook identification
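The `track_id` sent with a request can be used to tie an incoming webhook notification back to the job that triggered it. The webhook body format is not described here, so the sketch below rests on two loud assumptions: that the notification is a JSON object, and that it echoes `track_id` as an integer. `match_webhook` and the `pending` mapping are hypothetical names.

```python
import json
from typing import Dict, Optional


def match_webhook(raw_body: bytes, pending: Dict[int, str]) -> Optional[str]:
    """Return the local job label for a webhook body, or None if unrecognized.

    ASSUMPTION: the notification is a JSON object that echoes the track_id
    sent with the original request. This is not confirmed by the reference.
    """
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return None
    if not isinstance(event, dict):
        return None
    track_id = event.get("track_id")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(track_id, bool) or not isinstance(track_id, int):
        return None
    return pending.get(track_id)
```

A caller would record `pending[track_id] = "some local label"` when submitting a request and call `match_webhook` on the raw POST body inside its own HTTP handler.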

Response

Song generation response

status
enum<string>

Status of the voice generation

Available options: success, processing, error

generationTime
number

Time taken to generate the audio in seconds

id
integer

Unique identifier for the voice generation

output
string<uri>[]

Array of generated audio URLs

proxy_links
string<uri>[]

Array of proxy audio URLs

future_links
string<uri>[]

Array of future audio URLs for queued requests

links
string<uri>[]

Array of audio URLs (voice cover response)

meta
object

Metadata about the audio generation including all parameters used

eta
integer

Estimated time for completion in seconds (processing status)

message
string

Status message or additional information

tip
string

Additional information or tips for the user

fetch_result
string<uri>

URL to fetch the result when processing

audio_time
number

Duration of the generated audio in seconds
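The three `status` values suggest three code paths on the client: read `output` on success, wait and then call `fetch_result` while processing, and surface `message` on error. One possible way to branch on a decoded response is sketched below. `Outcome` and `interpret_response` are hypothetical names, and treating `eta` as the suggested wait before polling is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Outcome:
    done: bool                                      # True once audio URLs are available
    urls: List[str] = field(default_factory=list)   # contents of "output"
    poll_url: Optional[str] = None                  # "fetch_result" while processing
    wait_seconds: Optional[float] = None            # "eta" while processing


def interpret_response(resp: Dict[str, Any]) -> Outcome:
    """Map a decoded JSON response onto the next client action."""
    status = resp.get("status")
    if status == "success":
        return Outcome(done=True, urls=list(resp.get("output") or []))
    if status == "processing":
        # Not finished yet: the caller should wait, then request poll_url.
        return Outcome(
            done=False,
            poll_url=resp.get("fetch_result"),
            wait_seconds=resp.get("eta"),
        )
    if status == "error":
        raise RuntimeError(resp.get("message") or "song generation failed")
    raise ValueError(f"unexpected status: {status!r}")
```

A polling loop built on this would sleep for `wait_seconds`, request `poll_url`, and pass the new response through `interpret_response` again until `done` is true.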