
Voice Cover Endpoint

Overview

The Voice Cover endpoint allows you to transform a song or audio file into a different voice using a provided model. Find all available voice models HERE.


Sample Music Generation Output


Input Audio

A YouTube video or music link provided for processing:


Generated Output

  • Voice Model ID: arianagrande
  • Processed Music Output:

Request

curl --request POST 'https://modelslab.com/api/v6/voice/voice_cover'

Watch the Voice Cloning API demo video to see it in action in Postman.

Make a POST request to the https://modelslab.com/api/v6/voice/voice_cover endpoint and pass the required parameters in the request body.

Body Attributes

| Parameter | Description | Values |
| --- | --- | --- |
| key | The API key used to authorize the request. | String |
| init_audio | A URL (YouTube links supported) or a valid .wav file in base64 format for the audio to be cloned. | MP3/WAV URL or base64 data |
| model_id | The ID of the voice cloning model. Get the model ID from the provided source. | String |
| pitch | Controls the pitch transformation between voices. | "m2f", "f2m" or "none" |
| algorithm | The algorithm used for voice cloning. Defaults to rmvpe. | "rmvpe" or "mangio-crepe" |
| rate | Controls the generated voice's resemblance to the training data. | Float (0 to 1) |
| seed | The seed value used to reproduce results. Use null for a random value. | Integer |
| language | The language for the voice. Supported languages: arabic, brazilian portuguese, chinese, dutch, french, hindi, hungarian, italian, japanese, korean, polish, russian, turkish. Defaults to english. | String |
| emotion | Emotion of the voice. Defaults to neutral. | One of ["neutral", "happy", "sad", "angry", "dull"] |
| speed | The playback speed of the speaker. Defaults to 1.0. | Float (0.5 to 2) |
| radius | Median filtering length to reduce voice artifacts. Defaults to 3. | Float (0 to 3) |
| mix | Controls the loudness similarity to the original audio. Defaults to 0.25. | Float (0 to 1) |
| hop_length | The frequency of pitch analysis. Used with the mangio-crepe algorithm. | Integer |
| originality | Controls similarity to the original vocals' voiceless consonants. Defaults to 0.33. | Float (0 to 1) |
| lead_voice_volume_delta | Adjusts the volume of lead vocals. | Integer (-5 to +5) |
| backup_voice_volume_delta | Adjusts the volume of backup vocals. | Integer (-5 to +5) |
| instrument_volume_delta | Adjusts the volume of instrumental tracks. | Integer (-5 to +5) |
| reverb_size | Specifies the size of the reverb room. Defaults to 0.15. | Float (0 to 1) |
| wetness | The reverb applied to generated vocals. Defaults to 0.2. | Float (0 to 1) |
| dryness | The reverb applied to original vocals. Defaults to 0.8. | Float (0 to 1) |
| damping | The damping factor for high frequencies in the reverb. Defaults to 0.7. | Float (0 to 1) |
| base64 | Indicates whether the input sound clip is in base64 format. Defaults to false. | true or false |
| temp | Specifies whether to use temporary links, which are valid for 24 hours. This can help if access to certain storage sites is blocked. Defaults to false. | true or false |
| webhook | URL that receives a POST API call once the audio generation is complete. | URL |
| track_id | An ID returned in the API response, used to identify the webhook request. | Integer |
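
The value ranges in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the helper name validateVoiceCoverBody is hypothetical, and treating key, init_audio, and model_id as the required parameters is an assumption, since this page does not mark which parameters are mandatory.

```javascript
// Allowed values and numeric ranges, copied from the parameter table above.
const ENUMS = {
  pitch: ["m2f", "f2m", "none"],
  algorithm: ["rmvpe", "mangio-crepe"],
  emotion: ["neutral", "happy", "sad", "angry", "dull"],
};

const RANGES = {
  rate: [0, 1],
  speed: [0.5, 2],
  radius: [0, 3],
  mix: [0, 1],
  originality: [0, 1],
  lead_voice_volume_delta: [-5, 5],
  backup_voice_volume_delta: [-5, 5],
  instrument_volume_delta: [-5, 5],
  reverb_size: [0, 1],
  wetness: [0, 1],
  dryness: [0, 1],
  damping: [0, 1],
};

// Hypothetical helper: returns a list of problems (empty when none are found).
function validateVoiceCoverBody(body) {
  const problems = [];

  // Assumption: these three parameters are the required ones.
  for (const name of ["key", "init_audio", "model_id"]) {
    if (typeof body[name] !== "string" || body[name] === "") {
      problems.push(`${name} must be a non-empty string`);
    }
  }

  for (const [name, allowed] of Object.entries(ENUMS)) {
    const value = body[name];
    if (value != null && !allowed.includes(value)) {
      problems.push(`${name} must be one of: ${allowed.join(", ")}`);
    }
  }

  for (const [name, [min, max]] of Object.entries(RANGES)) {
    const value = body[name];
    if (value == null) continue; // optional parameter left unset
    const n = Number(value); // also accepts strings such as "+1" or "-2"
    if (Number.isNaN(n) || n < min || n > max) {
      problems.push(`${name} must be a number between ${min} and ${max}`);
    }
  }

  return problems;
}
```

Calling validateVoiceCoverBody(body) before the request and logging the returned list catches out-of-range values early; the server may apply additional checks that this sketch does not cover.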

Example

Body

{
  "key": "",
  "init_audio": "https://music.youtube.com/watch?v=aZ1hziFhj1o",
  "model_id": "zoro",
  "pitch": "none",
  "rate": 0.5,
  "radius": 3,
  "mix": 0.25,
  "algorithm": "rmvpe",
  "hop_length": 128,
  "originality": 0.5,
  "lead_voice_volume_delta": "+1",
  "backup_voice_volume_delta": "-2",
  "instrument_volume_delta": "+2",
  "reverb_size": 0.15,
  "wetness": 0.2,
  "dryness": 0.8,
  "damping": 0.7,
  "base64": false,
  "temp": false,
  "webhook": null,
  "track_id": null
}
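
The body above passes a URL in init_audio. For a local .wav file, the parameter table describes a base64 variant with base64 set to true. The sketch below shows one way to build such a body in Node.js. The helper name buildBase64Body is hypothetical, and sending the raw base64 string without a data: prefix is an assumption that this page does not confirm.

```javascript
// Hypothetical helper: builds a request body from the bytes of a .wav file.
// wavBytes is a Buffer or Uint8Array holding the raw file contents.
function buildBase64Body(apiKey, modelId, wavBytes) {
  return {
    key: apiKey,
    // Assumption: the API accepts the plain base64 string, not a data URI.
    init_audio: Buffer.from(wavBytes).toString("base64"),
    model_id: modelId,
    base64: true, // marks init_audio as base64 data rather than a URL
  };
}
```

In Node.js the bytes could come from require("node:fs").readFileSync("input.wav"), where input.wav is a placeholder path. Any other parameter from the table can be added to the returned object before it is serialized.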

Request

// The request body is sent as JSON.
const myHeaders = new Headers();
myHeaders.append("Content-Type", "application/json");

const raw = JSON.stringify({
  "key": "",
  "init_audio": "https://music.youtube.com/watch?v=aZ1hziFhj1o",
  "model_id": "zoro",
  "pitch": "none",
  "rate": 0.5,
  "radius": 3,
  "mix": 0.25,
  "algorithm": "rmvpe",
  "hop_length": 128,
  "originality": 0.5,
  "lead_voice_volume_delta": "+1",
  "backup_voice_volume_delta": "-2",
  "instrument_volume_delta": "+2",
  "reverb_size": 0.15,
  "wetness": 0.2,
  "dryness": 0.8,
  "damping": 0.7,
  "base64": false,
  "temp": false,
  "webhook": null,
  "track_id": null
});

const requestOptions = {
  method: 'POST',
  headers: myHeaders,
  body: raw,
  redirect: 'follow'
};

// Send the request and print the raw response text.
fetch("https://modelslab.com/api/v6/voice/voice_cover", requestOptions)
  .then(response => response.text())
  .then(result => console.log(result))
  .catch(error => console.log('error', error));
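
The request above prints the raw response text. A variant that parses the JSON and surfaces HTTP errors may be easier to build on. The sketch below assumes an environment with a global fetch (a browser or Node.js 18+). The names buildRequestOptions and requestVoiceCover are hypothetical, and the fetchImpl parameter exists only so the function can be exercised without network access.

```javascript
const VOICE_COVER_URL = "https://modelslab.com/api/v6/voice/voice_cover";

// Builds the same fetch options as the request above from a plain object.
function buildRequestOptions(body) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
    redirect: "follow",
  };
}

// Hypothetical wrapper: sends the request and resolves with the parsed JSON.
// fetchImpl defaults to the global fetch and can be replaced by a stub.
async function requestVoiceCover(body, fetchImpl = fetch) {
  const response = await fetchImpl(VOICE_COVER_URL, buildRequestOptions(body));
  if (!response.ok) {
    // Non-2xx HTTP status; the error body format is not documented here.
    throw new Error(`Voice cover request failed with HTTP ${response.status}`);
  }
  return response.json();
}
```

A call such as requestVoiceCover(body).then(result => console.log(result.status)) would then log the status field of the parsed response.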

Response

{
  "generationTime": 1.5732920169830322,
  "id": 10,
  "links": [
    "https://cdn2.stablediffusionapi.com/generations/bc1e5025-b140-4af6-be24-183fa18c943a.wav"
  ],
  "proxy_links": [
    "https://cdn2.stablediffusionapi.com/generations/bc1e5025-b140-4af6-be24-183fa18c943a.wav"
  ],
  "meta": {
    "algorithm": "rmvpe",
    "backup_voice_volume_delta": -2,
    "base64": "no",
    "damping": 0.7,
    "dryness": 0.8,
    "filename": "bc1e5025-b140-4af6-be24-183fa18c943a.wav",
    "hop_length": 128,
    "input_sound_clip": "https://music.youtube.com/watch?v=aZ1hziFhj1o",
    "instrument_volume_delta": 2,
    "is_youtube": true,
    "lead_voice_volume_delta": 1,
    "mix": 0.25,
    "model_id": "zoro",
    "originality": 0.5,
    "pitch": "none",
    "radius": 3,
    "rate": 0.5,
    "reverb_size": 0.15,
    "seed": 1216247535,
    "temp": "no",
    "wetness": 0.2
  },
  "status": "success"
}
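
The response above carries the output URLs in links and the seed that was actually used in meta.seed. Since the seed parameter exists to reproduce results, one plausible follow-up is to read both values and pin the seed for a repeat run. The sketch below assumes a response shaped like the example. The helper names are hypothetical, and because only the "success" status appears on this page, any other status is treated as an error.

```javascript
// Hypothetical helper: pulls the output URLs and the seed out of a response.
function summarizeResponse(response) {
  if (response.status !== "success") {
    // Other status values are not documented on this page.
    throw new Error(`Unexpected status: ${response.status}`);
  }
  return {
    urls: response.links ?? [],
    seed: response.meta?.seed ?? null,
  };
}

// Returns a copy of the request body with the seed pinned for a repeat run.
function withSeed(body, seed) {
  return { ...body, seed };
}
```

For example, withSeed(originalBody, summarizeResponse(result).seed) yields a body that requests the same seed again while leaving originalBody unchanged.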