AudioGen API

Overview

The AudioGen API provides tools for speech-to-text, text-to-speech, text-to-SFX, voice-over, and more.

Available Endpoints

Text To Audio Endpoint

The Text-to-Audio endpoint generates audio from a text input combined with either a valid audio URL or a pre-created voice referenced by its voice_id. The output is an audio file that mimics the voice in the audio at the provided URL or the selected voice.
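
The input rule for this endpoint (a text input plus either an audio URL or a voice_id) can be sketched as a small payload builder. This is a minimal sketch and not the API's actual request schema: the function name `build_text_to_audio_payload` and the field names `text` and `audio_url` are assumptions made for illustration, and only `voice_id` appears in the description. It also assumes that exactly one of the two voice sources is supplied per request, which the description suggests but does not state.

```python
from typing import Optional
from urllib.parse import urlparse


def build_text_to_audio_payload(
    text: str,
    audio_url: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> dict:
    """Build a request body for a hypothetical Text-to-Audio call.

    Field names other than ``voice_id`` are assumptions, not the documented schema.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    # Assumption: exactly one voice source is expected per request.
    if (audio_url is None) == (voice_id is None):
        raise ValueError("provide exactly one of audio_url or voice_id")

    payload = {"text": text}
    if audio_url is not None:
        parsed = urlparse(audio_url)
        # Basic shape check only; this does not verify that the URL is reachable.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("audio_url must be an http(s) URL")
        payload["audio_url"] = audio_url
    else:
        payload["voice_id"] = voice_id
    return payload
```

The resulting dictionary could then be serialized as JSON and sent with whatever HTTP client and authentication scheme the service actually requires.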

Voice to Voice Endpoint

The Voice-to-Voice endpoint lets you clone a voice from a target audio file.

Song Generator Endpoint

The Song Generator endpoint lets you generate songs.

Music Gen Endpoint

The Music Gen endpoint generates music from text prompts and optional conditioning melodies. It is suited to music composition, sound design, and creative audio projects.
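
Because the conditioning melody is optional, a request model can simply omit it when it is not set. The sketch below illustrates that idea under stated assumptions: the class name `MusicGenRequest` and the field names `prompt` and `melody_url` are hypothetical and are not taken from the API reference.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MusicGenRequest:
    """Hypothetical request model; field names are illustrative assumptions."""

    prompt: str
    melody_url: Optional[str] = None  # optional conditioning melody

    def to_payload(self) -> dict:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        # Drop unset optional fields so the body only carries what was given.
        return {key: value for key, value in asdict(self).items() if value is not None}
```

A prompt-only request then serializes to a single-field body, while a request with a melody carries both fields.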

Voice Cover Endpoint

The Voice Cover endpoint allows you to transform a song or audio file into a different voice using a provided model. Find all available voice models HERE.
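
Since a voice cover depends on a model taken from the list of available voice models, it can help to check the model name on the client before sending a request. The following is a minimal sketch of that check. The function name `build_voice_cover_payload`, the field names `audio_url` and `voice_model`, and the idea of passing the model list in as an argument are all assumptions for illustration; the real parameter names and model identifiers come from the API reference.

```python
from typing import Iterable


def build_voice_cover_payload(
    audio_url: str,
    voice_model: str,
    available_models: Iterable[str],
) -> dict:
    """Build a request body for a hypothetical Voice Cover call."""
    if not audio_url:
        raise ValueError("audio_url must not be empty")
    # Reject model names that are not in the caller-supplied list.
    if voice_model not in set(available_models):
        raise ValueError(f"unknown voice model: {voice_model!r}")
    return {"audio_url": audio_url, "voice_model": voice_model}
```

Validating locally avoids spending a request on a model name that is certain to be rejected, at the cost of keeping the local list in sync with the published one.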

SFX Endpoint

The SFX endpoint generates sound effects (SFX) conditioned on a text prompt that you provide.

Speech to Text Endpoint

The Speech-to-Text endpoint transcribes audio into written text, so spoken language can be used in text-based applications.
