
AudioGen API

Overview

The AudioGen API provides tools for speech-to-text, text-to-speech, text-to-SFX, voice-over, and more.

Available Endpoints

Text To Audio Endpoint

The Text-to-Audio endpoint generates audio from a text input combined with either a valid audio URL or a pre-created voice referenced by its voice_id. The output is an audio file that mimics the sound of the audio at the provided URL or of the selected voice.
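As a rough illustration of the inputs described above, the following Python sketch assembles a request body for a Text-to-Audio call. Only voice_id is named in this overview; the field names "text" and "audio_url", the function name, and the rule that exactly one voice source is supplied are assumptions made for the example, not the documented request schema.

```python
from typing import Optional
from urllib.parse import urlparse


def build_text_to_audio_payload(
    text: str,
    audio_url: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> dict:
    """Build a request body for a Text-to-Audio call (hypothetical schema)."""
    if not text.strip():
        raise ValueError("text must not be empty")

    # Assumption: the caller supplies exactly one voice source, either a
    # reference audio URL or the voice_id of a pre-created voice.
    if (audio_url is None) == (voice_id is None):
        raise ValueError("provide exactly one of audio_url or voice_id")

    payload = {"text": text}  # "text" is a hypothetical field name
    if audio_url is not None:
        # Basic sanity check that the URL is absolute and uses http(s).
        parsed = urlparse(audio_url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError("audio_url must be an absolute http(s) URL")
        payload["audio_url"] = audio_url  # hypothetical field name
    else:
        payload["voice_id"] = voice_id
    return payload
```

Validating the inputs before sending the request keeps malformed calls from reaching the API; the actual field names and constraints should be taken from the endpoint's own reference page.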

Voice to Voice Endpoint

The Voice-to-Voice endpoint lets you clone a voice from a target audio file.

Music Gen Endpoint

The Music Gen endpoint generates music from a text prompt and, optionally, a conditioning melody. It is suited to applications in music composition, sound design, and creative audio projects.
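The combination of a required text prompt and an optional conditioning melody could be assembled along these lines. This is a minimal sketch: the function name, the field names "prompt" and "melody_url", and the idea that the melody is passed as a URL are all assumptions for illustration, since this overview does not specify the request format.

```python
from typing import Optional


def build_music_gen_payload(prompt: str, melody_url: Optional[str] = None) -> dict:
    """Build a request body for a music generation call (hypothetical schema)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")

    payload = {"prompt": prompt}  # "prompt" is a hypothetical field name
    # The conditioning melody is optional, so the key is only included
    # when the caller actually provides one.
    if melody_url is not None:
        payload["melody_url"] = melody_url  # hypothetical field name
    return payload
```

Leaving the melody key out entirely, rather than sending an empty value, avoids ambiguity about whether conditioning was requested.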

Voice Cover Endpoint

The Voice Cover endpoint allows you to transform a song or audio file into a different voice using a provided model. Find all available voice models HERE.

SFX Endpoint

The SFX endpoint generates sound effects (SFX) conditioned on a text prompt that you provide.

Speech to Text Endpoint

The Speech-to-Text endpoint transcribes audio into written text, so that spoken language can be used in text-based applications.