Speech-to-Text converts audio into a written transcription in multiple languages.
Send a POST request to the endpoint below and pass the required parameters in the request body.
API key required to authorize the request
URL of the audio file to transcribe. Supported formats: WAV, MP3, FLAC, OPUS. Supported duration: 5 seconds to 1 hour
Language code in ISO 639-1 format (e.g. 'en', 'es', 'fr')
Level of detail for timestamps in transcription
Allowed values: word, sentence
URL to receive a POST notification upon completion
ID for webhook identification
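As an illustration of how the request parameters above fit together, here is a minimal Python sketch that assembles and validates a request body. The field names (`key`, `audio_url`, `language`, `timestamp_level`, `webhook`, `webhook_id`) and the helper `build_transcription_body` are hypothetical, since the parameter names are not given in this section; check them against the actual API reference. Treating the audio URL as mandatory is also an assumption. The sketch only builds the body and does not send it, because the endpoint URL is not stated here.

```python
import json
import re

# Allowed timestamp detail levels, taken from the parameter description above.
ALLOWED_TIMESTAMP_LEVELS = {"word", "sentence"}


def build_transcription_body(api_key, audio_url, language=None,
                             timestamp_level=None, webhook_url=None,
                             webhook_id=None):
    """Build a JSON-serializable request body.

    All field names used here are hypothetical placeholders.
    """
    if not api_key:
        raise ValueError("an API key is required to authorize the request")
    if not audio_url.startswith(("http://", "https://")):
        raise ValueError("audio_url must be an http(s) URL")
    # ISO 639-1 codes are two lowercase letters, e.g. 'en', 'es', 'fr'.
    if language is not None and not re.fullmatch(r"[a-z]{2}", language):
        raise ValueError("language must be a two-letter ISO 639-1 code")
    if (timestamp_level is not None
            and timestamp_level not in ALLOWED_TIMESTAMP_LEVELS):
        raise ValueError("timestamp_level must be 'word' or 'sentence'")

    body = {"key": api_key, "audio_url": audio_url}
    optional = {
        "language": language,
        "timestamp_level": timestamp_level,
        "webhook": webhook_url,
        "webhook_id": webhook_id,
    }
    # Only include optional fields the caller actually set.
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


if __name__ == "__main__":
    example = build_transcription_body(
        "YOUR_API_KEY",
        "https://example.com/audio.mp3",
        language="en",
        timestamp_level="word",
    )
    print(json.dumps(example, indent=2))
```

The resulting dictionary can be passed as the JSON body of the POST request with any HTTP client once the real endpoint and field names are known.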
Speech-to-Text response
Status of the voice generation
Allowed values: success, processing, error
Time taken to generate the audio in seconds
Unique identifier for the voice generation
Array of generated audio URLs
Array of proxy audio URLs
Array of future audio URLs for queued requests
Array of audio URLs (voice cover response)
Metadata about the audio generation including all parameters used
Estimated time for completion in seconds (processing status)
Status message or additional information
Additional information or tips for the user
URL to fetch the result when processing
Duration of the generated audio in seconds
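The three status values suggest a simple branching pattern on the client side. The following Python sketch shows one possible way to interpret a parsed response. The key names (`status`, `output`, `eta`, `fetch_result`, `message`) and the helper `summarize_response` are hypothetical stand-ins for the fields described above, since this section does not list the actual names.

```python
def summarize_response(response):
    """Reduce a parsed response dict to what a caller needs next.

    Key names are hypothetical; adjust them to the real response schema.
    """
    status = response.get("status")
    if status == "success":
        # Finished: the result URLs are available immediately.
        return {"state": "done", "urls": list(response.get("output", []))}
    if status == "processing":
        # Queued: wait roughly the estimated time, then fetch the result.
        return {
            "state": "pending",
            "retry_after": response.get("eta"),
            "fetch_url": response.get("fetch_result"),
        }
    if status == "error":
        return {"state": "failed", "reason": response.get("message", "")}
    raise ValueError(f"unexpected status: {status!r}")


if __name__ == "__main__":
    queued = {
        "status": "processing",
        "eta": 12,
        "fetch_result": "https://example.com/fetch/123",
    }
    print(summarize_response(queued))
```

For a `processing` response, a client would typically wait about the estimated number of seconds and then request the fetch URL, or rely on the webhook notification instead of polling.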