Create Transcription
POST /v1/audio/transcriptions
Transcribes audio into text. Supports multiple models and response formats.
Request
Content-Type: multipart/form-data
| Parameter | Type | Required | Description |
|---|---|---|---|
| file | file | Yes | Audio file (max 25MB). Formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, webm |
| model | string | Yes | Model ID: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe, gpt-4o-transcribe-diarize, groq/whisper-large-v3, groq/whisper-large-v3-turbo |
| language | string | No | ISO-639-1 language code (e.g., en, es) |
| response_format | string | No | json (default), text, srt, verbose_json, vtt |
| temperature | number | No | 0 to 1. Default 0. |
| prompt | string | No | Guide transcription style |
| timestamp_granularities[] | string[] | No | Array of timestamp granularities: segment, word. Only available with verbose_json response format. |
| stream | string | No | true for streaming SSE response. Only gpt-4o-transcribe and gpt-4o-mini-transcribe. |
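
The constraints in the parameter table can be checked on the client before a request is sent. The following Python sketch is one possible way to do that, not an official client: `build_form_fields` and `transcribe` are hypothetical helper names, sending `timestamp_granularities[]` as one repeated form field per value is an assumption about how the array is encoded, and the third-party `requests` package is assumed to be installed for the actual upload.

```python
# Values copied from the parameter table; helper names are hypothetical.
STREAMING_MODELS = {"gpt-4o-transcribe", "gpt-4o-mini-transcribe"}
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}
GRANULARITIES = {"segment", "word"}


def build_form_fields(model, language=None, response_format="json",
                      temperature=0.0, prompt=None,
                      timestamp_granularities=(), stream=False):
    """Return the non-file multipart fields as a list of (name, value) pairs."""
    if response_format not in RESPONSE_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if timestamp_granularities and response_format != "verbose_json":
        raise ValueError("timestamp_granularities requires verbose_json")
    unknown = set(timestamp_granularities) - GRANULARITIES
    if unknown:
        raise ValueError(f"unknown granularities: {sorted(unknown)}")
    if stream and model not in STREAMING_MODELS:
        raise ValueError(f"streaming is not available for {model}")

    fields = [("model", model),
              ("response_format", response_format),
              ("temperature", str(temperature))]
    if language:
        fields.append(("language", language))
    if prompt:
        fields.append(("prompt", prompt))
    # Assumption: array values are sent as one repeated form field each.
    for granularity in timestamp_granularities:
        fields.append(("timestamp_granularities[]", granularity))
    if stream:
        fields.append(("stream", "true"))
    return fields


def transcribe(path, api_key, **options):
    """Upload an audio file; requires the third-party `requests` package."""
    import requests  # imported lazily so the validation helper has no dependencies

    with open(path, "rb") as audio:
        response = requests.post(
            "https://api.osmapi.com/v1/audio/transcriptions",
            headers={"Authorization": f"Bearer {api_key}"},
            files={"file": audio},            # sent as multipart/form-data
            data=build_form_fields(**options),
            timeout=120,
        )
    response.raise_for_status()
    return response
```

A call such as `transcribe("audio.mp3", api_key, model="whisper-1", response_format="verbose_json", timestamp_granularities=["word"])` would then request word-level timestamps, while an invalid combination (for example `stream=True` with `whisper-1`) raises `ValueError` locally instead of costing a round trip.
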
Example
curl -X POST "https://api.osmapi.com/v1/audio/transcriptions" \
  -H "Authorization: Bearer $OSM_API_KEY" \
  -F file=@audio.mp3 \
  -F model=whisper-1
Response
{
  "text": "Hello, this is a transcription test."
}
Response Headers
| Header | Description |
|---|---|
| x-request-id | Unique request identifier |
| x-osm-response-cost | Request cost in USD |
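
The two headers above are handy for logging and cost tracking. As a rough sketch, the hypothetical helper below pulls them out of any header mapping (for example `response.headers` from an HTTP client). It assumes the cost header carries a plain decimal string; the sample values in the usage lines are made up for illustration.

```python
from decimal import Decimal


def read_osm_headers(headers):
    """Extract the request id and cost from a mapping of response headers."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    cost = lowered.get("x-osm-response-cost")
    return {
        "request_id": lowered.get("x-request-id"),
        # Assumption: the cost is a plain decimal string such as "0.0006".
        "cost_usd": Decimal(cost) if cost is not None else None,
    }


# Illustrative values only, not real API output.
sample = {"X-Request-Id": "req_example", "x-osm-response-cost": "0.0006"}
info = read_osm_headers(sample)
print(info["request_id"], info["cost_usd"])
```

Using `Decimal` rather than `float` avoids rounding drift when many small per-request costs are summed.
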