Transcription API Documentation

Endpoint: /transcribe

Method: POST
Authentication: API Key (required)

Features

  • Transcribes audio files through the /transcribe endpoint
  • Supports three input methods: base64 data, a URL, or a call log ID
  • Identifies speakers automatically (AI vs. User)
  • Handles credit-based and unit-based billing
  • Provides an admin mode for enterprise use

Request Format

Base Request

{
  "api_key": "your_api_key",
  "audio_data": "base64_encoded_audio (optional)",
  "audio_url": "https://url-to-audio-file (optional)",
  "call_id": "12345 (optional)",
  "mime_type": "audio/wav (optional, default: audio/wav)"
}
Note: Exactly one of audio_data, audio_url, or call_id must be provided.
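
The exactly-one rule can be checked on the client before a request is sent. The following is a minimal sketch: the helper name build_payload is hypothetical and not part of the API, and only the field names come from the request format above.

```python
from typing import Optional


def build_payload(api_key: str,
                  audio_data: Optional[str] = None,
                  audio_url: Optional[str] = None,
                  call_id: Optional[str] = None,
                  mime_type: Optional[str] = None) -> dict:
    """Build a /transcribe request body with exactly one audio source."""
    sources = {"audio_data": audio_data, "audio_url": audio_url, "call_id": call_id}
    provided = {name: value for name, value in sources.items() if value is not None}
    if len(provided) != 1:
        raise ValueError(
            "Provide exactly one of audio_data, audio_url, or call_id "
            f"(got {len(provided)})"
        )
    payload = {"api_key": api_key, **provided}
    # mime_type is optional; the documented default is audio/wav.
    if mime_type is not None:
        payload["mime_type"] = mime_type
    return payload
```

Calling build_payload("your_api_key", call_id="123") returns a body containing only api_key and call_id, while passing two sources raises ValueError locally instead of triggering a 400 response.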

Input Methods

Method 1: Base64 Audio Data

Upload audio directly as base64 encoded data.
{
  "api_key": "your_api_key_here",
  "audio_data": "UklGRiQAAABXQVZFZm10IBAAAAABAAEA...",
  "mime_type": "audio/wav"
}
Supported MIME types:
  • audio/wav
  • audio/mp3
  • audio/mp4 (m4a)
  • audio/ogg
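
A short sketch of preparing the audio_data and mime_type fields from raw audio bytes. It assumes the API expects standard base64 with no data-URI prefix, which matches the sample value above but is not stated explicitly. The helper name encode_audio and the extension mapping are illustrative only.

```python
import base64
from pathlib import Path

# Extension-to-MIME mapping limited to the types listed above (an assumption
# about which file extensions correspond to each supported type).
MIME_BY_EXTENSION = {
    ".wav": "audio/wav",
    ".mp3": "audio/mp3",
    ".m4a": "audio/mp4",
    ".ogg": "audio/ogg",
}


def encode_audio(data: bytes, filename: str) -> dict:
    """Return the audio_data and mime_type fields for the given audio bytes."""
    suffix = Path(filename).suffix.lower()
    mime_type = MIME_BY_EXTENSION.get(suffix)
    if mime_type is None:
        raise ValueError(f"Unsupported audio extension: {suffix!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {"audio_data": encoded, "mime_type": mime_type}
```

For a local file, the bytes can be read with Path("recording.wav").read_bytes() and the returned fields merged into the request body alongside api_key.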

Method 2: Audio URL

Provide a direct URL to the audio file.
{
  "api_key": "your_api_key_here",
  "audio_url": "https://example.com/audio/recording.wav"
}
The system automatically downloads and processes the audio file.

Method 3: Call Log ID

Use an existing call log ID to transcribe a previous call recording.
{
  "api_key": "your_api_key_here",
  "call_id": "123"
}
Requirements:
  • The log must belong to the authenticated user
  • The log must have a valid recording URL

Response Format

Success Response (200 OK)

{
  "transcription": "AI: Hello, this is an AI assistant...\nUser: Hi, yes I can hear you.\n...",
  "user_credits_deducted": 0.0,
  "timestamp": "2025-10-28T12:34:56.789012"
}
Fields:
Field                   Description
transcription           Full conversation text with speaker tags
user_credits_deducted   Credits used for this transcription
timestamp               Time of transcription completion (ISO format)
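
As an illustration of consuming these fields, the sketch below converts a success body into a typed object. The class and function names are hypothetical. The sample timestamp carries no UTC offset, so the parsed value is a naive datetime; which time zone the server uses is not documented here.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TranscriptionResult:
    transcription: str
    credits_deducted: float
    completed_at: datetime


def parse_success(body: dict) -> TranscriptionResult:
    """Convert a 200 OK response body into a typed result."""
    return TranscriptionResult(
        transcription=body["transcription"],
        credits_deducted=float(body["user_credits_deducted"]),
        # No offset in the sample timestamp, so this is a naive datetime.
        completed_at=datetime.fromisoformat(body["timestamp"]),
    )
```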

Error Responses

400 Bad Request

{"error": "One of audio_data, audio_url, or call_id is required"}
{"error": "Only one of audio_data, audio_url, or call_id should be provided"}
{"error": "Audio must be at least 1 second long"}
{"error": "Audio must not exceed 30 minutes"}

401 Unauthorized

{"error": "Invalid API key"}

402 Payment Required

{"error": "Insufficient credits, recharge at https://playground.aivoco.com"}
{
  "error": "Insufficient admin units for transcription, get now from Vispark Cloud at http://cloud.vispark.in",
  "required_units": 0.1234
}
{
  "error": "Insufficient user credits for transcription, recharge at https://playground.aivoco.com",
  "required_credits": 0.5,
  "current_credits": 0.2
}

404 Not Found

{"error": "Call id not found or does not belong to user"}
{"error": "No recording found for this log"}

429 Too Many Requests

{"error": "Rate limit exceeded. Please try again later."}

500 Internal Server Error

{"error": "Failed to process transcription with Gemini API"}
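
One way a client might act on these responses is sketched below. The function name describe_error is hypothetical, and treating 429 and 500 as retryable is an assumption made for illustration, not documented API behavior.

```python
# Assumption: rate-limit and server errors are worth retrying; others are not.
RETRYABLE_STATUSES = {429, 500}


def describe_error(status_code: int, body: dict) -> tuple:
    """Return (message, should_retry) for a non-200 response."""
    message = body.get("error", f"Unexpected response (HTTP {status_code})")
    if status_code == 402:
        # Some 402 bodies include the shortfall details shown above.
        if "required_credits" in body and "current_credits" in body:
            shortfall = body["required_credits"] - body["current_credits"]
            message += f" (short by {shortfall:.4f} credits)"
        elif "required_units" in body:
            message += f" (requires {body['required_units']} units)"
    return message, status_code in RETRYABLE_STATUSES
```

A caller can log the message and, when should_retry is true, wait before sending the request again.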

Constraints

Constraint         Limit
Minimum duration   1 second
Maximum duration   30 minutes
Rate limit         60 requests/min per IP
Authentication     Required via API key
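
The duration limits can be checked locally for WAV input before uploading. This sketch uses Python's standard wave module and only handles uncompressed PCM WAV data. The function names are hypothetical, and the server's own duration calculation may differ.

```python
import io
import wave

MIN_SECONDS = 1
MAX_SECONDS = 30 * 60


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Compute the duration of PCM WAV data from its header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def check_duration(wav_bytes: bytes) -> None:
    """Raise ValueError if the audio falls outside the documented limits."""
    duration = wav_duration_seconds(wav_bytes)
    if duration < MIN_SECONDS:
        raise ValueError(f"Audio is {duration:.2f}s; minimum is 1 second")
    if duration > MAX_SECONDS:
        raise ValueError(f"Audio is {duration:.2f}s; maximum is 30 minutes")
```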

Example Usage

cURL Example

curl -X POST https://call.aivoco.on.cloud.vispark.in/transcribe \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "your_api_key_here",
    "audio_url": "https://example.com/recording.wav"
  }'

Python Example

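import requests

url = "https://call.aivoco.on.cloud.vispark.in/transcribe"
payload = {"api_key": "your_api_key_here", "call_id": "123"}

# Send the body as JSON. The timeout is a client-side choice; adjust it
# to the length of the audio being transcribed.
response = requests.post(url, json=payload, timeout=120)
data = response.json()

if response.status_code == 200:
    print(f"Transcription:\n{data['transcription']}")
    print(f"Credits used: {data['user_credits_deducted']}")
else:
    # Error bodies carry an "error" message; fall back to the status code.
    print(f"Error: {data.get('error', response.status_code)}")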

JavaScript Example

const url = 'https://call.aivoco.on.cloud.vispark.in/transcribe';
const payload = {
  api_key: 'your_api_key_here',
  audio_data: 'UklGRiQAAABXQVZFZm10...',
  mime_type: 'audio/wav'
};

fetch(url, {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(payload)
})
.then(res => res.json())
.then(data => {
  if (data.transcription) console.log('Transcription:', data.transcription);
  else console.error('Error:', data.error);
})
.catch(err => console.error('Request failed:', err));

Transcription Format

Example output with speaker labeling:
AI: Hello, this is an AI assistant calling about your recent inquiry.
User: Hi, yes I can hear you.
AI: Great! I'm calling to discuss your application. Do you have a few minutes?
User: Yes, sure. Go ahead.
AI: Wonderful. I wanted to confirm your contact information first.
User: [unclear] what was that?
AI: I said I need to confirm your contact information. Can you verify your email address?
User: Oh yes, it's john@example.com
Highlights:
  • Clear AI: and User: labels
  • Chronological conversation flow
  • Marks unclear sections with [unclear]
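
Because each turn starts with an AI: or User: label, the transcription string can be split into structured turns. The sketch below is one possible approach. The function name is hypothetical, and folding unlabeled lines into the previous turn is an assumption about how multi-line turns might appear.

```python
def parse_transcript(transcription: str) -> list:
    """Split a transcription string into (speaker, text) tuples."""
    turns = []
    for line in transcription.splitlines():
        speaker, separator, text = line.partition(":")
        speaker = speaker.strip()
        if separator and speaker in ("AI", "User"):
            turns.append((speaker, text.strip()))
        elif turns and line.strip():
            # Assumption: an unlabeled line continues the previous turn.
            last_speaker, last_text = turns[-1]
            turns[-1] = (last_speaker, f"{last_text} {line.strip()}")
    return turns
```

Sections marked [unclear] are kept as part of the turn text, so callers can search for that marker to flag turns that may need review.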

Notes

  • Audio files are processed in memory and are not stored
  • Duration estimation is accurate for WAV files and approximate for other formats
  • Uses the Vispark Vision (small) model for transcription
  • Supports multilingual input


© 2025 AIvoco | Transcription | Speech-to-Text Intelligence