Welcome to AIVOCO’s Speech-to-Speech API

Meet ROSE, AIVOCO’s real-time speech-to-speech model that brings voice AI agents to life.
Whether you’re building telecalling agents, AI receptionists, or human-like sales agents, ROSE lets your applications listen, understand, and talk back, all through a natural, continuous speech pipeline. Our APIs and WebSocket infrastructure make it simple to:

Create and manage voice agents
Connect them to telephony systems or web apps
Enable real-time bi-directional audio streaming
Transcribe and analyze your calls instantly

Quick Start

Get started in just a few minutes with our three-step process.

Start here

Follow our Quickstart Guide to create your first Speech-to-Speech agent.

What You Can Build

With AIVOCO’s Speech-to-Speech platform, you can bring real-time conversational AI to your business or app.

Voice AI Sales Agents

Build voice-based agents that can make outbound calls, qualify leads, and close deals autonomously.

Telecalling Agents

Automate customer support or follow-up calls with natural-sounding speech-to-speech interactions.

In-App Voice Assistants

Add human-like conversational capabilities inside your product or website with our WebSocket API.

Real-Time Transcription

Get instant transcripts and conversation analytics from live or recorded calls.

How It Works

The AIVOCO Speech-to-Speech flow is designed to be modular and real-time:

Create an Agent: Define your agent’s name, voice, and system behavior via the /agents API.
Connect via WebSocket: Use our secure endpoint to establish a live speech stream between your app or telephony system and ROSE.
Integrate Telephony: Use the WebSocket endpoint wss://call.aivoco.on.cloud.vispark.in/ws/{api_key}/{agent_id} to connect with Twilio, Exotel, or any SIP trunking provider.
Transcribe Conversations: Retrieve full transcripts and insights from the Transcription endpoint using your call ID.

These steps are covered in detail in the Quickstart Guide.
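
As a minimal sketch of the WebSocket step above, the snippet below fills in the URL template from this page for a given key and agent. The function name build_ws_url and the sample values are illustrative assumptions; only the URL pattern comes from this page, and percent-encoding the two path segments is a defensive choice rather than a documented requirement.

```python
from urllib.parse import quote

# URL pattern quoted from this page; {api_key} and {agent_id} are path segments.
WS_URL_TEMPLATE = "wss://call.aivoco.on.cloud.vispark.in/ws/{api_key}/{agent_id}"


def build_ws_url(api_key: str, agent_id: str) -> str:
    """Return the WebSocket URL for one agent (hypothetical helper)."""
    if not api_key or not agent_id:
        raise ValueError("api_key and agent_id must both be non-empty")
    # Percent-encode each segment so characters like "/" cannot change the path.
    return WS_URL_TEMPLATE.format(
        api_key=quote(api_key, safe=""),
        agent_id=quote(agent_id, safe=""),
    )


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    print(build_ws_url("YOUR_API_KEY", "YOUR_AGENT_ID"))
```

Any WebSocket client can then open the resulting URL. The audio framing and message format are not described on this page, so refer to the Quickstart Guide for those details.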

Authentication

Every API request requires authentication via an API key. To generate your key:

Log in to your AIVOCO Playground
Go to your dashboard → API Keys
Copy your key and include it in the X-API-Key header or WebSocket path

Example:

X-API-Key: YOUR_API_KEY
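
For illustration, here is one way to attach the header using only Python's standard library. The base URL https://api.example.com is a placeholder because this page does not give the REST host, and build_agents_request is a hypothetical helper name; the /agents path and the X-API-Key header are the only parts taken from this page. The request is built but not sent.

```python
import urllib.request

# Placeholder host: this page does not state the REST base URL.
BASE_URL = "https://api.example.com"


def build_agents_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a request to /agents carrying the API key."""
    if not api_key:
        raise ValueError("api_key must be non-empty")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/agents",
        headers={"X-API-Key": api_key},
    )


if __name__ == "__main__":
    req = build_agents_request("YOUR_API_KEY")
    # urllib stores header names capitalized as "X-api-key"; HTTP header
    # names are case-insensitive, so the server sees the same header.
    print(req.full_url, req.header_items())
```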

Supported Integrations

AIVOCO supports multiple channels for connecting your agents:

Telephony (Twilio, Exotel, Telnyx)

Connect your ROSE agent to real phone calls using standard telephony providers.

Web Applications

Stream voice interactions directly in your browser or app using WebSocket APIs.

Custom Integrations

Extend your agents with function calling and real-world data (e.g., weather, CRM updates).

Analytics and Transcription

Convert any voice interaction into structured data using our transcription service.

Need Help?

If your telephony provider or integration type isn’t available, we’re happy to help. Contact us: vansh@aivoco.com
