API Reference
All endpoints for the Chamade API. Authenticate with your API key via the X-API-Key header.
Authentication
All API calls require the X-API-Key header with your API key:
API keys are created in your Dashboard. They start with chmd_ and are shown only once at creation. Store them securely.
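As a minimal sketch of the header described above, the helper below builds (but does not send) an authenticated request with Python's standard library. The base URL is a placeholder assumption, and `build_request` is a hypothetical helper name, not part of the Chamade API.

```python
import urllib.request

# Assumption: placeholder host. Substitute the real Chamade API host.
BASE_URL = "https://chamade.example"


def build_request(path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Return a Request carrying the X-API-Key header. Nothing is sent yet."""
    if not api_key.startswith("chmd_"):
        raise ValueError("Chamade API keys start with 'chmd_'")
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"X-API-Key": api_key},
    )


req = build_request("/api/calls", "chmd_example_key")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Read the key from an environment variable or a secret store rather than hard-coding it.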
Create Call
POST /api/call
Join a meeting or initiate a call. Returns the call ID and initial state.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| platform | string | Yes | One of: discord, teams, meet, zoom, telegram, slack, nctalk, sip, whatsapp |
| meeting_url | string | Depends | Meeting URL or SIP URI. Required for most platforms. |
| agent_name | string | No | Display name for the agent in the meeting. Default: "AI Agent" |
| transcripts | bool | No | Opt in to hosted server-side STT, which emits call_transcript events. Currently beta-gated and likely to fail outside the supervised beta program. Default: false. In BYO audio mode (the default), Chamade runs no STT; your agent runs its own STT client against the raw PCM stream from the call WebSocket. |
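
The fields in the table above can be assembled as in the following sketch. `create_call_body` is a hypothetical helper; it only serializes the JSON body and checks the platform against the documented list. Optional fields are omitted when unset so the server-side defaults apply.

```python
import json
from typing import Optional

# Platform names copied from the table above.
PLATFORMS = {
    "discord", "teams", "meet", "zoom", "telegram",
    "slack", "nctalk", "sip", "whatsapp",
}


def create_call_body(
    platform: str,
    meeting_url: Optional[str] = None,
    agent_name: Optional[str] = None,
    transcripts: bool = False,
) -> str:
    """Serialize a POST /api/call body, leaving out unset optional fields."""
    if platform not in PLATFORMS:
        raise ValueError(f"unknown platform: {platform!r}")
    body = {"platform": platform}
    if meeting_url is not None:
        body["meeting_url"] = meeting_url
    if agent_name is not None:
        body["agent_name"] = agent_name
    if transcripts:
        body["transcripts"] = True  # hosted STT is beta-gated
    return json.dumps(body)


print(create_call_body("meet", meeting_url="https://meet.example/abc-defg-hij"))
```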
Response:
Get Call Status
GET /api/call/{call_id}?since=0
Get the current state of a call, including transcript lines. The since parameter enables a delta pattern: only transcript lines after that index are returned, so you can poll efficiently without re-reading the entire transcript.
Response:
On the first call, use since=0 to get all transcript lines. Then pass since={transcript_length} from the response to get only new lines on subsequent polls.
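The delta pattern can be sketched as a small generator. The `fetch` callable is injected and stands for whatever performs `GET /api/call/{call_id}?since=<n>` and decodes the JSON. The `transcript_length` field comes from the text above; the `transcript` key holding the new lines is an assumption, since the response sample is not shown here.

```python
from typing import Callable, Iterator


def poll_new_lines(fetch: Callable[[int], dict], max_polls: int) -> Iterator[object]:
    """Yield each transcript line once, advancing `since` after every poll."""
    since = 0  # first poll asks for everything
    for _ in range(max_polls):
        status = fetch(since)
        # Assumption: new lines are returned under a "transcript" key.
        yield from status.get("transcript", [])
        # Documented: pass transcript_length back as the next `since`.
        since = status["transcript_length"]
```

In a real client the loop would also sleep between polls and stop once the call state is "ended".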
Speak (hosted TTS) — [BETA, gated]
POST /api/call/{call_id}/say
Hosted TTS is in active development and likely to fail on accounts outside the supervised beta program. For production, prefer BYO audio: connect your own TTS (OpenAI Realtime, ElevenLabs, Cartesia, Deepgram, etc.) to the call's audio WebSocket and stream PCM directly. Contact [email protected] to request beta access to hosted TTS.
Speak text aloud in the meeting via Chamade's hosted text-to-speech engine. Only meaningful on platforms with the audio_out capability.
Request body:
Send Chat
POST /api/call/{call_id}/chat
Send a text chat message in the meeting. Works on platforms with the write capability.
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Message text (1–10,000 characters). |
| sender_name | string | No | Display name for the sender in meeting chat cards. Default: empty. |
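
A client-side length check avoids a round trip for messages the API would reject. The sketch below uses a hypothetical `chat_body` helper and assumes that the limit counts Unicode code points, which is what Python's `len` measures; the server may count differently.

```python
import json

MAX_CHAT_CHARS = 10_000  # upper bound from the table above


def chat_body(text: str, sender_name: str = "") -> str:
    """Serialize a POST /api/call/{call_id}/chat body after a length check."""
    if not 1 <= len(text) <= MAX_CHAT_CHARS:
        raise ValueError("text must be 1 to 10,000 characters long")
    body = {"text": text}
    if sender_name:  # empty is the documented default, so omit it
        body["sender_name"] = sender_name
    return json.dumps(body)
```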
Accept Inbound Call
POST /api/call/{call_id}/accept
Accept a ringing inbound call (SIP, etc.). Changes the call state from "ringing" to "active".
Refuse Inbound Call
POST /api/call/{call_id}/refuse
Refuse/reject a ringing inbound call. Changes the call state from "ringing" to "refused".
Hang Up
DELETE /api/call/{call_id}
End the call and leave the meeting. The call state changes to "ended".
List Active Calls
GET /api/calls
Returns a list of all active calls for the authenticated user.
Typing Indicator
POST /api/call/{call_id}/typing
Send a typing indicator in the meeting chat. Supported on platforms with the typing capability.
WebSocket Stream
WS /api/call/{call_id}/stream?api_key=chmd_...
Real-time bidirectional stream for receiving transcripts and chat, and for sending speech and chat messages. This is an alternative to polling GET /api/call/{call_id} for real-time applications.
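Because the key travels in the query string here rather than in a header, it is worth building the URL with proper escaping. A minimal sketch, assuming a placeholder `wss://` host and a hypothetical `stream_url` helper:

```python
from urllib.parse import quote, urlencode

# Assumption: placeholder host. Substitute the real Chamade host.
WS_BASE = "wss://chamade.example"


def stream_url(call_id: str, api_key: str) -> str:
    """Build the WebSocket URL for a call, escaping both variable parts."""
    path = f"/api/call/{quote(call_id, safe='')}/stream"
    return f"{WS_BASE}{path}?{urlencode({'api_key': api_key})}"


print(stream_url("abc123", "chmd_example_key"))
```

URLs with credentials in the query string tend to end up in logs, so avoid printing the real value outside of local debugging.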
Incoming messages (server → client):
Outgoing messages (client → server):
Direct Audio Streaming
The same WebSocket also supports sending and receiving raw audio, bypassing Chamade's TTS/STT. This is useful for agents that bring their own voice models — ElevenLabs streaming, OpenAI Realtime, Cartesia Sonic, Deepgram, etc.
Direct audio streaming is open to every plan. It is a true passthrough: Chamade does not run TTS or STT when you use it. The transcripts field of the POST /api/call body defaults to false; pass transcripts: false explicitly to make the opt-out from server-side STT clear (recommended when you bring your own STT). With transcripts: false and direct audio injection, Chamade incurs no per-minute TTS or STT cost, so you can run real-time voice for free; just bring your own voice models.
The audio object in the POST /api/call response tells you the native sample rate of the room. Sending audio at that rate avoids resampling and gives the best quality.
| Platform | Native rate |
|---|---|
| Discord, Meet, NC Talk | 48,000 Hz |
| Teams, Zoom, Telegram | 16,000 Hz |
| SIP | 8,000 Hz |
Format: raw PCM, signed 16-bit little-endian, mono, at the rate above (or your declared rate via audio_config). Internal frame cadence is 20 ms.
Option A — Binary frames (recommended): Send raw PCM s16le mono as binary WebSocket frames. Zero overhead, no JSON encoding. Each frame may contain any number of samples (Chamade buffers and re-aligns to 20 ms internally). Hard cap: 1 MB per frame (~10 s of audio at 48 kHz).
Option B — Base64 JSON: Compatible with OpenAI Realtime and similar JSON-based streaming patterns. Slightly higher overhead due to base64 encoding + JSON parsing.
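For Option A, the arithmetic behind frame sizes can be sketched as follows: one 20 ms frame of s16le mono audio is `sample_rate * 0.020 * 2` bytes. The helper names are hypothetical, and batching five frames (100 ms) per WebSocket message is an arbitrary illustrative choice, since the text above says any number of samples per message is accepted.

```python
from typing import List

BYTES_PER_SAMPLE = 2            # signed 16-bit, mono
FRAME_MS = 20                   # internal cadence stated above
MAX_MESSAGE_BYTES = 1_000_000   # conservative reading of the 1 MB cap


def frame_bytes(sample_rate: int) -> int:
    """Size in bytes of one 20 ms frame at the given sample rate."""
    return sample_rate * FRAME_MS // 1000 * BYTES_PER_SAMPLE


def split_pcm(pcm: bytes, sample_rate: int, frames_per_message: int = 5) -> List[bytes]:
    """Split a PCM buffer into binary messages of whole 20 ms frames."""
    if len(pcm) % BYTES_PER_SAMPLE:
        raise ValueError("PCM length must be a whole number of 16-bit samples")
    size = frame_bytes(sample_rate) * frames_per_message
    if not 0 < size <= MAX_MESSAGE_BYTES:
        raise ValueError("message size outside the allowed range")
    # The last message may be shorter; Chamade re-aligns to 20 ms internally.
    return [pcm[i:i + size] for i in range(0, len(pcm), size)]


for rate in (48_000, 16_000, 8_000):
    print(rate, "Hz:", frame_bytes(rate), "bytes per 20 ms frame")
```

Each element of the returned list would be sent as one binary WebSocket frame.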
Audio config (optional): Send this to discover the native sample rate, declare your own rate (Chamade will resample), or opt into receiving the meeting’s audio mix.
Send binary frames at the native sample rate (from the audio field in the call response). No config message needed, no resampling, lowest latency.
Receiving the meeting audio (mix-minus)
If you want your own STT (Whisper, Deepgram, etc.) instead of Chamade's, opt into receiving the meeting audio with audio_config + "receive": true.
- You receive raw PCM s16le mono as binary WebSocket frames, at your declared sample_rate (Chamade resamples for you), at a 20 ms cadence (~50 frames/second).
- It's mix-minus: the audio you injected via Option A/B is excluded from the return stream, so you don't hear your own echo, only the other participants.
- The text WebSocket continues to deliver call_transcript, call_chat, and call_state events as JSON text frames in parallel, so you can mix Chamade's STT and your own.
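
Most STT engines prefer chunks longer than 20 ms, so a receiver typically buffers the incoming binary frames. The class below is a hypothetical sketch of that buffering; the 500 ms chunk length is an arbitrary example value, not a Chamade requirement.

```python
from typing import List


class PcmAccumulator:
    """Collect small s16le mono frames and release fixed-duration chunks."""

    def __init__(self, sample_rate: int, chunk_ms: int = 500) -> None:
        # 2 bytes per sample (signed 16-bit, mono).
        self.chunk_bytes = sample_rate * chunk_ms // 1000 * 2
        self._buffer = bytearray()

    def push(self, frame: bytes) -> List[bytes]:
        """Add one received frame; return any chunks that are now complete."""
        self._buffer.extend(frame)
        chunks = []
        while len(self._buffer) >= self.chunk_bytes:
            chunks.append(bytes(self._buffer[:self.chunk_bytes]))
            del self._buffer[:self.chunk_bytes]
        return chunks
```

At a declared rate of 16,000 Hz each 20 ms frame is 640 bytes, so a 500 ms chunk is released on every 25th frame.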
Handling disconnects
The bridge between Chamade and the meeting platform can drop (Maquisard restart, network blip, platform-side disconnect). When that happens, all subscribed agent WebSockets receive:
To recover, call POST /api/call again with the same meeting_url. A new call_id is issued and a new WebSocket can be opened. Previous transcript lines are not carried over — the room is fresh.
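The recovery step can be sketched with an injected `create_call` callable that performs `POST /api/call` and returns the decoded JSON. The retry count, the linear back-off, and the `call_id` response key are assumptions for illustration; only "call POST /api/call again with the same meeting_url" comes from the text above.

```python
import time
from typing import Callable


def rejoin(
    create_call: Callable[[str, str], dict],
    platform: str,
    meeting_url: str,
    attempts: int = 3,
    delay_s: float = 2.0,
) -> str:
    """Re-create a dropped call and return the new call ID."""
    last_error = None
    for attempt in range(attempts):
        try:
            response = create_call(platform, meeting_url)
            # Assumption: the new ID is returned under "call_id".
            return response["call_id"]
        except OSError as exc:  # covers ConnectionError and timeouts
            last_error = exc
            time.sleep(delay_s * (attempt + 1))
    raise RuntimeError(f"could not rejoin after {attempts} attempts") from last_error
```

After a successful rejoin, open a new WebSocket for the new ID and reset any stored `since` index to 0, because the transcript starts over.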
Python example
Minimal client that opens the WebSocket, declares 24 kHz, sends a chunk of PCM (your TTS output), and receives the meeting audio for a custom STT pipeline.
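A sketch of such a client, using the third-party `websockets` package, is shown below. It is an illustration under stated assumptions, not the official sample: the `"type": "audio_config"` message shape is inferred from the field names mentioned above (`audio_config`, `sample_rate`, `receive`), the host is a placeholder, and the key and call ID must be filled in.

```python
import asyncio
import json
from typing import Callable

API_KEY = "chmd_..."   # your API key
CALL_ID = "..."        # from the POST /api/call response
# Assumption: placeholder host.
WS_URL = f"wss://chamade.example/api/call/{CALL_ID}/stream?api_key={API_KEY}"


def audio_config(sample_rate: int, receive: bool) -> str:
    """Build the config message. The "type" key is an assumed discriminator."""
    return json.dumps(
        {"type": "audio_config", "sample_rate": sample_rate, "receive": receive}
    )


async def run(
    url: str,
    tts_pcm: bytes,
    on_audio: Callable[[bytes], None],
    on_event: Callable[[dict], None],
) -> None:
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:
        # Declare 24 kHz and opt in to the mix-minus return audio.
        await ws.send(audio_config(24_000, receive=True))
        # A bytes payload is sent as a binary frame (Option A).
        await ws.send(tts_pcm)
        async for frame in ws:
            if isinstance(frame, bytes):
                on_audio(frame)              # raw PCM for your STT
            else:
                on_event(json.loads(frame))  # JSON text events


def main() -> None:
    silence = bytes(24_000 * 2 // 5)  # 200 ms of silence at 24 kHz
    asyncio.run(run(WS_URL, silence, lambda pcm: None, print))
```

Calling `main()` starts the client. In practice `tts_pcm` would come from your TTS engine and `on_audio` would feed your STT pipeline.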
The @chamade/mcp-server npm package exposes text-only tools (chamade_call_say, chamade_call_chat, etc.) and a transcript resource. Direct audio I/O is not exposed via MCP — the stdio JSON-RPC transport doesn't reasonably support raw binary streaming. To stream your own PCM, your agent must connect to this WebSocket directly (alongside or instead of the MCP server).
Inbox API (DMs)
The Inbox API lets your agent read and respond to direct messages from platforms like Discord, Telegram, WhatsApp, Teams, and Slack.
List Conversations
GET /api/inbox?platform=discord&limit=50
Returns a list of DM conversations. Filter by platform and limit results.
| Param | Type | Required | Description |
|---|---|---|---|
| platform | string | No | Filter by platform (discord, telegram, teams, whatsapp, slack, nctalk) |
| limit | int | No | Max conversations to return (default 50) |
| status | string | No | Filter by status (default “active”) |
| offset | int | No | Pagination offset (default 0) |
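
The query parameters above can be combined as in this sketch. `inbox_path` is a hypothetical helper that returns only the path and query string; paging through results by adding `limit` to `offset` is an assumption about how the offset is meant to be used.

```python
from typing import Optional
from urllib.parse import urlencode


def inbox_path(
    platform: Optional[str] = None,
    limit: int = 50,
    status: str = "active",
    offset: int = 0,
) -> str:
    """Build the GET /api/inbox path with an escaped query string."""
    params = {"limit": limit, "status": status, "offset": offset}
    if platform is not None:
        params = {"platform": platform, **params}
    return "/api/inbox?" + urlencode(params)


# First two pages of Discord conversations, 50 per page.
for page in range(2):
    print(inbox_path("discord", offset=page * 50))
```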
Get Conversation
GET /api/inbox/{conversation_id}?limit=50
Get messages from a specific conversation. Use limit to control how many messages are returned.
Send Message (by platform)
POST /api/dm/chat
Send a message to the active DM conversation on a platform. This is the recommended endpoint for agents.
Request body:
Typing Indicator (by platform)
POST /api/dm/typing
Send a typing indicator on a DM conversation.
Request body:
When you call POST /api/dm/typing, the typing indicator stays active automatically (Chamade repeats it internally). It stops when you send a message via POST /api/dm/chat, or after a 65-second safety timeout.
Reply Timeout (60 seconds)
After a user sends a DM, you have 60 seconds to reply via POST /api/dm/chat. If no reply is sent in time, the user sees a timeout error message.
If your task takes longer than a few seconds:
- Send a short acknowledgment immediately via POST /api/dm/chat with "keep_typing": true (e.g. “Looking into it...”). The keep_typing flag re-activates the typing indicator automatically after sending.
- Do your work, then send the full answer via POST /api/dm/chat (without keep_typing).
- If your work takes more than 60 seconds, call POST /api/dm/typing every ~55 seconds to keep the indicator alive.
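
The three steps above can be sketched with an injected `post(path, body)` callable that performs the HTTP request. The `keep_typing` flag and the 55-second refresh come from the text; the `platform` and `text` body fields are assumptions, since the request body samples are not shown here.

```python
import threading
from typing import Callable


def reply_with_ack(
    post: Callable[[str, dict], None],
    platform: str,
    work: Callable[[], str],
    refresh_s: float = 55.0,
) -> None:
    """Acknowledge at once, keep typing alive during `work`, then answer."""
    # Step 1: quick acknowledgment, keeping the typing indicator on.
    post("/api/dm/chat",
         {"platform": platform, "text": "Looking into it...", "keep_typing": True})

    done = threading.Event()

    def keep_alive() -> None:
        # Event.wait returns False on timeout and True once `done` is set.
        while not done.wait(refresh_s):
            post("/api/dm/typing", {"platform": platform})

    refresher = threading.Thread(target=keep_alive, daemon=True)
    refresher.start()
    try:
        answer = work()  # Step 2: the slow part
    finally:
        done.set()
        refresher.join()

    # Final answer without keep_typing, which stops the indicator.
    post("/api/dm/chat", {"platform": platform, "text": answer})
```

If `work` raises, the refresher thread is stopped and the exception propagates without a final message, so callers may want to send an error reply of their own.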
Account Status
GET /api/account
Bootstrap call. Returns:
- plan
- features: global gateway availability. audio_in, audio_out, text_chat, typing_indicators, and files are ready in early access; hosted_stt and hosted_tts are beta_gated.
- platforms: a dict of {status, capabilities} per platform
- concurrent_calls
- the identities map
- last_message_cursor, used to bootstrap chamade_inbox delta mode
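
Since several endpoints only work on platforms with a given capability (audio_out, write, typing), the account response can be used to check before calling them. The sketch below assumes that each platform's `capabilities` value is a list of strings; the sample dictionary is made up for illustration and is not a real response.

```python
from typing import List


def platforms_supporting(account: dict, capability: str) -> List[str]:
    """Return the names of platforms that list the given capability."""
    return sorted(
        name
        for name, info in account.get("platforms", {}).items()
        # Assumption: "capabilities" is a list of capability names.
        if capability in info.get("capabilities", [])
    )


# Made-up sample shaped like the {status, capabilities} description above.
sample_account = {
    "platforms": {
        "discord": {"status": "<status>", "capabilities": ["write", "typing", "audio_out"]},
        "slack": {"status": "<status>", "capabilities": ["write"]},
    }
}
print(platforms_supporting(sample_account, "typing"))
```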
