Agents: configuring voice agents end to end
How to configure a Call2Me voice agent — system prompts, LLM model, voice, knowledge base, post-call extraction, transfer targets, recording, and webhooks.
Updated May 1, 2026
A voice agent on Call2Me is a single config object that owns: the system prompt, the LLM, the voice, the knowledge base, the rules for when to transfer or extract data, and the phone numbers it answers on. This page walks every section.
System prompt
The system prompt is the rules the agent follows on every call. It's the single highest-leverage thing to spend time on.
Good prompts are:
- Specific to your business. "You are Aysel, the receptionist at Acme Dental in Kadıköy" beats "You are a helpful assistant."
- Short. Keep it under 800 words; longer prompts slow the LLM down and make it more likely to miss instructions.
- Explicit about escalation. Define exactly when to transfer to a human and to whom.
- Style-aware. Match how you'd want a human employee to sound.
A working template:
You are [Name], the [role] at [Business].
Your job:
- [primary task 1]
- [primary task 2]
- Forward urgent issues to [number].
Style:
- Speak naturally, 2-sentence answers max.
- Confirm details back at the end of each request.
- Never say "as an AI" — you're [Name].
Use ONLY the information from the knowledge base when answering
specific questions. If you don't know, say "I'll check and get back
to you" — never guess.
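As a minimal sketch of how the template and the 800-word guideline could be applied together, the snippet below fills in the opening part of the template and rejects prompts that run long. The function name `build_prompt`, the placeholder names, and the idea of checking length in code are all assumptions made for illustration; Call2Me itself only asks for the final prompt text.

```python
from string import Template

MAX_WORDS = 800  # the length ceiling recommended above

# Opening part of the template above; the $placeholder names are illustrative.
PROMPT_TEMPLATE = Template(
    "You are $name, the $role at $business.\n"
    "Your job:\n"
    "$tasks\n"
    "- Forward urgent issues to $urgent_number.\n"
)


def build_prompt(name, role, business, tasks, urgent_number):
    """Fill the template and enforce the word limit (hypothetical helper)."""
    task_lines = "\n".join(f"- {task}" for task in tasks)
    prompt = PROMPT_TEMPLATE.substitute(
        name=name,
        role=role,
        business=business,
        tasks=task_lines,
        urgent_number=urgent_number,
    )
    word_count = len(prompt.split())
    if word_count > MAX_WORDS:
        raise ValueError(f"prompt is {word_count} words; keep it under {MAX_WORDS}")
    return prompt
```

The Style and knowledge-base paragraphs from the template can be appended to the returned string unchanged before pasting it into the agent.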
LLM model
openrouter/auto is the default and routes to the best-available model in real time. For specific behaviour:
- Reliability + grounding → openai/gpt-4o or anthropic/claude-3.5-sonnet.
- Lowest latency → openai/gpt-4o-mini or anthropic/claude-3-haiku.
- Cost-sensitive bulk outbound → openai/gpt-4o-mini, meta/llama-3.1-70b.
You can change the model any time on AgentDetail. No re-deploy needed.
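If you manage several agents, the recommendations above could be kept as a small lookup, roughly as sketched here. The model identifiers are the ones listed on this page; the priority labels and the `pick_model` helper are hypothetical and not part of any Call2Me API.

```python
DEFAULT_MODEL = "openrouter/auto"

# Priority labels are illustrative; model names are copied from the list above.
MODELS_BY_PRIORITY = {
    "reliability": ["openai/gpt-4o", "anthropic/claude-3.5-sonnet"],
    "latency": ["openai/gpt-4o-mini", "anthropic/claude-3-haiku"],
    "cost": ["openai/gpt-4o-mini", "meta/llama-3.1-70b"],
}


def pick_model(priority=None):
    """Return the first suggested model for a priority, else the default."""
    candidates = MODELS_BY_PRIORITY.get(priority)
    return candidates[0] if candidates else DEFAULT_MODEL
```

Falling back to openrouter/auto for an unknown priority mirrors the platform default described above.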
Voice
Open the voice gallery and filter by language. Every voice has a Preview button.
For multilingual agents, pick from the Multilingual filter:
- ElevenLabs Multilingual v2 — most natural across all 9 languages.
- ElevenLabs Turbo Multilingual — faster, lower latency.
- Cartesia Sonic Multilingual — cheapest, surprisingly good in Turkish and Arabic.
- OpenAI TTS Multilingual — solid, included in base voice price.
For single-language agents, the per-language voices have richer prosody.
Knowledge base
Attach a knowledge base so the agent can answer specific questions about your business. Supported source types:
- PDF — menus, FAQs, manuals.
- DOCX, TXT, Markdown — same idea, different format.
- URL crawl — paste a public URL, the crawler indexes the page.
The agent retrieves the most relevant chunks at speech time and grounds its answer in actual content. See Knowledge base docs for the full setup.
Post-call data extraction
Define structured fields the agent should fill in after each call:
- name: customer_name
  type: string
  prompt: "What was the customer's name?"
- name: interested
  type: boolean
  prompt: "Did they express interest?"
- name: callback_time
  type: string
  prompt: "If they asked for a callback, when?"
- name: notes
  type: string
  prompt: "Anything else worth saving?"
After each call ends, an LLM reads the transcript and fills these in. They appear on the call record as JSON, exportable to CSV.
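Once the extracted JSON reaches your own systems, it can be worth checking it against the field definitions before loading it anywhere. The sketch below assumes each call record arrives as a flat JSON object keyed by field name, and that unanswered fields are missing or null; both are assumptions, so adjust to the payload you actually receive. `validate_record` and `to_csv` are hypothetical helper names.

```python
import csv
import io

# Field names and types taken from the example definition above.
FIELDS = {
    "customer_name": str,
    "interested": bool,
    "callback_time": str,
    "notes": str,
}


def validate_record(record):
    """Return a list of type problems; an empty list means the record is fine."""
    problems = []
    for name, expected in FIELDS.items():
        value = record.get(name)
        if value is None:
            continue  # assumption: unanswered fields are missing or null
        if not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


def to_csv(records):
    """Serialize records to CSV text with one column per defined field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELDS), extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Keys that are not in the field definition are dropped from the CSV rather than raising, which keeps the export stable if you later add fields.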
Transfer targets
Add phone numbers or SIP endpoints the agent can transfer to. The system prompt tells it when to use them:
ESCALATION:
- For urgent medical questions, immediately transfer to +90 532 XXX XX XX.
- For pricing questions outside your knowledge, transfer to +90 532 YYY YY YY.
- For everything else, handle yourself.
You can also transfer to other Call2Me agents (agent-to-agent). Useful for handoffs between language-specific agents or department-specific specialists.
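One way to keep the escalation rules and the configured targets in sync is to generate the ESCALATION section of the prompt from a single list. This is only a sketch: the `TransferRule` structure, the `kind` labels, and the `escalation_block` helper are hypothetical, and the placeholder numbers are left exactly as they appear above.

```python
from dataclasses import dataclass

ALLOWED_KINDS = {"phone", "sip", "agent"}  # the three target types described above


@dataclass(frozen=True)
class TransferRule:
    condition: str  # when to transfer, in plain language
    kind: str       # "phone", "sip", or "agent"
    target: str     # phone number, SIP endpoint, or agent name


def escalation_block(rules):
    """Render the ESCALATION section of a system prompt from transfer rules."""
    lines = ["ESCALATION:"]
    for rule in rules:
        if rule.kind not in ALLOWED_KINDS:
            raise ValueError(f"unknown transfer kind: {rule.kind}")
        lines.append(f"- For {rule.condition}, transfer to {rule.target}.")
    lines.append("- For everything else, handle yourself.")
    return "\n".join(lines)
```

The rendered text still has to be pasted into the system prompt, and each target still has to be added on the agent itself.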
Webhooks
POST every event you care about to your endpoint. Common patterns:
- On every captured booking — webhook fires with the extracted fields.
- On call end — full transcript + extracted data.
- On unhandled escalation — your support team gets pinged immediately.
Configure the webhook URL on AgentDetail. Payload format is JSON.
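A receiving endpoint can be very small. The sketch below uses only the Python standard library and routes on an `event` field. The payload schema is not documented on this page, so the key name `event` and the event names `booking.captured`, `call.ended`, and `escalation.unhandled` are placeholders invented for illustration; replace them with whatever your payloads actually contain.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route_event(payload):
    """Map one webhook payload to an action label (event names are assumed)."""
    event = payload.get("event")
    if event == "booking.captured":
        return "save_booking"
    if event == "call.ended":
        return "archive_transcript"
    if event == "escalation.unhandled":
        return "page_support"
    return "ignore"


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            payload = None
        if not isinstance(payload, dict):
            self.send_response(400)  # reject bodies that are not a JSON object
            self.end_headers()
            return
        action = route_event(payload)
        self.log_message("event=%s action=%s", payload.get("event"), action)
        self.send_response(200)  # acknowledge quickly; do slow work elsewhere
        self.end_headers()


def serve(port=8080):
    """Start a blocking HTTP server on the given port."""
    HTTPServer(("", port), WebhookHandler).serve_forever()
```

Calling `serve()` starts the listener. In production you would put this behind HTTPS and verify that requests really come from Call2Me, using whatever mechanism the platform provides.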
Recording
Toggle recording on per agent. Cost: +$0.05/minute.
Recordings appear in the Calls tab as inline-playable MP3. They're stored in our buckets — you can pull them via API for archival.
Privacy note: if you turn recording on, disclose it in the agent's opening line ("This call may be recorded for quality"). Some jurisdictions require explicit consent.
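To budget for recording, multiply recorded minutes by the $0.05 surcharge. The sketch below assumes the surcharge applies to total recorded minutes without rounding each call up; this page does not say how partial minutes are billed, so treat the result as an estimate. `recording_surcharge` is a hypothetical helper name.

```python
from decimal import Decimal

RECORDING_RATE = Decimal("0.05")  # USD per recorded minute, from the price above


def recording_surcharge(total_minutes):
    """Estimated recording cost in USD for a number of recorded minutes."""
    return RECORDING_RATE * Decimal(str(total_minutes))


# 1,000 calls averaging 3 minutes each is 3,000 recorded minutes.
monthly_estimate = recording_surcharge(1000 * 3)  # Decimal("150.00")
```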
Voicemail detection
On by default. When the agent detects voicemail (the typical "leave a message after the beep" pattern), you have three options:
- Hang up — no message left.
- Leave a scripted message — define what the agent says.
- Continue as if it were a human — risky; only use if your script is short.
Configure on AgentDetail under Voicemail.
Welcome message
The first thing the agent says when it picks up. Default is generated from the system prompt; you can override it for a sharper opener:
"Hi, you've reached [Business]. This is [Name] — how can I help?"
A specific opener gets better engagement than "Hello, how may I assist you today?"
DTMF input
Agents can accept keypad input ("Press 1 for sales, 2 for support"). Define the menu in the system prompt; the runtime handles the DTMF parsing.
Useful for: language selection, account number entry, IVR-style routing on top of the voice agent.
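Because the menu lives in the system prompt, it can help to generate that text from a mapping so the keys and actions stay consistent across agents. The helper below is a hypothetical sketch: the "KEYPAD MENU" heading and the sentence wording are illustrative, not a format the runtime requires.

```python
VALID_KEYS = set("0123456789*#")  # keys on a standard telephone keypad


def dtmf_menu_prompt(menu):
    """Render a keypad menu as a system-prompt section (hypothetical helper)."""
    lines = ["KEYPAD MENU:"]
    for key, action in menu.items():
        if key not in VALID_KEYS:
            raise ValueError(f"not a single keypad key: {key!r}")
        lines.append(f"- If the caller presses {key}: {action}")
    return "\n".join(lines)
```

For example, `dtmf_menu_prompt({"1": "route to sales", "2": "route to support"})` yields a three-line block that can be appended to the system prompt.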
Frequently asked
Q. What language models can I use for my voice agent?
All models routed through OpenRouter: GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-4, GPT-3.5-turbo, o3-mini, Claude 3 Opus/Sonnet/Haiku, Claude 3.5 Sonnet/Haiku, Gemini 1.5 Pro/Flash, Llama 3.1 8B/70B, Mistral 7B, Mixtral 8x7B. Default is openrouter/auto for best-available routing.
Q. What voices are available?
ElevenLabs Multilingual v2 + Turbo, Cartesia Sonic Multilingual, OpenAI TTS Multilingual, plus per-language single-voice options. Filter the voice gallery by language and preview each one.
Q. Can the agent transfer calls to a human?
Yes. Add transfer targets on AgentDetail — phone numbers, SIP endpoints, or other Call2Me agents. The system prompt tells the agent when to transfer.
Q. Does Call2Me record calls?
Optionally. Toggle recording on per agent. Cost is +$0.05/minute. Recordings appear in the Calls tab as inline-playable audio.