LiveKit & WebRTC: voice calls in the browser

Run a voice agent inside any web app, no phone number needed. Get a LiveKit token, embed the SDK, and talk to the agent from a browser tab with roughly 200ms of typical latency.

Updated May 6, 2026

Browser voice is the third channel after phone and chat. It's a WebRTC session — your user clicks a button on your site, their browser talks to the agent in real time, no phone number, no telephony charges.

How it works

  1. Your backend asks Call2Me for a LiveKit token, scoped to one user and one agent
  2. Your frontend uses the LiveKit JS SDK to join the room
  3. The agent joins the same room
  4. Both sides speak; the platform handles STT/LLM/TTS in between

The token controls who can join and what room they land in.

Getting a token

curl -X POST https://api.call2me.app/v1/livekit/token \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "agent_id": "agent_abc123",
    "user_identity": "user_4521",
    "metadata": { "user_email": "alice@example.com" }
  }'

Returns:

{
  "url": "wss://livekit.call2me.app",
  "token": "eyJ...",
  "room_name": "agent_abc123_user_4521_session_xyz"
}

Hand url and token to your frontend. Don't expose your Call2Me API key to the browser — issue tokens from your backend.
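As a rough sketch of the backend side, the curl call above could be wrapped in a small server-side helper like the one below. The endpoint, headers, and body fields are taken from the curl example; the function name requestLiveKitToken, its parameter names, and the injectable fetchImpl argument are illustrative assumptions, not part of the Call2Me API.

```javascript
// Sketch: request a browser-voice token from Call2Me on the server side.
// Endpoint and body fields mirror the curl example; the function name and
// the fetchImpl parameter are hypothetical and exist only for illustration.
async function requestLiveKitToken(
  { apiKey, agentId, userIdentity, metadata = {} },
  fetchImpl = fetch
) {
  const response = await fetchImpl('https://api.call2me.app/v1/livekit/token', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      agent_id: agentId,
      user_identity: userIdentity,
      metadata,
    }),
  });

  if (!response.ok) {
    throw new Error(`Token request failed with HTTP ${response.status}`);
  }

  // Forward only what the browser needs; the API key stays on the server.
  const { url, token } = await response.json();
  return { url, token };
}
```

Your own route (for example the /your-backend/livekit-token path used in the frontend snippet) would call this helper after authenticating the user and return the result as JSON.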

Frontend embed

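<script type="module">
import { Room, RoomEvent } from 'https://esm.sh/livekit-client@latest';

// Ask your own backend for a token; the Call2Me API key never reaches the browser.
const r = await fetch('/your-backend/livekit-token', { method: 'POST' });
const { url, token } = await r.json();

const room = new Room();
// Play the agent's audio as soon as its track is subscribed.
room.on(RoomEvent.TrackSubscribed, (track) => {
  if (track.kind === 'audio') track.attach(document.getElementById('agent-audio'));
});

await room.connect(url, token);
await room.localParticipant.setMicrophoneEnabled(true);

// Module scope is not global, so an inline onclick cannot see functions defined here.
// Wire the button up from inside the module instead.
document.getElementById('end-call').addEventListener('click', () => room.disconnect());
</script>

<audio id="agent-audio" autoplay></audio>
<button id="end-call">End</button>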

That's the whole client. The SDK handles getUserMedia, the WebRTC handshake, and the audio routing.

Token lifetime

Tokens are short-lived — typically valid for 60 minutes from issue. For long sessions, your backend should reissue when the token nears expiry. The SDK fires a disconnected event with reason token-expired; your code reissues and reconnects.

For one-shot demos (like the "call me from my browser" button on a marketing page), 60 minutes is more than enough.
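For long sessions, one way to know when to reissue is to read the expiry out of the token itself. The sketch below assumes the token is a standard JWT with a numeric exp claim in seconds (the "eyJ..." prefix in the response suggests a JWT, but treat this as an assumption). The helper names and the five-minute refresh margin are hypothetical, not platform settings.

```javascript
// Decode a JWT payload without verifying the signature.
// That is enough for reading exp on the client; it is not an authenticity check.
function decodeJwtPayload(token) {
  const payloadPart = token.split('.')[1];
  if (!payloadPart) throw new Error('Not a JWT: missing payload segment');
  // JWTs use base64url; convert to standard base64 and restore padding.
  const base64 = payloadPart.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, '=');
  return JSON.parse(atob(padded));
}

// Milliseconds to wait before asking the backend for a fresh token.
// refreshMarginMs is an illustrative safety margin (default: 5 minutes).
function msUntilRefresh(token, nowMs = Date.now(), refreshMarginMs = 5 * 60 * 1000) {
  const { exp } = decodeJwtPayload(token);
  if (typeof exp !== 'number') throw new Error('Token has no numeric exp claim');
  return Math.max(0, exp * 1000 - nowMs - refreshMarginMs);
}
```

A client could pass the result to setTimeout and call its own token endpoint when the timer fires, so a fresh token is already on hand if a reconnect is needed.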

Receiving the transcript

Transcripts are published as LiveKit data messages in the same room:

room.on(RoomEvent.DataReceived, (payload, participant) => {
  const msg = JSON.parse(new TextDecoder().decode(payload));
  if (msg.type === 'transcript') {
    console.log(`${msg.speaker}: ${msg.text}`);
  }
});

This is how the dashboard's live-transcript view works. Use the same pattern in your own UI.
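When reusing this pattern in your own UI, it can help to isolate the parsing in a small function that tolerates payloads that are not transcripts or not JSON at all. The sketch below assumes the message shape shown above (type, speaker, text); the function name formatTranscript is illustrative.

```javascript
const decoder = new TextDecoder();

// Returns "speaker: text" for transcript messages, or null for anything else.
function formatTranscript(payload) {
  let msg;
  try {
    msg = JSON.parse(decoder.decode(payload));
  } catch {
    return null; // Not JSON, so it is some other kind of data message.
  }
  if (!msg || msg.type !== 'transcript') return null;
  return `${msg.speaker}: ${msg.text}`;
}

// Usage inside the DataReceived handler (appendToUi is a hypothetical UI helper):
//   room.on(RoomEvent.DataReceived, (payload) => {
//     const line = formatTranscript(payload);
//     if (line) appendToUi(line);
//   });
```

Returning null rather than throwing keeps one malformed message from breaking the live view.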

Comparing channels

                   Phone              Browser (LiveKit)   Chat
Transport          SIP / telephony    WebRTC              HTTPS
Audio              Yes                Yes                 No (text only)
Phone number       Required           None                None
Telephony cost     Yes                None                None
Voice base cost    Per minute         Per minute          N/A (chat is per message)
Latency            ~500ms typical     ~200ms typical      Per message

Browser is faster because there's no telephony hop. Phone is universal. Chat is cheapest. Pick per use case, run them side by side.

What's next

  • Voice — STT/TTS/latency that applies to both phone and browser
  • Agents — agent definitions are channel-agnostic
  • Chats — the text variant

Frequently asked

Q. What's LiveKit and why does Call2Me use it?

LiveKit is the WebRTC infrastructure the platform uses for browser-based voice. It handles the audio session, NAT traversal, and the realtime transport between the browser and the agent. You don't operate it; we host it. You consume tokens.

Q. Do I need a phone number for browser voice?

No. Browser voice is a WebRTC session, not a phone call — no telephony, no SIP, no carrier. You only need an agent and a LiveKit token.

Q. How does this differ from a phone call?

Same agent, same prompt, same knowledge base. Different transport. Browser is faster (no telephony hop), free of telephony charges, and gives you UI control. Phone is universal — anyone with a phone number can reach it.

Q. Can I run both side by side?

Yes. The same agent serves browser sessions, phone calls, and chats simultaneously. Pick the channel per use case.
