Connect a SIP trunk to an AI voice agent: the practical guide

How to wire a SIP trunk into an AI voice agent — BYOC setup, codec and NAT gotchas, inbound vs outbound routing, and why the trunk-to-agent hop is where most self-hosted setups break.

Call2Me Team
May 24, 2026 · 6 min read

[Figure: Diagram of a SIP trunk routing phone calls into an AI voice agent]

A SIP trunk is just a pipe for phone calls. An AI voice agent is software that talks. The interesting — and fragile — part is the hop between them: getting a real phone call off the carrier network and into a low-latency agent, then back out, without dropping audio or adding a second of delay.

This guide is about that hop. If you've ever had a call connect with dead air on one side, or watched an INVITE retransmit forever, this is for you.

The short version

If you don't already run a carrier relationship, skip the trunk entirely — buy a number inside Call2Me and the telephony leg is managed for you, live in minutes. Bring your own SIP trunk (BYOC) only when you have a reason to own the carrier side.

What a SIP trunk actually does

A SIP trunk connects your phone numbers to the internet using SIP (Session Initiation Protocol) for call setup and RTP (Real-time Transport Protocol) for the audio itself. Two separate things travel on a call:

  1. Signaling (SIP). "Here's an incoming call, here's who's calling, here's where to send the audio." INVITE, 200 OK, ACK, BYE.
  2. Media (RTP). The actual voice packets, typically 50 per second in each direction (one every 20 ms).

Most "my AI agent won't talk" problems are media problems wearing a signaling costume. The call connects (signaling works) but nobody hears anybody (media doesn't). Keep these two layers separate in your head and debugging gets much easier.
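To make the split concrete, here is a minimal sketch that reads an SDP body (the media description carried inside the signaling) and reports where the sender wants audio delivered. The sample SDP and the function name media_target are illustrative assumptions, not output from any particular carrier.

```python
from typing import Optional, Tuple

# Illustrative SDP body as it might appear inside an INVITE or 200 OK (assumed sample).
SAMPLE_SDP = "\r\n".join([
    "v=0",
    "o=- 1234 1 IN IP4 203.0.113.10",
    "s=call",
    "c=IN IP4 203.0.113.10",
    "t=0 0",
    "m=audio 40000 RTP/AVP 0 8 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:101 telephone-event/8000",
])


def media_target(sdp: str) -> Optional[Tuple[str, int]]:
    """Return (address, port) for the first audio stream, or None if absent."""
    session_ip = None
    audio_ip = None
    audio_port = None
    section = "session"  # lines before the first m= belong to the session
    for line in sdp.splitlines():
        if line.startswith("m="):
            fields = line[2:].split()
            section = fields[0]
            if section == "audio" and audio_port is None:
                # The port may be written as "40000/2"; keep the base port.
                audio_port = int(fields[1].split("/")[0])
        elif line.startswith("c="):
            addr = line[2:].split()[2]
            if section == "session":
                session_ip = addr
            elif section == "audio" and audio_ip is None:
                # A media-level c= line overrides the session-level one.
                audio_ip = addr
    ip = audio_ip or session_ip
    if ip is None or audio_port is None:
        return None
    return ip, audio_port
```

Signaling can succeed while this address is wrong, which is exactly the "connected but silent" case: the INVITE and 200 OK were exchanged, but the address they carried for media is not one the other side can reach.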

Two ways to get a number onto an agent

Managed number (the default)

You buy a number inside the platform. The trunk, the carrier relationship, the SIP credentials — all handled. The number is already wired to route inbound calls to your agent, and outbound calls originate through it. This is the right choice for ~90% of teams. Nothing in this post is required.

Bring Your Own Carrier (BYOC)

You already have a SIP trunk from Telnyx, Twilio, a regional carrier, or your own PBX, and you want the AI agent to answer calls on those numbers. You point the trunk at the platform's SIP endpoint and the agent picks up. This is where the gotchas below live.

When BYOC is worth it
  • You hold number ranges (DIDs) that customers already know.
  • You have negotiated per-minute rates you want to keep.
  • Compliance or data-residency rules dictate a specific carrier.
  • You're fronting an existing PBX and only routing some calls to AI.

The trunk-to-agent hop: where it breaks

Here are the failures, in rough order of how often they bite, and the fix for each.

1. One-way or no audio (NAT + RTP)

The classic. Signaling completes, the call "connects," and then silence — or audio in only one direction.

What's happening: the SDP (the part of SIP that says "send my audio here") advertises an address the other side can't actually reach, or your ACK never makes it back to the carrier so no media path is confirmed. RTP packets go into a void.
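A quick way to spot the first cause is to check whether the address in the SDP is publicly routable at all. The sketch below uses Python's standard ipaddress module; the function name classify_sdp_address and its three labels are assumptions made for illustration.

```python
import ipaddress


def classify_sdp_address(addr: str) -> str:
    """Label the connection address taken from an SDP c= line."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not an IP literal, so it is a hostname we cannot judge offline.
        return "hostname"
    # is_global is False for RFC 1918, loopback, link-local, and CGNAT ranges.
    return "public" if ip.is_global else "non-routable"
```

If a carrier or your own media server advertises an address that comes back as "non-routable", RTP sent to it from the public internet has nowhere to go, and the fixes in the list that follows are the ones to reach for.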

The fixes:

  • Symmetric RTP. Send media back to the IP:port the packets actually arrive from, instead of trusting the address in the SDP. This is the single most important setting for trunk interop.
  • Pin your public IP. If your media server sits behind 1:1 NAT, tell it its public IP explicitly for SDP and Contact headers rather than auto-detecting (which often grabs the internal address).
  • Latch onto the first inbound packet. Some carriers advertise an SDP address that's a load balancer, not the real media source. Latching means you ignore the advertised address and lock onto wherever the first real RTP packet came from.
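The latching behavior described above can be sketched in a few lines. This is a simplified model, assuming a hypothetical RtpLatch class that your media loop would call for each received packet; it is not how any specific media server implements it.

```python
from typing import Optional, Tuple

Address = Tuple[str, int]


class RtpLatch:
    """Decides where to send outbound RTP for one call leg."""

    def __init__(self, sdp_address: Address) -> None:
        self.sdp_address = sdp_address  # what the remote SDP advertised
        self.latched: Optional[Address] = None  # where packets really came from

    def on_packet(self, source: Address) -> None:
        """Record the UDP source of an inbound RTP packet."""
        if self.latched is None:
            # Lock onto the first real source and ignore later ones.
            self.latched = source

    def send_target(self) -> Address:
        """Prefer the latched source; fall back to the SDP address."""
        return self.latched or self.sdp_address
```

Until the first packet arrives, media goes to the advertised address; after that it follows the real source. A production implementation would usually also check that the packet looks like RTP for this call (payload type, SSRC) before latching, so a stray packet cannot hijack the stream.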

2. INVITE retransmits / no ACK

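You see the same message retransmitted every few seconds: the carrier resending its INVITE, or resending its 200 OK to an INVITE you sent. A retransmitted INVITE means your responses are not reaching the carrier. A retransmitted 200 OK means the carrier never received your ACK on a path it accepts, usually because the in-dialog ACK is being routed to the carrier's per-call Contact IP instead of the negotiated path.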

The fix: force Record-Route so in-dialog requests (your ACK, your BYE) follow the same path the call was set up on. Carriers with multi-IP, load-balanced front ends (common with Asterisk-based trunks) need this.
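Here is a rough sketch of how a caller turns the Record-Route headers from a 200 OK into the path for its ACK and BYE, following the dialog rules in RFC 3261. The function name plan_in_dialog_request, the returned dictionary shape, and the example hostnames are assumptions for illustration, and only loose routing (the lr parameter) is handled.

```python
from typing import Dict, List, Union

Plan = Dict[str, Union[str, List[str]]]


def plan_in_dialog_request(record_route: List[str], contact: str) -> Plan:
    """Work out addressing for an ACK or BYE sent by the caller.

    record_route: Record-Route URIs in the order they appear in the 200 OK.
    contact: the Contact URI from the 200 OK (the remote target).
    """
    # The caller's route set is the Record-Route list in reverse order.
    route_set = list(reversed(record_route))
    if not route_set:
        # No recorded path: the request goes straight to the Contact.
        return {"request_uri": contact, "route": [], "next_hop": contact}
    first = route_set[0]
    if ";lr" not in first.lower():
        raise NotImplementedError("strict routing is not covered by this sketch")
    # Loose routing: the Contact stays in the Request-URI, but the packet
    # is sent to the first Route entry, not to the Contact address.
    return {"request_uri": contact, "route": route_set, "next_hop": first}
```

The bug pattern from this section corresponds to sending the ACK directly to the Contact address even though a route set exists. In the sketch, next_hop differs from request_uri whenever the carrier recorded a path, and that difference is what keeps the ACK on the setup path.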

3. Codec mismatch

The agent offers Opus, the trunk only speaks G.711, and now there's a transcoder in the path adding latency — or the negotiation fails outright.

The fix: pin to PCMU/PCMA (G.711 µ-law/A-law) plus telephone-event for DTMF. G.711 is the lingua franca of the PSTN. Matching it end-to-end skips transcoding and protects your latency budget.
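As a sketch of what pinning looks like on the wire, the snippet below builds the audio section of an SDP offer restricted to PCMU, PCMA, and telephone-event. Payload types 0 and 8 are the static assignments for G.711; 101 for telephone-event is a common convention rather than a requirement, and the function names are assumptions for illustration.

```python
# (payload type, encoding/clock rate). 0 and 8 are static RTP payload types;
# 101 is a conventional dynamic type for telephone-event (an assumption here).
PINNED_CODECS = [
    (0, "PCMU/8000"),
    (8, "PCMA/8000"),
    (101, "telephone-event/8000"),
]


def audio_section(port: int, ptime_ms: int = 20) -> str:
    """Build the m=audio block of an SDP offer with only the pinned codecs."""
    payload_types = " ".join(str(pt) for pt, _ in PINNED_CODECS)
    lines = [f"m=audio {port} RTP/AVP {payload_types}"]
    lines += [f"a=rtpmap:{pt} {name}" for pt, name in PINNED_CODECS]
    lines.append("a=fmtp:101 0-15")  # DTMF events 0-9, *, #, A-D
    lines.append(f"a=ptime:{ptime_ms}")
    return "\r\n".join(lines) + "\r\n"


def g711_payload_bytes(ptime_ms: int = 20) -> int:
    """G.711 carries 8000 one-byte samples per second."""
    return 8000 * ptime_ms // 1000
```

With the usual 20 ms packetization, g711_payload_bytes() works out to 160 bytes of audio per packet at 50 packets per second. Because Opus never appears in the offer, the trunk cannot select it and no transcoder is needed.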

4. DTMF that doesn't register

The caller presses 1, the agent hears nothing. DTMF over SIP has three encodings (in-band audio, RFC 2833 / telephone-event, and SIP INFO) and they don't all interoperate. Standardize on RFC 2833 telephone-event (since superseded by RFC 4733, though most trunk settings still use the older name) and make sure it's listed in your SDP alongside the audio codecs. If your agent does menu navigation or PIN entry, test this explicitly, because it's silently broken more often than you'd think.
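For testing, it helps to know what a telephone-event packet actually contains. The sketch below decodes the 4-byte payload defined by RFC 2833 / RFC 4733: an event code, an end flag, a volume, and a duration. The function name and the returned dictionary are illustrative assumptions.

```python
import struct
from typing import Dict, Optional, Union

# Event codes 0-15 map to the sixteen DTMF keys.
EVENT_TO_KEY = "0123456789*#ABCD"

Event = Dict[str, Union[Optional[str], bool, int]]


def parse_telephone_event(payload: bytes) -> Event:
    """Decode the RTP payload of a telephone-event packet."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    # Layout: event (8 bits), E flag + reserved bit + volume (8 bits),
    # duration (16 bits, network byte order).
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    key = EVENT_TO_KEY[event] if event < len(EVENT_TO_KEY) else None
    return {
        "key": key,                 # None for non-DTMF events
        "end": bool(flags & 0x80),  # set on the final packets of a key press
        "volume": flags & 0x3F,     # tone level, expressed as -dBm0
        "duration": duration,       # in RTP timestamp units
    }
```

At an 8000 Hz clock, a duration of 800 is a 100 ms key press. Senders normally repeat the final packet of each event, so an agent that counts key presses should deduplicate on the RTP timestamp rather than treat every packet with the end flag as a new digit.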

Inbound vs outbound routing

A trunk can carry both, but they're different flows:

                     Inbound                                     Outbound
Who starts the call  Caller                                      Agent / campaign
Trunk's job          Deliver call to the agent by dialed number  Originate with a caller ID you own
Main risk            Audio path (the NAT issues above)           Spam-flagging, caller-ID reputation

Don't share numbers between inbound and outbound

Use a dedicated number for outbound campaigns. If your outbound dials get spam-flagged, you don't want that reputation bleeding into the inbound business line customers call. Separate numbers, separate reputations.
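One way to enforce that separation is a check at configuration time. The sketch below is hypothetical throughout: the route table, the agent names, the placeholder phone numbers, and the function names are all assumptions, not a Call2Me API.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical configuration: dialed number -> agent that answers it.
INBOUND_ROUTES: Dict[str, str] = {
    "+15550100": "support-agent",
    "+15550101": "booking-agent",
}

# Hypothetical pool of caller IDs reserved for outbound campaigns.
OUTBOUND_CALLER_IDS: List[str] = ["+15550150"]


def route_inbound(dialed_number: str) -> Optional[str]:
    """Inbound: pick the agent by the number the caller dialed."""
    return INBOUND_ROUTES.get(dialed_number)


def shared_numbers(inbound: Iterable[str], outbound: Iterable[str]) -> List[str]:
    """Return numbers that appear on both sides; the list should be empty."""
    return sorted(set(inbound) & set(outbound))
```

Running shared_numbers(INBOUND_ROUTES, OUTBOUND_CALLER_IDS) at deploy time and failing on a non-empty result keeps a campaign number from quietly doubling as the business line.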

For outbound at volume, the trunk is only half the story — concurrency, retry logic, and per-call data extraction matter just as much. We cover that side in bulk outbound voice campaigns.

The latency angle

Every hop you add between the trunk and the agent costs milliseconds, and voice AI lives or dies on the sub-500ms budget. Two trunk-side decisions that protect it:

  • No transcoding. Match codecs (G.711 end-to-end) so there's no transcode step in the media path.
  • Regional proximity. A trunk terminating in a region far from your media server adds round-trip time to every packet. Keep them close.
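To put a number on regional proximity, this sketch estimates the best-case round trip from distance alone, using the rule of thumb that light in fiber covers roughly 200 km per millisecond. The function names are assumptions, and real paths are longer than the straight-line distance, so treat the result as a floor.

```python
FIBER_KM_PER_MS = 200.0  # approximate speed of light in optical fiber
BUDGET_MS = 500.0        # the response budget discussed in this section


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time between trunk and media server."""
    return 2 * distance_km / FIBER_KM_PER_MS


def budget_share(distance_km: float, budget_ms: float = BUDGET_MS) -> float:
    """Fraction of the budget spent on propagation alone."""
    return min_round_trip_ms(distance_km) / budget_ms
```

For a trunk terminating 6,000 km from the media server, min_round_trip_ms(6000) is 60 ms, about 12% of the budget, before any routing detours, jitter buffering, or model time. Each conversational turn pays roughly one such round trip: caller audio in, agent audio out.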

The full breakdown of where every millisecond goes is in our sub-500ms latency deep dive.

How this looks on Call2Me

  • Managed: buy a number, it's wired to your agent. Done.
  • BYOC: point your SIP trunk at the platform endpoint. Symmetric RTP, public-IP pinning, Record-Route, and G.711 codec pinning are handled on the platform side — the interop settings in this post are the defaults, not a configuration project you have to get right yourself.

Either way, the agent on the other end is the same: sub-500ms response, 18 LLMs, knowledge base, transcripts, and post-call data extraction. The trunk is just how the call arrives.

Connect your trunk or grab a number

Whether you're bringing a carrier or starting fresh, the agent is live in minutes. $5 in free credits, no card required.


Frequently asked

Q. Do I need my own SIP trunk to run an AI voice agent?

No. Most teams start with a number bought inside the platform — the trunk is managed for you. You bring your own SIP trunk (BYOC) when you already have a carrier relationship, need specific number ranges, or want to keep an existing business line. Both paths reach the same agent; BYOC just changes who owns the carrier leg.

Q. Why does my SIP call connect but the audio is one-way or silent?

Almost always NAT and RTP. The SIP signaling negotiates fine, but the media (RTP) packets get sent to an unreachable address from the SDP, or the ACK never reaches the carrier so no media path is established. The fixes are symmetric RTP (send media back to the address packets actually arrive from, rather than trusting the advertised one) and pinning your public IP for SDP/Contact instead of relying on auto-detection.

Q. Which codecs should the trunk use for an AI agent?

Pin to PCMU/PCMA (G.711) plus telephone-event for DTMF. G.711 is universally supported, low-latency, and avoids the transcoding penalty that hurts the sub-second response budget. Opus is great for WebRTC legs but most PSTN trunks hand you G.711 anyway, so matching it end-to-end keeps latency down.

Q. Can one trunk handle both inbound and outbound AI calls?

Yes, but route them deliberately. Inbound: the trunk delivers the call to the agent based on the dialed number. Outbound: the agent originates through the trunk using a caller ID you control. Use a dedicated number for outbound campaigns so spam-flagging on outbound dials doesn't hurt the deliverability of your inbound business line.
