Tenant-Based SIP Routing: Serving Multiple Customers from a Single Deployment
You have one server. You have fifty customers. Every customer wants their own phone numbers, their own webhooks, their own isolated traffic. Here is how Sayna routes SIP calls to the right tenant without duplicating infrastructure.
I still remember the exact moment when I realized we had a scaling problem that had nothing to do with servers or CPU or memory.
We were running voice AI for three customers. Three. Each one had their own deployment—their own Kubernetes cluster, their own database, their own monitoring stack. It felt manageable at first, even sensible. Isolation is good, right?
Then customer number four signed up. I found myself copying Terraform files again, renaming variables, spinning up yet another isolated environment. Somewhere around the third copy-paste, it hit me: we were solving the wrong problem entirely.
The infrastructure was scaling linearly with customers. That is not a business. That is a recipe for burnout.
The Problem Nobody Warns You About
Here is something the telephony world does not tell newcomers, something I wish someone had told me earlier: SIP was designed for telephone companies, not for SaaS platforms.
Think about it. When AT&T routes a call, they own the entire path from start to finish. One company, one infrastructure, millions of subscribers feeding into it. The protocol was built with this assumption baked in—you are the telephone company.
But we are not telephone companies. We are platform builders. We have dozens of customers, sometimes hundreds, all sharing the same infrastructure. Each customer expects their calls to stay isolated, their data to remain private, their webhooks to receive only their events. That is the promise of multi-tenancy.
Here is the problem: when a SIP call arrives at your server, it carries no tenant identifier. No customer ID field. No routing hint that says "this call belongs to Company X." You get a phone number and some headers. That is it.
So how do you figure out which customer should receive this call? How do you route it to the right webhook, the right AI assistant, the right business logic?
This question kept me up more nights than I want to admit.
The Naive Solution (And Why It Breaks)
Like most teams, we started with phone number lookups. It seemed obvious. Every call comes in, you check the destination number against a database, find the matching customer, route accordingly. Simple.
This works beautifully—until it does not.
The first crack appeared when a customer wanted to bring their own DIDs (direct inward dialing numbers). Suddenly our database needed to stay perfectly synchronized with their carrier configuration. Miss an update, even by a few minutes, and calls route to the wrong place. Or worse, they fail silently.
The second crack appeared at scale. Every single call now required a database query before it could proceed. The database became the bottleneck for the entire telephony stack. We were adding caching layers, optimizing queries, all to support something that should have been simpler.
The third crack was the fatal one: customers sharing number pools. Call centers do this constantly. Ten customers might share the same set of DIDs with complex routing rules. Time-based routing. Load balancing. Failover chains. Our simple lookup table could not express any of this.
We needed a different approach. Something fundamental had to change.
The Insight That Changed Everything
After weeks of debugging call routing failures—weeks of staring at SIP traces and wondering why calls ended up in the wrong place—we discovered something surprising.
The tenant information was already in the call. We just kept throwing it away.
When a carrier sends a SIP INVITE, that INVITE includes the original destination domain. If the carrier was configured to route calls for customer-a.com to your server, the INVITE arrives with customer-a.com right there in the headers. The answer was staring at us the whole time.
The problem? SIP proxies love to rewrite headers. They cannot help themselves. By the time the call reaches your application layer, the original domain has been normalized, parameterized, or replaced entirely. Every proxy in the chain thinks it is being helpful by cleaning things up.
Our solution turned out to be almost embarrassingly simple: capture the destination domain at the very edge of our infrastructure, before any proxy can touch it, and carry it through the entire call path as a custom header that nothing will modify.
Think of it like writing the apartment number on the inside of a package. The shipping label might get torn off during transit, but the number written inside survives the journey.
How the Routing Actually Works
The architecture has three layers, and each one has a specific job. Let me walk you through how a call flows from carrier to customer webhook.
Layer 1: Kamailio at the Edge
At the edge sits Kamailio, an open source SIP proxy that has been handling production telephony traffic since 2005. Battle-tested does not begin to describe it. When an INVITE arrives, Kamailio extracts the host from the original destination URI and writes it into a custom header we call X-To-IP. Then it forwards the call to LiveKit for media handling.
Why a custom header instead of using the standard ones? Because SIP proxies are notorious for modifying standard headers. They add parameters, strip display names, normalize formats. They have opinions about what headers should look like. But custom headers? Those pass through untouched. Every proxy in the chain ignores what it does not recognize.
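The header capture can be sketched as a Kamailio routing fragment. This is a minimal illustration, not our production config: it assumes the textops module provides append_hf (it does in stock Kamailio), and $rd is Kamailio's pseudo-variable for the Request-URI domain.

```
loadmodule "textops.so"

request_route {
    if (is_method("INVITE")) {
        # Capture the original destination host before any rewriting
        # and carry it as a custom header that downstream proxies ignore.
        append_hf("X-To-IP: $rd\r\n");
    }
    # ... relay the request toward LiveKit SIP from here ...
}
```

The important property is where this runs: at the very first hop, before anything in the chain has had a chance to normalize the URI.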
Layer 2: LiveKit for Media and Metadata
LiveKit receives the call and creates a room for the conversation. Here is where it gets interesting: LiveKit exposes every SIP header as a participant attribute. When the SIP participant joins the room, our X-To-IP header becomes accessible as sip.h.x-to-ip. Phone numbers, call IDs, custom headers—everything travels along as metadata attached to the participant.
This is powerful because it means the routing information survives the transition from SIP world to WebRTC world. The call can be handled by modern real-time infrastructure while preserving the context we need for multi-tenancy.
Layer 3: Sayna for Tenant Routing
When something happens in the call—participant joined, participant left, room created—LiveKit emits a webhook. These webhooks contain all the participant attributes, including our carefully preserved routing domain.
Sayna receives these webhooks and performs the actual tenant routing. We extract the domain from sip.h.x-to-ip, look it up in our hook table, and forward the event to the correct customer webhook. If the custom header is missing for some reason (it happens), we fall back to sip.h.to as a backup.
The entire forwarding happens asynchronously. We acknowledge the LiveKit webhook immediately, then route in the background. This keeps LiveKit healthy and prevents any tenant routing issues from affecting call quality. The call keeps working even if the routing layer hiccups.
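The acknowledge-first pattern can be sketched with asyncio. The function names, the logging list, and the URL are all illustrative, not Sayna's actual API; the point is only that the handler returns before forwarding completes.

```python
import asyncio

async def forward_event(url: str, event: dict, log: list) -> None:
    # Stand-in for the real HTTP POST to the tenant webhook.
    await asyncio.sleep(0)
    log.append((url, event["type"]))

async def handle_livekit_webhook(event: dict, hook_url: str, log: list) -> int:
    # Schedule forwarding in the background and acknowledge immediately,
    # so LiveKit never waits on tenant routing.
    asyncio.create_task(forward_event(hook_url, event, log))
    return 200

async def demo() -> tuple[int, list]:
    log: list = []
    status = await handle_livekit_webhook(
        {"type": "participant_joined"},
        "https://api.customer-a.com/webhooks/voice",  # illustrative URL
        log,
    )
    await asyncio.sleep(0.01)  # give the background task time to run
    return status, log

status, log = asyncio.run(demo())
print(status, log)
```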
```mermaid
flowchart LR
    A["Carrier INVITE<br/>to customer-a.com"] --> B["Kamailio<br/>Extracts domain<br/>Adds X-To-IP"]
    B --> C["LiveKit<br/>Exposes as<br/>sip.h.x-to-ip"]
    C --> D["Sayna<br/>Routes to<br/>tenant webhook"]
    D --> E["Customer A<br/>receives event"]
```
The Hook Table: Keeping It Simple
The mapping from domains to webhooks is deliberately, almost stubbornly, simple. Each entry has two fields: a host pattern and a webhook URL.
When a domain like customer-a.com arrives, we look it up in the table and find https://api.customer-a.com/webhooks/voice. The event gets forwarded there, signed with HMAC-SHA256 so the customer can verify it actually came from us.
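Signing and verification follow the standard HMAC-SHA256 pattern. This sketch shows the general mechanism; the exact payload encoding and signature header Sayna uses are not specified here, so treat the details as illustrative.

```python
import hashlib
import hmac

def sign_event(payload: bytes, secret: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature of a webhook payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def verify_event(payload: bytes, secret: bytes, signature: str) -> bool:
    # compare_digest is constant-time, which guards against timing attacks.
    return hmac.compare_digest(sign_event(payload, secret), signature)

sig = sign_event(b'{"event":"participant_joined"}', b"shared-secret")
print(verify_event(b'{"event":"participant_joined"}', b"shared-secret", sig))  # True
```

On the customer side, recomputing the signature over the received body and comparing is all the verification needed.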
Hosts must be unique. Each domain routes to exactly one webhook. If you need multiple webhooks for the same domain—maybe you want events going to both your main system and an analytics pipeline—that is your job to fan out on your end. We send it once, you distribute it however you need.
Hooks can be configured at startup from a config file, or modified at runtime through the API. Runtime changes take effect immediately for new events. Calls already in progress keep their resolved destination—we do not change routing mid-call.
Why keep it so simple? Because simple is debuggable. When something goes wrong at 3 AM (and in telephony, something always goes wrong at 3 AM), you want to be able to trace the problem in minutes, not hours.
Parsing the SIP Format Zoo
If you have worked with SIP, you know that headers come in more formats than anyone wants to deal with. Our domain extraction handles all of them because it has to.
A plain URI like sip:user@example.com yields example.com. A display name format like "Company Name" <sip:user@example.com> also yields example.com—we strip the display name and angle brackets. URIs with parameters like sip:user@example.com;user=phone;tag=xyz still yield example.com after we strip everything after the semicolon. Hosts with ports like example.com:5060 yield example.com. Subdomains like sip-1.example.com stay exactly as received.
All matching is case insensitive. EXAMPLE.COM and example.com route to the same place, because nothing good comes from case-sensitive hostname matching.
If we cannot parse a valid domain from either header, we log a warning but do not fail the webhook. The call continues normally—we just cannot forward events to a tenant webhook. The failure is silent to the user but visible in our logs.
What This Looks Like in Practice
Let me paint a concrete scenario, because abstract architecture only makes sense when you can see it working.
You are building a voice AI platform for medical practices. Dr. Smith's office uses calls.drsmith.com as their SIP domain. City Dental uses calls.citydental.com. Family Medicine uses calls.familymed.io. Three customers, three domains.
All three domains point to your Sayna deployment via DNS. Same servers, same infrastructure. When a patient calls Dr. Smith's office, the carrier sends an INVITE to calls.drsmith.com. Kamailio captures this domain before anything can modify it and passes it through as a custom header. LiveKit preserves it as participant metadata. Sayna sees calls.drsmith.com in the webhook and forwards the event to Dr. Smith's configured endpoint.
Each practice has their own AI assistant, their own appointment system, their own data isolation. From their perspective, they have a dedicated voice AI platform. From your perspective, you run one deployment that serves all of them.
Now imagine scaling to fifty practices. Or five hundred. The operational overhead stays constant. One deployment, one monitoring stack, one set of infrastructure to maintain. Your Terraform files stop multiplying.
That is the promise of multi-tenancy done right. The complexity stays on your side, invisible to customers.
The Failure Philosophy
Telephony has taught us something important, something that took a few production incidents to really internalize: calls matter more than features.
If our routing breaks, calls should still work. The business logic might not get triggered—maybe the appointment booking does not happen, maybe the analytics miss an event—but the basic communication still functions. A patient can still talk to a receptionist. That is what matters.
This is why we fail open, not closed.
If a domain has no matching hook, we log a warning but acknowledge the webhook to LiveKit. The call continues. If a tenant webhook times out, we retry with exponential backoff, then log the failure and move on. We do not block. We do not hang. If domain extraction fails completely, same pattern.
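The retry-then-move-on behavior is a standard exponential backoff loop. A minimal sketch, with illustrative function names and a print standing in for real logging:

```python
import time

def forward_with_backoff(send, max_attempts: int = 4, base_delay: float = 0.5):
    """Try a webhook delivery a few times with exponential backoff.
    On final failure, log and return None rather than blocking: fail open."""
    for attempt in range(max_attempts):
        try:
            return send()
        except Exception as exc:
            if attempt == max_attempts - 1:
                print(f"giving up after {max_attempts} attempts: {exc}")
                return None  # the call keeps working regardless
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```

The key property is the return on the last failure: routing gives up quietly instead of propagating an error into the webhook handler.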
Tenant routing is an enhancement, not a critical path. We never let routing failures block the main webhook handler or degrade call quality. The call is sacred.
Setting Up Your Carriers
For this system to work, your carriers need to route calls to tenant-specific domains. This requires some coordination, but it is not complicated once you understand the model.
The Clean Approach: Per-Tenant Domains
Each customer gets their own domain like calls.customer-a.com. DNS points all of them to your SIP edge—same IP address, different domain names. The carrier routes DIDs to the appropriate domain based on ownership. When the carrier provisions a phone number for Customer A, they configure it to route to calls.customer-a.com.
Tenant identification happens at the DNS level with no database lookups required. The domain is the tenant identifier. Simple, elegant, scalable.
The Fallback: Shared Domain with DID Mapping
Some carriers cannot route to arbitrary domains. They have one destination configured and everything goes there. In those cases, you fall back to a shared domain with DID mapping. All calls come to sip.yourplatform.io, and you maintain a separate lookup from phone numbers to tenants.
This works, but it requires keeping that mapping synchronized—which is exactly the operational headache we designed the system to avoid. Use it when you have to, not because it is easier to set up.
Most enterprise deployments use the per-tenant domain model. It scales better and puts tenant identification at the edge where it belongs.
Scaling Properties
The tenant router is stateless by design. It maintains no call state, no session information, nothing that needs to persist between requests. Every webhook is processed independently based on the data it contains.
This means horizontal scaling is trivial. Run multiple Sayna instances behind a load balancer. Webhooks land on any instance and route correctly. There is no sticky session requirement, no shared state to synchronize.
The hook table needs to be consistent across instances. We handle this through either file-based configuration with periodic polling, or API-based updates that propagate immediately to all instances. For most deployments, the file-based approach is simpler. For deployments where tenants self-provision, the API approach makes more sense.
For very large deployments with thousands of tenants, you might want to cache the hook table in memory with a backing store like Redis. But honestly, for most use cases, a simple in-memory map loaded from a config file works fine. Do not over-engineer it until you have to.
What We Log
Operating this at scale requires visibility. You cannot debug what you cannot see.
Every forwarded webhook is logged with the source domain, destination URL, HTTP response code, and latency. You can trace any event from LiveKit through to the tenant webhook. When a customer reports missing events, you can grep the logs and tell them exactly what happened.
We emit metrics for forwarding success rate, latency percentiles, retry counts, and domain resolution failures. These feed into whatever monitoring stack you prefer—Prometheus, Datadog, whatever you already use. We do not prescribe the tooling.
The most common issue we see in production is carriers sending calls to the wrong domain because of DNS misconfiguration. Someone updates their DNS records, forgets to update the carrier configuration, and suddenly calls route to the wrong tenant. The logs make this obvious immediately: you see the domain that arrived versus the domain you expected. It usually takes five minutes to diagnose and fix.
Getting Started
If you want to implement this pattern yourself, the order matters. Do not skip steps or you will spend hours debugging something that should have worked.
Step 1: Configure Kamailio to extract the destination host and write it into X-To-IP. This is the foundation that everything else depends on. If you get this wrong, nothing downstream will work.
Step 2: Connect Kamailio to LiveKit SIP. Standard configuration, nothing special needed for multi-tenancy. LiveKit does not care about tenants—it just passes the headers through.
Step 3: Point LiveKit webhooks at Sayna and configure the shared secret for signature verification. This lets Sayna trust that webhooks actually came from LiveKit.
Step 4: Add hook entries for your tenants. Map each domain to its webhook URL. Start with one tenant, verify it works, then add more.
Step 5: Test with real calls. Verify the domain flows through correctly and events arrive at the right webhooks. Do not skip testing—make actual phone calls.
The whole setup can be done in a few hours if you know the components. Ongoing maintenance is minimal because the design is so simple. Most weeks, you will not think about routing at all. That is how it should be.
The Tradeoffs
This approach is not perfect for every situation. Nothing is. Here are the limitations we live with.
If your tenants cannot control their carrier routing—maybe they are using consumer VoIP providers that do not offer domain-based routing—the per-tenant domain model does not work. You fall back to DID mapping with all its synchronization headaches.
If you need to route based on complex rules beyond simple domain matching—time-of-day routing, percentage splits, conditional logic—you will need to extend the hook table or add a rules engine. Our current model is deliberately simple. We chose not to build a complex routing engine because most customers do not need one.
If you have tenants with extremely high webhook volumes, the asynchronous forwarding can create backpressure. Your webhook handler might fall behind. You might need dedicated forwarding infrastructure for your largest customers, or implement per-tenant rate limiting.
We made these tradeoffs consciously. The simple model covers the vast majority of use cases. Complexity can be added where genuinely needed. But we refuse to make the common case complicated to support edge cases that might never happen.
The Bottom Line
Multi-tenant SIP routing is a solved problem if you preserve the right information at the right time.
Capture the tenant domain at the edge, before any proxy can modify it. Carry it through the entire call path as a custom header. Route events based on domain lookup at the webhook layer.
One deployment serves all your customers. Each customer sees only their own traffic. The operational overhead stays constant as you scale from three customers to three hundred.
Our open source project implements this pattern. You can deploy it today and stop copying Terraform files for every new customer. Stop spinning up isolated clusters. Stop multiplying your operational burden.
The infrastructure should be boring. It should fade into the background. That is when you know it is working correctly.
Build your voice AI features. Let the routing handle itself.