A receptionist that picks up every call, in under 800 milliseconds.
Avyra Voice AI is a self-hosted phone agent built for dental clinics, restaurants, and any business that drops calls after 5pm. A caller gets a warm greeting, books an appointment, and a human picks up only when a human should.
What it does in production.
Inbound phone handling
Asterisk 20 + PJSIP terminate the call. Works with Twilio, Bandwidth, Vonage, or any SIP trunk.
Books on Google Calendar
Reads availability, holds slots, books, reschedules, and cancels. PMS adapter is pluggable for clinic-specific systems.
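To make "reads availability" concrete, here is a minimal sketch of turning a list of busy intervals into bookable slots. The function name `free_slots`, the 30-minute slot length, and the sample day are assumptions for illustration; in a real deployment the busy intervals would come from the calendar (or the PMS adapter), not a hard-coded list.

```python
from datetime import datetime, timedelta

def free_slots(day_start, day_end, busy, slot_minutes=30):
    """Return start times of fixed-length slots that overlap no busy interval.

    `busy` is a list of (start, end) datetime pairs. Hypothetical helper,
    not Avyra's actual scheduling code.
    """
    step = timedelta(minutes=slot_minutes)
    slots = []
    cursor = day_start
    while cursor + step <= day_end:
        slot_end = cursor + step
        # Two intervals overlap when each one starts before the other ends.
        overlaps = any(b_start < slot_end and cursor < b_end for b_start, b_end in busy)
        if not overlaps:
            slots.append(cursor)
        cursor = slot_end
    return slots

# Example: a 09:00-12:00 morning with one existing 10:00-11:00 booking.
day = datetime(2025, 1, 6)
busy = [(day.replace(hour=10), day.replace(hour=11))]
open_slots = free_slots(day.replace(hour=9), day.replace(hour=12), busy)
print([s.strftime("%H:%M") for s in open_slots])
# prints ['09:00', '09:30', '11:00', '11:30']
```

Holding a slot while the caller confirms would need an extra step (for example a short-lived tentative entry), which this sketch leaves out.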
RAG for FAQs
ChromaDB indexes your clinic FAQs. The agent answers from your content, not the model's training set.
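One way to keep answers tied to your content is to put the retrieved FAQ excerpts into the prompt and tell the model to stay inside them. The sketch below shows only that prompt-assembly step. The retriever is passed in as a plain function (in Avyra it would be backed by the ChromaDB index), and the names `build_grounded_prompt` and `fake_retrieve`, as well as the instruction wording, are illustrative assumptions.

```python
from typing import Callable, List

def build_grounded_prompt(
    question: str,
    retrieve: Callable[[str, int], List[str]],
    top_k: int = 3,
) -> str:
    """Assemble a prompt that restricts the model to retrieved FAQ excerpts."""
    passages = retrieve(question, top_k)
    if passages:
        context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    else:
        context = "(no matching FAQ entries)"
    return (
        "Answer the caller using ONLY the FAQ excerpts below. "
        "If the excerpts do not cover the question, say you will check with staff.\n\n"
        f"FAQ excerpts:\n{context}\n\n"
        f"Caller question: {question}"
    )

# Stand-in retriever for illustration; a real one would query the vector index.
def fake_retrieve(query: str, k: int) -> List[str]:
    faqs = [
        "We are open Monday to Friday, 8am to 5pm.",
        "Parking is free behind the building.",
    ]
    return faqs[:k]

prompt = build_grounded_prompt("What are your hours?", fake_retrieve, top_k=1)
print(prompt)
```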
Claude or local Llama
`LLM_PROVIDER=claude` for hosted, or `ollama` for fully self-hosted on your own GPU. Same prompts, same behavior.
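As a rough sketch of how a single environment variable can switch backends, the snippet below reads `LLM_PROVIDER` and rejects anything other than the two values named above. The `LLMConfig` type, the `needs_api_key` field, and the choice to fail when the variable is unset are assumptions for illustration, not Avyra's actual configuration code.

```python
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class LLMConfig:
    provider: str
    needs_api_key: bool  # hosted Claude needs a key; local Ollama does not

# Only the two LLM_PROVIDER values named in the text are accepted.
_PROVIDERS = {
    "claude": LLMConfig(provider="claude", needs_api_key=True),
    "ollama": LLMConfig(provider="ollama", needs_api_key=False),
}

def load_llm_config(env=None) -> LLMConfig:
    """Resolve the provider from LLM_PROVIDER, failing loudly on unknown values."""
    env = os.environ if env is None else env
    name = env.get("LLM_PROVIDER", "").strip().lower()
    if name not in _PROVIDERS:
        raise ValueError(f"LLM_PROVIDER must be one of {sorted(_PROVIDERS)}, got {name!r}")
    return _PROVIDERS[name]

print(load_llm_config({"LLM_PROVIDER": "ollama"}))
```

Failing at startup on a typo is a deliberate choice here: a phone agent that silently falls back to the wrong model is harder to debug than one that refuses to boot.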
Emergency triage
Detects dental and medical emergencies, follows your written triage protocol, and warm-transfers to staff.
Self-hosted, no per-minute fees
Run it on a $380/mo g4dn.xlarge. No Voiceflow, no Vapi, no per-minute AI charges.
Full observability
Prometheus metrics for calls, intents, STT/TTS/LLM latency. Postgres call logs. Redis session state.
Custom voice training
Fine-tune XTTS v2 on 3–4 hours of recorded voice to make Avyra sound like a specific person on your team.
How it's built.
- Asterisk 20 LTS handles SIP / PJSIP termination and DTMF fallback menus.
- ai-engine orchestrates STT → intent classification → slot filling → LLM → TTS.
- Postgres for call logs and audit; Redis for real-time session state.
- Deploys end-to-end on AWS via included Terraform (g4dn.xlarge ≈ $380/mo).
- Prompts live as Markdown files — no redeploy to tweak agent behavior.
- Multi-tenant by `clinic_id` — one box can run many practices.
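The STT → intent → slot filling → LLM → TTS chain can be pictured as a list of stages, each feeding the next, with per-stage timing so latency stays visible. This is a minimal sketch with stub stages, not the ai-engine code. The function `run_turn`, the stage stubs, and the 800 ms default (which simply reuses the headline figure as an illustrative per-turn budget) are all assumptions.

```python
import time
from typing import Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[object], object]]

def run_turn(audio_in: object, stages: List[Stage], budget_ms: float = 800.0):
    """Run stages in order, pass each output to the next, and time each stage."""
    timings: Dict[str, float] = {}
    value = audio_in
    for name, fn in stages:
        started = time.perf_counter()
        value = fn(value)
        timings[name] = (time.perf_counter() - started) * 1000.0
    total_ms = sum(timings.values())
    return value, timings, total_ms <= budget_ms

# Stub stages standing in for the real models (hypothetical outputs).
stages = [
    ("stt", lambda audio: "i want to book a cleaning"),
    ("intent", lambda text: {"text": text, "intent": "book" if "book" in text else "other"}),
    ("slots", lambda turn: {**turn, "slots": {"service": "cleaning"}}),
    ("llm", lambda turn: f"Sure, let's get your {turn['slots']['service']} scheduled."),
    ("tts", lambda reply: reply.encode("utf-8")),
]

audio_out, timings, within_budget = run_turn(b"\x00\x01", stages)
print(audio_out, list(timings), within_budget)
```

The per-stage timings are the kind of numbers that would feed the STT/TTS/LLM latency metrics.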
Inbound Call
│
▼
Asterisk 20 (SIP / PJSIP)
│ ARI WebSocket
▼
ai-engine (FastAPI)
├─ STT : faster-whisper distil-large-v3
├─ LLM : Claude Sonnet / Ollama Llama 3.1
├─ Cal : Google Calendar API
├─ RAG : ChromaDB
└─ TTS : XTTS v2 (custom voice)
│
▼
Asterisk plays audio back to caller
Who runs it today.
Dental clinics
Front-office receptionist that books, reschedules, cancels, and warm-transfers emergencies. Indistinguishable from a human voice.
Restaurants
Phone ordering at peak hours when staff can't grab the line. Quotes wait times, takes pickup orders, and hands them off to your POS.
Service businesses
Anything where missed calls = lost revenue. HVAC, salons, vet clinics, auto shops: the brain stays the same, only the prompts change.
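Because prompts live as Markdown files and tenants are keyed by `clinic_id`, switching verticals can be as small as pointing at a different file. Here is a minimal sketch, assuming a hypothetical layout of `<prompt_dir>/<clinic_id>/system.md` with a `_default` fallback folder. The layout, the `load_prompt` name, and the sample tenants are illustrative, not the shipped structure.

```python
import tempfile
from pathlib import Path

def load_prompt(prompt_dir: Path, clinic_id: str, name: str = "system") -> str:
    """Read the tenant's prompt from disk on each call, so edits need no redeploy."""
    # Reject ids that could escape the prompt directory (e.g. "../secrets").
    if not clinic_id.replace("-", "").replace("_", "").isalnum():
        raise ValueError(f"unexpected clinic_id: {clinic_id!r}")
    tenant_file = prompt_dir / clinic_id / f"{name}.md"
    default_file = prompt_dir / "_default" / f"{name}.md"
    path = tenant_file if tenant_file.is_file() else default_file
    return path.read_text(encoding="utf-8")

# Demo with a throwaway directory: one tenant-specific prompt, one fallback.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "_default").mkdir()
    (root / "_default" / "system.md").write_text("You are a friendly receptionist.", encoding="utf-8")
    (root / "salon-42").mkdir()
    (root / "salon-42" / "system.md").write_text("You book salon appointments.", encoding="utf-8")
    salon_prompt = load_prompt(root, "salon-42")
    fallback_prompt = load_prompt(root, "hvac-7")

print(salon_prompt, "|", fallback_prompt)
# prints: You book salon appointments. | You are a friendly receptionist.
```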
Questions we get asked.
How is this different from Voiceflow or Vapi?
Those are SaaS products with per-minute pricing and little control over latency, voice, or the model. Avyra runs on your hardware (or your cloud account), incurs no per-minute AI fees, and lets you fine-tune both the voice and the LLM.
Can it sound like a specific person?
Yes. Record 3–4 hours of clean voice, run the included XTTS v2 fine-tuning pipeline, drop the speaker file in, restart the TTS service.
What's the cheapest fully-local setup?
One GPU box running Asterisk + ai-engine + Ollama with Llama 3.1 8B. No Anthropic key needed, no cloud calls. Costs scale with hardware, not minutes.
Does it handle warm transfers?
Yes — when the LLM emits a transfer intent, ai-engine bridges the call to your staff extension via Asterisk.
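For a sense of what "bridges the call" can involve, the sketch below turns a transfer intent into a sequence of Asterisk REST Interface (ARI) requests: create a mixing bridge, originate a channel to the staff endpoint, and add the caller to the bridge. It only builds the request list and sends nothing. The shape of `llm_output`, the Stasis app name `avyra`, the bridge id scheme, and the function name are assumptions, and the real ai-engine flow may differ.

```python
from typing import Dict, List, Tuple
from urllib.parse import urlencode

Request = Tuple[str, str]  # (HTTP method, path with query string)

def plan_warm_transfer(
    llm_output: Dict,
    caller_channel_id: str,
    staff_endpoint: str,
    app: str = "avyra",  # hypothetical Stasis application name
) -> List[Request]:
    """Return the ARI calls for a transfer, or an empty list for other intents."""
    if llm_output.get("intent") != "transfer":
        return []
    bridge_id = f"xfer-{caller_channel_id}"
    return [
        # 1. Create a mixing bridge to hold both legs of the call.
        ("POST", "/ari/bridges?" + urlencode({"type": "mixing", "bridgeId": bridge_id})),
        # 2. Ring the staff extension; it would join the bridge once it answers.
        ("POST", "/ari/channels?" + urlencode(
            {"endpoint": staff_endpoint, "app": app, "appArgs": bridge_id})),
        # 3. Move the caller into the bridge.
        ("POST", f"/ari/bridges/{bridge_id}/addChannel?" + urlencode(
            {"channel": caller_channel_id})),
    ]

plan = plan_warm_transfer({"intent": "transfer"}, "1700000000.1", "PJSIP/200")
for method, path in plan:
    print(method, path)
```

Keeping the plan as data makes it easy to log and test before any request reaches Asterisk.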