Audio-reactive presence · Production

The face of the room — wordmark, wave rings, and synchronized voice.

Avyra Avatar is a full-screen browser visualizer designed to be the visible side of the room. The AVYRA wordmark sits inside concentric rings that idle gently, ripple in sync with TTS audio, and pulse outward when someone enters. One tab, one source of audio, zero lipsync drift.


Render: Canvas / HTML

Audio source: Browser tab

Greet latency: <1s

Sync: Lock to TTS

Features

What it does in production.

Wave-ring visualizer

Concentric rings idle-animate when nothing's happening, ripple in time with synthesized speech, and pulse outward on entry events.

Plays its own TTS

Proxies `/api/tts` to the avyra-tts service and plays audio through the browser tab. Single audio source — no overlapping speakers.

Listens to vision events

Subscribes to Redis presence events via the included FastAPI/WebSocket bridge. Walk in front of the camera, rings ripple, greeting plays.

Trigger endpoint for testing

`POST /api/trigger-greet` publishes a synthetic entered event. Isolate the avatar chain without standing up a camera.
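
As a rough sketch of calling the trigger endpoint from a script: the base URL below is borrowed from the `http://room-pc:9000` display example on this page, and the empty JSON body is an assumption, since the request payload is not documented here. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: same host and port as the display URL shown on this page.
BASE_URL = "http://room-pc:9000"


def build_trigger_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/trigger-greet."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/trigger-greet",
        data=json.dumps({}).encode("utf-8"),  # assumed: empty JSON body
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_greet(base_url: str = BASE_URL, timeout: float = 5.0) -> int:
    """Send the synthetic greet request and return the HTTP status code."""
    request = build_trigger_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `trigger_greet()` against a running avatar should make the rings pulse and the greeting play, provided the start overlay has already been clicked in the tab.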

Two-flag handoff contract

Coordinates with the vision service via `ENABLE_GREETER` (vision) and `AUTO_GREET` (avatar). Only one should be on at a time, so a greeting never plays from two speakers at once.
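
One way to guard the contract at deploy time is a small pre-flight check, sketched below. Only the two flag names come from this page; the accepted truthy strings, the default of "off", and the function names are assumptions for illustration.

```python
from typing import Mapping

# Assumption: which strings count as "true" is not specified by the services.
_TRUTHY = {"1", "true", "yes", "on"}


def flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Read a boolean flag from an environment-like mapping."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


def resolve_speaker(vision_env: Mapping[str, str],
                    avatar_env: Mapping[str, str]) -> str:
    """Return which side speaks the greeting, or raise if both would."""
    enable_greeter = flag(vision_env, "ENABLE_GREETER")
    auto_greet = flag(avatar_env, "AUTO_GREET")
    if enable_greeter and auto_greet:
        raise ValueError(
            "ENABLE_GREETER and AUTO_GREET are both on: greetings would overlap"
        )
    if auto_greet:
        return "avatar"
    if enable_greeter:
        return "vision"
    return "none"
```

The two services have separate environments, so the check takes one mapping per service rather than reading a single `os.environ`.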

Cooldown built in

`GREETING_COOLDOWN_SEC` suppresses repeat greetings so a group walking in together gets one welcome, not five.
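
A minimal sketch of how such a cooldown could work, assuming the window is measured from the last greeting that actually played (suppressed events do not extend it). The class name and the injectable clock are hypothetical; only `GREETING_COOLDOWN_SEC` comes from this page.

```python
import time
from typing import Callable, Optional


class GreetingCooldown:
    """Allow one greeting, then suppress repeats for cooldown_sec seconds."""

    def __init__(self, cooldown_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.cooldown_sec = cooldown_sec
        self._clock = clock  # injectable so the logic can be tested
        self._last_greeting: Optional[float] = None

    def should_greet(self) -> bool:
        """Return True and start a new window if the cooldown has elapsed."""
        now = self._clock()
        if (self._last_greeting is not None
                and now - self._last_greeting < self.cooldown_sec):
            return False  # still inside the window: stay quiet
        self._last_greeting = now
        return True
```

With a 10-second cooldown, five people entering within a few seconds produce one `True` followed by four `False` results.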

Runs on a cheap screen

Drop it on any HDMI display with a browser pointed at `http://room-pc:9000`. Click the start overlay once. Done.

No surveillance

The avatar tab never sees a video frame. It only knows that someone entered — and what to say back.

Architecture

How it's built.

FastAPI backend serves the visualizer, proxies TTS, and bridges Redis → WebSocket.
Browser plays audio only after a one-time user gesture (start overlay) — required by every modern browser.
Auto-greet can be disabled so the avatar is silent and the host speaker handles audio.
Configurable Edge TTS voice (default `en-US-AvaMultilingualNeural`) for non-XTTS deployments.
Single port, no exotic deps — `python -m src.main` and you're running.

avyra-vision  ─► Redis  ─►  WebSocket bridge  ─►  browser tab
                                                       │
                                                       ▼
                                          canvas wave rings
                                             + /api/tts  ─►  avyra-tts
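
The fan-out step in the middle of the diagram can be sketched with the standard library alone. This is not the shipped bridge: the Redis subscription and the WebSocket connections are replaced by plain `asyncio` queues, and the `{"type": "entered"}` payload shape is a hypothetical stand-in for the real event format.

```python
import asyncio
import json
from typing import Set


class PresenceHub:
    """Fan out each presence event to every connected browser tab."""

    def __init__(self) -> None:
        self._clients: Set[asyncio.Queue] = set()

    def connect(self) -> asyncio.Queue:
        """Register a tab; a real bridge would do this per WebSocket."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=16)
        self._clients.add(queue)
        return queue

    def disconnect(self, queue: asyncio.Queue) -> None:
        self._clients.discard(queue)

    def publish(self, raw_message: str) -> int:
        """Deliver one message; return how many tabs received it."""
        event = json.loads(raw_message)
        delivered = 0
        for queue in list(self._clients):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                pass  # slow tab: drop the event rather than block others
        return delivered


async def demo() -> list:
    hub = PresenceHub()
    tab_a, tab_b = hub.connect(), hub.connect()
    hub.publish(json.dumps({"type": "entered"}))  # hypothetical payload
    return [await tab_a.get(), await tab_b.get()]
```

In a real deployment, `publish` would be driven by a Redis subscriber loop and each queue would be drained by a task that writes to one WebSocket.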

Use cases

Who runs it today.

Reception screens

The visible 'Avyra' on the wall behind the counter — idles, ripples to TTS, pulses when a guest arrives.

Demo installations

Standalone with `POST /api/trigger-greet` — perfect for trade shows and lobby demos with no camera wired up.

Composed room

Paired with vision + room-bridge + voice for full two-way conversation. Avatar is the lip-synced face of the brain.

Built with: Python 3.10+, FastAPI, WebSockets, Edge TTS / XTTS v2, Redis, HTML Canvas

FAQ

Questions we get asked.

Why a browser tab and not a native app?

A browser is the cheapest, most portable display surface — works on any HDMI screen with a Chromium tab. The visual is built on Canvas; no special runtime required.

What's the start overlay?

Browsers block audio without a user gesture. The first time you open the tab, click anywhere — audio unlocks for the session. It's a one-time tap, not a per-event prompt.

Can two devices speak at once?

They can, but they shouldn't. Don't set vision's `ENABLE_GREETER` and avatar's `AUTO_GREET` to true at the same time, or you'll get overlapping audio. The recommended setup: the avatar speaks and vision stays silent.
