Eyes for the room — under a second from arrival to greeting.
Avyra Vision watches a USB camera, runs YOLO presence detection, and publishes simple entered/left events on Redis. Anything downstream — a greeter, an avatar, the brain — just subscribes. The vision service knows nothing about bookings or conversations.
What it does in production.
Presence FSM
A finite-state machine debounces detections — no spurious 'entered' events from a flicker or a passing shoulder.
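As a rough illustration of that debouncing, here is a minimal two-state sketch. It is not the service's actual implementation: the class name, the `enter_after_frames` threshold, and the default values are assumptions made for the example. Only the idea of a presence-lost timeout comes from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PresenceFSM:
    """Hypothetical two-state debouncer: ABSENT <-> PRESENT."""
    enter_after_frames: int = 3   # assumed: consecutive hits needed to confirm arrival
    lost_after_sec: float = 2.0   # plays the role of PRESENCE_LOST_AFTER_SEC
    present: bool = False
    _hits: int = 0
    _last_seen: float = 0.0

    def update(self, detected: bool, now: float) -> Optional[str]:
        """Feed one frame's detection result; return 'entered', 'left', or None."""
        if detected:
            self._last_seen = now
            self._hits += 1
            if not self.present and self._hits >= self.enter_after_frames:
                self.present = True
                return "entered"
            return None
        # No person detected on this frame.
        if not self.present:
            self._hits = 0        # a single flicker never accumulates
            return None
        if now - self._last_seen >= self.lost_after_sec:
            self.present = False
            self._hits = 0
            return "left"
        return None               # brief drop: stay PRESENT, emit nothing
```

Fed one boolean per frame, the machine stays silent on an isolated detection, emits `entered` once after enough consecutive hits, and emits `left` only after the person has been missing for the full timeout.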
Greeter included
A built-in greeter subscriber plays a warm welcome through the existing TTS service. Toggle off when the brain takes over.
Pluggable Redis contract
Events publish on `avyra:presence:{tenant}`. Any subscriber — included or yours — gets the same JSON envelope.
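To make the contract concrete, the sketch below builds a channel name and an envelope, then parses it the way a subscriber might. The field names come from the envelope described on this page. The field types (a Unix timestamp for `ts`, a four-number list for `bbox`) and the helper names are assumptions for illustration.

```python
import json
import time
from typing import List, Optional


def presence_channel(tenant: str) -> str:
    """Channel name following the avyra:presence:{tenant} pattern."""
    return f"avyra:presence:{tenant}"


def make_envelope(event: str, track_id: int, bbox: List[int],
                  tenant_id: str, ts: Optional[float] = None) -> str:
    """Serialize one presence event (types are assumed, not confirmed)."""
    return json.dumps({
        "event": event,          # "entered" or "left"
        "track_id": track_id,
        "ts": time.time() if ts is None else ts,
        "bbox": bbox,            # assumed [x1, y1, x2, y2] in pixels
        "tenant_id": tenant_id,
    })


def handle_message(raw: str) -> str:
    """What a subscriber callback might do with one message payload."""
    msg = json.loads(raw)
    # Read only the keys we need, so extra keys such as person_id are harmless.
    return f"{msg['tenant_id']}: track {msg['track_id']} {msg['event']}"
```

A real subscriber would pass each payload received from its Redis client to something like `handle_message`. Reading only the keys it needs is what lets optional fields appear later without breaking it.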
Health + state endpoints
`GET /health` for liveness, `GET /state` for active person count, frame count, and current channel.
Personalized greetings
Drop in a face-recognizer that tags events with `person_id`. The channel contract stays identical.
Tunable for any room
Camera index, YOLO confidence, minimum bbox area fraction, presence-lost timeout, greeting cooldown — all from `.env`.
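A sketch of how those settings might be read and applied. `PRESENCE_LOST_AFTER_SEC` and `GREETING_COOLDOWN_SEC` are named on this page. `CAMERA_INDEX`, `YOLO_CONFIDENCE`, `MIN_BBOX_AREA_FRAC`, and every default value below are hypothetical placeholders, so check the project's own `.env` template for the real names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class VisionConfig:
    camera_index: int
    yolo_confidence: float
    min_bbox_area_frac: float
    presence_lost_after_sec: float
    greeting_cooldown_sec: float


def load_config(env: Mapping[str, str] = os.environ) -> VisionConfig:
    """Read tunables from the environment; defaults are illustrative only."""
    return VisionConfig(
        camera_index=int(env.get("CAMERA_INDEX", "0")),                 # hypothetical name
        yolo_confidence=float(env.get("YOLO_CONFIDENCE", "0.5")),       # hypothetical name
        min_bbox_area_frac=float(env.get("MIN_BBOX_AREA_FRAC", "0.05")),  # hypothetical name
        presence_lost_after_sec=float(env.get("PRESENCE_LOST_AFTER_SEC", "2.0")),
        greeting_cooldown_sec=float(env.get("GREETING_COOLDOWN_SEC", "10.0")),
    )


def passes_area_filter(bbox: Sequence[float], frame_w: int, frame_h: int,
                       min_frac: float) -> bool:
    """Keep a detection only if its box covers at least min_frac of the frame."""
    x1, y1, x2, y2 = bbox
    area = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    return area / float(frame_w * frame_h) >= min_frac
```

The area filter is one plausible reading of "minimum bbox area fraction": someone far down a hallway yields a small box and is ignored, while someone at the desk fills enough of the frame to count.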
Standalone or composed
Run it alone with the included greeter, or wire it into avyra-room-bridge for full two-way conversation.
Privacy-first
Frames never leave the box. Only short event envelopes hit Redis. No faces stored unless you opt into recognition.
How it's built.
- YOLOv8n via Ultralytics — runs comfortably on CPU, faster on GPU.
- WebRTC-style debouncing: `PRESENCE_LOST_AFTER_SEC` sets how long a person must go undetected before presence is considered lost, so brief detection drops don't re-fire events.
- Event envelope: `{event, track_id, ts, bbox, tenant_id}` — small, predictable, parseable.
- Greeter throttles repeats by `GREETING_COOLDOWN_SEC` so a group arriving together doesn't trigger overlapping audio.
- Multi-tenant by `tenant_id` — one box, many channels.
- Linux deploy uses `--device /dev/video0 --device /dev/snd` Docker passthrough.
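The greeter cooldown in the list above could look roughly like this. It is a sketch under one assumption: the cooldown is global rather than per `track_id`, which fits the stated goal of keeping a group from overlapping audio. The class and method names are invented for the example.

```python
from typing import Optional


class GreetingThrottle:
    """Allow at most one greeting per cooldown window (hypothetical helper)."""

    def __init__(self, cooldown_sec: float) -> None:
        self.cooldown_sec = cooldown_sec      # plays the role of GREETING_COOLDOWN_SEC
        self._last_greeting: Optional[float] = None

    def should_greet(self, event: str, now: float) -> bool:
        """Return True if this event should trigger a spoken welcome."""
        if event != "entered":
            return False                      # only arrivals are greeted
        if (self._last_greeting is not None
                and now - self._last_greeting < self.cooldown_sec):
            return False                      # still inside the cooldown window
        self._last_greeting = now
        return True
```

With a 10-second cooldown, three people walking in together produce three `entered` events but a single greeting.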
USB camera
    │
    ▼
detector ─► presence FSM ─► Redis pub/sub
                                  │
              ┌───────────────────┴───────────────────┐
              ▼                                       ▼
     greeter (included)                           ai-engine
                                               (booking + Q&A)
Who runs it today.
Reception desks
Greet the patient or guest the moment they walk in. Free the front-desk human to handle the call queue.
Retail entryways
Trigger personalized welcomes for returning customers when paired with face recognition.
Smart conference rooms
Detect occupancy for booking systems, lighting, HVAC — without an enterprise vendor stack.
Questions we get asked.
Do you store video?
No. The service processes frames in memory and only publishes small JSON event envelopes. If you want recognition, embeddings are stored — not pixels.
GPU required?
No. YOLOv8n runs fine on CPU for entryway-scale detection. Set `YOLO_DEVICE=cuda` only if you've got the VRAM to spare.
Can I get personalized greetings?
Yes — add a recognizer module that embeds faces and tags events with `person_id`. The Redis channel and envelope stay identical, so no downstream change is needed.
What if I don't have a camera?
Use `redis-cli PUBLISH` to send synthetic events. The whole downstream chain is camera-agnostic by design.
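For example, a small helper can assemble the `redis-cli PUBLISH` command for a synthetic event. The envelope keys follow the contract described on this page. The placeholder bbox, the timestamp format, and the function name are assumptions.

```python
import json
import shlex
import time
from typing import Optional


def synthetic_publish_command(tenant: str, event: str = "entered",
                              track_id: int = 1,
                              ts: Optional[float] = None) -> str:
    """Build a shell-safe redis-cli command that fakes one presence event."""
    channel = f"avyra:presence:{tenant}"
    payload = json.dumps({
        "event": event,
        "track_id": track_id,
        "ts": time.time() if ts is None else ts,   # assumed Unix seconds
        "bbox": [0, 0, 100, 200],                  # placeholder box, assumed format
        "tenant_id": tenant,
    })
    # shlex.quote keeps the JSON intact when pasted into a POSIX shell.
    return " ".join(["redis-cli", "PUBLISH",
                     shlex.quote(channel), shlex.quote(payload)])


if __name__ == "__main__":
    print(synthetic_publish_command("demo"))
```

Pasting the printed command into a shell on a machine that can reach the Redis instance should make any subscriber on that tenant's channel react as if someone had walked in.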