qwen-local: Running an OpenAI-Compatible Model Service on Apple Silicon

qwen-local is a thin FastAPI service that wraps MLX models to serve chat, embeddings, Qwen3 text-to-speech, and Whisper speech-to-text on a 16 GB Mac.

May 24, 2026 · 3 min · 503 words · Jack Yu

Tailgate: A Private AI Gateway for Local and Remote Models

Tailgate is my private OpenAI-compatible gateway that routes requests from AI clients to local and hosted providers.

May 24, 2026 · 3 min · 499 words · Jack Yu