qwen-local: Running an OpenAI-Compatible Model Service on Apple Silicon
qwen-local is a thin FastAPI service around MLX models for chat, embeddings, text-to-speech, and speech-to-text on a 16 GB Mac.
qwen-local is a thin FastAPI service around MLX models for chat, embeddings, text-to-speech, and speech-to-text on a 16 GB Mac.