OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
+6 stars 24h | 0 7d
0 in 24h | 0 sources
0/5 channels firing
Each channel contributes 0-1. Per-channel tiers: GitHub (breakout 1.0 / hot 0.7 / rising 0.4), HN (front-page 1.0 / ≥3 mentions 0.7 / 1-2 mentions 0.4), Bluesky (≥5 mentions 1.0 / 2-4 0.7 / 1 0.4), dev.to (≥3 articles 1.0 / 2 0.7 / 1 0.4), Reddit (corpus-normalized 48h velocity).
No mentions on this channel in the last 7 days.
// quiet here doesn't mean the repo is dead — check the other tabs
The agent that grows with you
LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar
🪨 why use many token when few token do trick — Claude Code skill that cuts 65% of tokens by talking like caveman
no linked package yet
last commit 3h ago
* Reddit bar shows a per-repo velocity proxy (raw score / 100); the score formula uses the corpus-normalized version so a single repo's bar may not match its contribution to the corpus-wide ranking.
Known repo, package, launch, and site surfaces.