Open-source LLM/VLM load balancer and serving platform for self-hosting LLMs and VLMs at scale. An alternative to projects like llm-d, Docker Model Runner, etc., but with fewer moving parts and simple deployments, built around the ggml ecosystem. Runs on CPU and GPU.