TODO: Comparing inference routing approaches for production agent systems.

The problem

When you're running agents at scale, you need to route inference requests somewhere. The options:

  1. llm-d — TODO
  2. OpenRouter — TODO
  3. SGLang — TODO
  4. Homebrew — TODO

llm-d

TODO

OpenRouter

TODO

SGLang

TODO

Homebrew solutions

TODO

When to use what

TODO

My take

TODO