TODO: Comparing inference routing approaches for production agent systems.

The problem

When you're running agents at scale, you need to route inference requests somewhere. The options:

llm-d — TODO
OpenRouter — TODO
SGLang — TODO
Homebrew — TODO

llm-d

TODO

OpenRouter

TODO

SGLang

TODO

Homebrew solutions

TODO

When to use what

TODO

My take

TODO