Anyone operating an LLM inference API with API key authentication -- whether a direct provider or an aggregator -- should consider supporting scoped JWT tokens. DeepInfra already does this well. The pattern is general and solves real problems that the rest of the industry is working around with proxies and key management sprawl.
Organizations that distribute LLM API access to their users (universities, SaaS platforms, dev teams) currently have two options:
- Give each user a real API key via a management API. This works, but the organization loses control the moment the key leaves its hands. Keys can be shared, leaked, or used in ways the organization didn't intend. Revoking a key often destroys its analytics history. And provisioning keys is a heavyweight operation -- there's no cheap way to issue thousands of ephemeral credentials.