# Billing through LLM Proxy

Learn how to bill for LLM tokens.

## Overview

Use our proxy API to call LLM models and automatically record usage per customer. We route requests to the provider, return the response, and attribute tokens by model and type for automatic customer billing.

Our proxy pulls the latest token costs for all supported AI models. Specific pricing line items are listed in the pricing tables. Costs reflect current market rates as published by each model’s provider.

**Publisher** identifies the organization that created and owns the model. **Provider** identifies the service that serves the model request and sets the rate used for billing.

The publisher and provider aren’t always the same. A model might be published by one organization and served by a different provider. Publishers include OpenAI, Anthropic, and Google. Providers include Bedrock, Azure, and Deepinfra.

## Tokens and pricing

For each inference call, providers return the number of tokens consumed. Token types exist because providers meter different parts of a request separately, such as prompt processing, generated output, or cache usage. Different model families (text, image, and multimodal) might report token usage differently, and the provider determines the token counts and corresponding rates for each type.

- **Cached input:** Prompt tokens read from a cache instead of being recomputed by the model.
- **Cached output:** Response tokens served from a cache instead of being newly generated. Some providers report this as a separate token type. Others include it within the cached input count.
- **Input:** Tokens sent to the model in the prompt.
- **Output:** Tokens generated by the model in its response.

> #### Pricing update SLA
>
> Token prices are updated automatically to reflect the provider’s current pricing. Changes might appear with a slight delay.

## Pricing table (prices per 1M tokens)

### AI model pricing table

The interactive pricing table is available in the web view.
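Because every token type is priced per 1M tokens, the cost of a single inference call is the sum of each type's count times its rate. Here is a minimal sketch of that arithmetic; the rates and the `cost_usd` helper below are illustrative placeholders, not real provider prices or part of the proxy's API:

```python
# Hedged sketch: turning per-type token counts into a billing amount.
# The rates below are hypothetical, not actual provider pricing.

# Price in USD per 1M tokens for each token type (placeholder values).
RATES_PER_MILLION = {
    "input": 3.00,
    "cached_input": 1.50,
    "output": 15.00,
    "cached_output": 7.50,
}

def cost_usd(usage: dict) -> float:
    """Sum the cost of each token type reported for one inference call."""
    total = 0.0
    for token_type, count in usage.items():
        rate = RATES_PER_MILLION.get(token_type, 0.0)
        total += count / 1_000_000 * rate
    return total

# Example: a call that used 10k input tokens and 2k output tokens.
usage = {"input": 10_000, "cached_input": 0, "output": 2_000, "cached_output": 0}
print(round(cost_usd(usage), 4))  # 10k * $3/M + 2k * $15/M = 0.06
```

Whether cached output appears as its own line item or is folded into cached input depends on how the provider reports it, as noted in the token type list.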
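The token types described under "Tokens and pricing" determine how each call's usage is attributed for billing. A minimal sketch of that attribution step, assuming a hypothetical response shape (the field names `usage`, `input_tokens`, `cached_input_tokens`, and `output_tokens` are illustrative, not the proxy's documented schema):

```python
import json

# Hedged sketch: mapping a response's usage fields onto billing token types.
# The JSON shape and field names are assumptions for illustration only.

sample_response = json.loads("""
{
  "model": "example-model",
  "usage": {
    "input_tokens": 1200,
    "cached_input_tokens": 800,
    "output_tokens": 350
  }
}
""")

def usage_by_type(response: dict) -> dict:
    """Collect per-type token counts from one inference response."""
    u = response.get("usage", {})
    return {
        "input": u.get("input_tokens", 0),
        "cached_input": u.get("cached_input_tokens", 0),
        "output": u.get("output_tokens", 0),
        # Some providers fold cached output into the cached input count,
        # in which case this field is absent and defaults to zero.
        "cached_output": u.get("cached_output_tokens", 0),
    }

print(usage_by_type(sample_response))
```

Each per-type count can then be multiplied by that model's per-1M-token rate from the pricing table to produce the customer's billing line items.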