STATEK Metering
STATEK records model usage on each durable job.
Use this data to inspect how much context a job sent, how many provider tokens were reported, and what cost STATEK can estimate from configured pricing. Treat it as operational metering for agent developers, not as invoice reconciliation or a hard quota system.
STATEK metering is only as accurate as the provider usage fields, selected model string, and pricing metadata available to the worker. Verify your provider contract and pricing table before using these numbers for billing decisions.
What STATEK Records
Provider adapters return usage stats with each model response:
LLM_Stats(
    total_bytes_sent,
    total_bytes_received,
    cost,
    input_tokens,
    output_tokens,
    cached_tokens,
)

Worker execution accumulates those stats into job.usage:
- total_bytes_sent
- total_bytes_received
- context_bytes
- total_input_tokens
- total_output_tokens
- total_cached_tokens
- total_reported_cost
- total_cost
total_cached_tokens is included in total_input_tokens. STATEK keeps it separate so calculated cost can apply a cached-input price when one is configured.
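The accumulation described above can be pictured with a minimal sketch. `StepStats` and `UsageTotals` are hypothetical stand-ins written for this illustration, not STATEK's actual classes, and they cover only a subset of the fields (`context_bytes` and `total_cost` are left out). Treating a missing provider cost as `None` is an assumption.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class StepStats:
    """Hypothetical stand-in for the per-response stats an adapter returns."""
    total_bytes_sent: int = 0
    total_bytes_received: int = 0
    cost: Optional[Decimal] = None  # assumption: None when the provider reports no cost
    input_tokens: int = 0
    output_tokens: int = 0
    cached_tokens: int = 0


@dataclass
class UsageTotals:
    """Hypothetical accumulator mirroring a subset of the job.usage fields."""
    total_bytes_sent: int = 0
    total_bytes_received: int = 0
    total_input_tokens: int = 0
    total_output_tokens: int = 0
    total_cached_tokens: int = 0
    total_reported_cost: Optional[Decimal] = None

    def add(self, step: StepStats) -> None:
        # Sum the counters from one model response into the running totals.
        self.total_bytes_sent += step.total_bytes_sent
        self.total_bytes_received += step.total_bytes_received
        self.total_input_tokens += step.input_tokens
        self.total_output_tokens += step.output_tokens
        self.total_cached_tokens += step.cached_tokens
        # Keep reported cost as None until at least one response includes a cost.
        if step.cost is not None:
            self.total_reported_cost = (self.total_reported_cost or Decimal("0")) + step.cost


usage = UsageTotals()
usage.add(StepStats(total_bytes_sent=4000, total_bytes_received=800,
                    input_tokens=1000, output_tokens=200))
usage.add(StepStats(total_bytes_sent=6000, total_bytes_received=1200,
                    cost=Decimal("0.002"), input_tokens=1500,
                    output_tokens=300, cached_tokens=1000))

# Cached tokens are a subset of input tokens, so subtract to get the non-cached share.
non_cached = usage.total_input_tokens - usage.total_cached_tokens
print(usage.total_input_tokens, usage.total_cached_tokens, non_cached)
```

Because cached tokens are counted inside the input total, any per-token pricing has to subtract them before applying the normal input price.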
Cost Resolution
job.usage.total_cost resolves cost in this order:
- If the job has valid model pricing and nonzero input or output tokens, STATEK calculates cost from token counts.
- Otherwise, STATEK falls back to provider-reported cost when the provider response included one.
- If neither source is available, total_cost is None.
Valid pricing requires both an input price and an output price. With valid pricing, STATEK calculates cost as:

non_cached = total_input_tokens - total_cached_tokens
cost = (
    non_cached * input_price_per_M
    + total_cached_tokens * cached_input_price_per_M
    + total_output_tokens * output_price_per_M
) / 1_000_000

If a cached-input price is not configured, STATEK uses the normal input price for cached tokens.
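As a worked illustration of the resolution order and formula in this section, here is a minimal sketch. `resolve_total_cost` and its parameters are hypothetical names chosen for this example. It is not STATEK's implementation, and it treats "valid pricing" simply as both prices being present.

```python
from decimal import Decimal
from typing import Optional

MILLION = Decimal(1_000_000)


def resolve_total_cost(
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int,
    input_price_per_M: Optional[Decimal] = None,
    output_price_per_M: Optional[Decimal] = None,
    cached_input_price_per_M: Optional[Decimal] = None,
    reported_cost: Optional[Decimal] = None,
) -> Optional[Decimal]:
    """Hypothetical helper following the documented order: calculated, reported, None."""
    has_pricing = input_price_per_M is not None and output_price_per_M is not None
    has_tokens = input_tokens > 0 or output_tokens > 0
    if has_pricing and has_tokens:
        # Fall back to the normal input price when no cached-input price is set.
        cached_price = (
            cached_input_price_per_M
            if cached_input_price_per_M is not None
            else input_price_per_M
        )
        non_cached = input_tokens - cached_tokens
        return (
            non_cached * input_price_per_M
            + cached_tokens * cached_price
            + output_tokens * output_price_per_M
        ) / MILLION
    # No usable pricing or no tokens: use the provider-reported cost, which may be None.
    return reported_cost


# 6,000 non-cached + 4,000 cached input tokens and 2,000 output tokens,
# priced with the illustrative values 0.15 / 0.075 / 0.60 per million.
calculated = resolve_total_cost(
    10_000, 2_000, 4_000,
    input_price_per_M=Decimal("0.15"),
    output_price_per_M=Decimal("0.60"),
    cached_input_price_per_M=Decimal("0.075"),
)
print(calculated)
```

With these illustrative prices the three terms are 900, 300, and 1,200 before dividing by one million, so the calculated cost is 0.0024.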
Harness Token Limits
STATEK_MAX_TOKEN_USAGE is a harness guardrail. It is not calculated from provider billing tokens.
The harness uses job.approx_token_usage, which is byte-based:
approx_token_usage = (total_bytes_sent + total_bytes_received) // 4

Use this limit to stop runaway jobs and oversized interactions. For provider token counts and cost estimates, inspect job.usage.total_input_tokens, job.usage.total_output_tokens, job.usage.total_cached_tokens, and job.usage.total_cost.
job.tokens_per_sec() is also approximate. It divides recorded input plus output tokens by recorded response timing across job history, excluding steps with unknown duration.
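The two approximations above can be sketched as plain functions. The names mirror the documented attributes, but the bodies are an illustration under assumptions: steps are modeled as hypothetical `(input_tokens, output_tokens, seconds)` tuples, a step with unknown duration contributes neither tokens nor time, the result is `None` when no timing is recorded, and the limit check uses a strict greater-than.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical step shape: (input_tokens, output_tokens, seconds or None if unknown)
Step = Tuple[int, int, Optional[float]]


def approx_token_usage(total_bytes_sent: int, total_bytes_received: int) -> int:
    # Byte-based estimate: roughly four bytes per token, rounded down.
    return (total_bytes_sent + total_bytes_received) // 4


def exceeds_limit(approx_tokens: int, max_token_usage: int) -> bool:
    # Assumption: the guardrail trips only when usage is strictly above the limit.
    return approx_tokens > max_token_usage


def tokens_per_sec(steps: Iterable[Step]) -> Optional[float]:
    total_tokens = 0
    total_seconds = 0.0
    for input_tokens, output_tokens, seconds in steps:
        if seconds is None:
            continue  # unknown duration: skip the whole step
        total_tokens += input_tokens + output_tokens
        total_seconds += seconds
    if total_seconds <= 0:
        return None  # assumption: no recorded timing means no rate
    return total_tokens / total_seconds


approx = approx_token_usage(4000, 1001)
rate = tokens_per_sec([(100, 50, 1.0), (200, 100, None), (300, 150, 2.0)])
print(approx, exceeds_limit(approx, 1000), rate)
```

The byte-based figure and the provider token counts will usually differ, which is why the guardrail and the cost fields should be read separately.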
Pricing Tables
Configure a model-info directory before statek.init():
STATEK_MODEL_INFO_DIR=./model-info

When initialized, STATEK scans that directory recursively for files ending in .csv or .txt and loads rows with this header:
PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M

INPUT_PRICE_PER_M and OUTPUT_PRICE_PER_M are required for a row to load. INPUT_PRICE_PER_CACHED_M is optional. Malformed rows and rows missing required prices are skipped.
Example file with illustrative prices:
PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M
openai,,gpt-4o-mini,0.15,0.075,0.60
openrouter,openai,gpt-5-mini,0.20,0.10,0.80

For OpenRouter-style routing, MODEL_FAMILY lets pricing match a provider, family, and concrete model. For example, PROVIDER=openrouter, MODEL_FAMILY=openai, and MODEL=gpt-5-mini matches an OpenRouter model string with that concrete family and model.
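To make the row-loading rules concrete, the following sketch parses a table in the documented format with Python's standard `csv` module. `parse_price` and `load_pricing_rows` are hypothetical helpers for illustration. STATEK's own loader may differ, for example in how it normalizes names or handles duplicate rows, and keying the table by (provider, family, model) is an assumption.

```python
import csv
import io
from decimal import Decimal, InvalidOperation
from typing import Dict, Optional, Tuple

PriceKey = Tuple[str, str, str]  # (provider, model_family, model)
Prices = Tuple[Decimal, Optional[Decimal], Decimal]  # (input, cached input, output)


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Return a Decimal price, or None for empty or malformed values."""
    text = (text or "").strip()
    if not text:
        return None
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None


def load_pricing_rows(csv_text: str) -> Dict[PriceKey, Prices]:
    table: Dict[PriceKey, Prices] = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        provider = (row.get("PROVIDER") or "").strip()
        family = (row.get("MODEL_FAMILY") or "").strip()
        model = (row.get("MODEL") or "").strip()
        input_price = parse_price(row.get("INPUT_PRICE_PER_M"))
        output_price = parse_price(row.get("OUTPUT_PRICE_PER_M"))
        # Skip rows that lack a provider, a model, or either required price.
        if not provider or not model or input_price is None or output_price is None:
            continue
        cached_price = parse_price(row.get("INPUT_PRICE_PER_CACHED_M"))  # optional
        table[(provider, family, model)] = (input_price, cached_price, output_price)
    return table


SAMPLE = """PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M
openai,,gpt-4o-mini,0.15,0.075,0.60
openrouter,openai,gpt-5-mini,0.20,0.10,0.80
openai,,missing-input-price,,0.05,0.40
openai,,no-cached-price,0.30,,1.20
"""

table = load_pricing_rows(SAMPLE)
print(sorted(table))
```

In this sample the third data row is skipped because it has no input price, while the fourth loads with its cached-input price left unset.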
Updating Pricing
To update pricing from files, edit the model-info table and restart or reinitialize the worker so statek.init() reloads STATEK_MODEL_INFO_DIR:
PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M
openai,,gpt-4o-mini,0.15,0.075,0.60

You can also set pricing in application code:
from decimal import Decimal

from statek.model_pricing import set_model_pricing

set_model_pricing(
    "openai",
    "gpt-4o-mini",
    Decimal("0.15"),
    Decimal("0.60"),
    input_price_per_cached_M=Decimal("0.075"),
)

For OpenRouter with a model family:
from decimal import Decimal

from statek.model_pricing import set_model_pricing

set_model_pricing(
    "openrouter",
    "gpt-5-mini",
    Decimal("0.20"),
    Decimal("0.80"),
    input_price_per_cached_M=Decimal("0.10"),
    model_family="openai",
)

These examples show the format and workflow only. Keep your real pricing table versioned and reviewed with the same care as provider configuration.
Inspecting Usage
For operational dashboards or job inspection, start with:
usage = job.usage
print(usage.total_input_tokens)
print(usage.total_output_tokens)
print(usage.total_cached_tokens)
print(usage.total_reported_cost)
print(usage.total_cost)
print(job.approx_token_usage)
print(job.tokens_per_sec())

For practical agent design:
- keep prompts, warmup code, examples, documents, tool outputs, and console output compact
- avoid returning raw logs or full API payloads to the model when a summary is enough
- use lower-cost models or difficulty mappings for routine jobs
- remember that pricing follows the concrete selected model, not just the difficulty label
- monitor both byte-based harness usage and provider token/cost fields
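For the last point in the list, a dashboard line can combine the provider token fields with the byte-based figure. The sketch below is one way to do that. `summarize_usage` is a hypothetical helper, and the `SimpleNamespace` object only stands in for job.usage so the example runs without STATEK installed.

```python
from decimal import Decimal
from types import SimpleNamespace


def summarize_usage(usage, approx_tokens: int) -> str:
    """Hypothetical one-line summary that tolerates a missing cost."""
    cost = usage.total_cost
    cost_text = "unknown" if cost is None else f"${cost:.6f}"
    cached = usage.total_cached_tokens
    non_cached = usage.total_input_tokens - cached
    return (
        f"input={usage.total_input_tokens} "
        f"(cached={cached}, non_cached={non_cached}) "
        f"output={usage.total_output_tokens} "
        f"cost={cost_text} approx_tokens={approx_tokens}"
    )


# Stand-in for job.usage, using the attribute names documented on this page.
fake_usage = SimpleNamespace(
    total_input_tokens=10_000,
    total_output_tokens=2_000,
    total_cached_tokens=4_000,
    total_cost=Decimal("0.0024"),
)
summary = summarize_usage(fake_usage, approx_tokens=1250)
print(summary)
```

Handling the `None` case explicitly keeps a job without pricing or provider-reported cost from being displayed as free.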
Do not use STATEK metering as exact billing, provider invoice reconciliation, tenant quota enforcement, or a replacement for provider budget controls.
Related pages: Model Providers, Harness Policies, Configuration, and Operations.