STATEK Metering

STATEK records model usage on each durable job.

Use this data to inspect how much context a job sent, how many provider tokens were reported, and what cost STATEK can estimate from configured pricing. Treat it as operational metering for agent developers, not as invoice reconciliation or a hard quota system.

⚠️ Warning: STATEK metering is only as accurate as the provider usage fields, selected model string, and pricing metadata available to the worker. Verify your provider contract and pricing table before using these numbers for billing decisions.

What STATEK Records

Provider adapters return usage stats with each model response:

LLM_Stats(
    total_bytes_sent,
    total_bytes_received,
    cost,
    input_tokens,
    output_tokens,
    cached_tokens,
)

Worker execution accumulates those stats into job.usage:

  • total_bytes_sent
  • total_bytes_received
  • context_bytes
  • total_input_tokens
  • total_output_tokens
  • total_cached_tokens
  • total_reported_cost
  • total_cost

total_cached_tokens is included in total_input_tokens. STATEK keeps it separate so calculated cost can apply a cached-input price when one is configured.
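The accumulation described above can be pictured with a minimal sketch. StepStats and UsageTotals are hypothetical stand-ins written for this illustration; they are not STATEK's real classes, and the sketch omits context_bytes and total_cost because this page does not define how they are derived.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass
class StepStats:
    """Hypothetical stand-in for the per-response usage fields."""
    total_bytes_sent: int = 0
    total_bytes_received: int = 0
    cost: Optional[Decimal] = None  # provider-reported cost, if any
    input_tokens: int = 0
    output_tokens: int = 0
    cached_tokens: int = 0  # already counted inside input_tokens


@dataclass
class UsageTotals:
    """Hypothetical accumulator mirroring some of the job.usage fields."""
    total_bytes_sent: int = 0
    total_bytes_received: int = 0
    total_input_tokens: int = 0
    total_output_tokens: int = 0
    total_cached_tokens: int = 0
    total_reported_cost: Optional[Decimal] = None

    def add(self, step: StepStats) -> None:
        self.total_bytes_sent += step.total_bytes_sent
        self.total_bytes_received += step.total_bytes_received
        self.total_input_tokens += step.input_tokens
        self.total_output_tokens += step.output_tokens
        self.total_cached_tokens += step.cached_tokens
        # Assumption: reported cost stays None until a response includes one.
        if step.cost is not None:
            previous = self.total_reported_cost or Decimal("0")
            self.total_reported_cost = previous + step.cost


totals = UsageTotals()
totals.add(StepStats(1000, 400, Decimal("0.002"), 250, 100, 50))
totals.add(StepStats(2000, 800, None, 500, 200, 0))

# Cached tokens are a subset of input tokens, so subtract rather than add.
non_cached = totals.total_input_tokens - totals.total_cached_tokens
print(totals.total_input_tokens, totals.total_cached_tokens, non_cached)
```

Because cached tokens are part of the input total, adding total_cached_tokens to total_input_tokens would double count them.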

Cost Resolution

job.usage.total_cost is resolved in this order:

  1. If the job has valid model pricing and nonzero input or output tokens, STATEK calculates cost from token counts.
  2. Otherwise, STATEK falls back to provider-reported cost when the provider response included one.
  3. If neither source is available, total_cost is None.

Valid pricing requires both an input price and an output price. With valid pricing, STATEK calculates cost from the accumulated token counts:

non_cached = total_input_tokens - total_cached_tokens
 
cost = (
    non_cached * input_price_per_M
    + total_cached_tokens * cached_input_price_per_M
    + total_output_tokens * output_price_per_M
) / 1_000_000

If a cached-input price is not configured, STATEK uses the normal input price for cached tokens.
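As a rough illustration, the formula and the resolution order can be combined in one sketch. The function names calculated_cost and resolve_total_cost are hypothetical and written only for this example; they are not part of the STATEK API. The prices are the illustrative ones used elsewhere on this page.

```python
from decimal import Decimal
from typing import Optional

MILLION = Decimal(1_000_000)


def calculated_cost(
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int,
    input_price_per_m: Decimal,
    output_price_per_m: Decimal,
    cached_input_price_per_m: Optional[Decimal] = None,
) -> Decimal:
    """Apply the per-million-token formula from this page."""
    # Without a cached-input price, cached tokens use the normal input price.
    if cached_input_price_per_m is None:
        cached_input_price_per_m = input_price_per_m
    non_cached = input_tokens - cached_tokens
    return (
        non_cached * input_price_per_m
        + cached_tokens * cached_input_price_per_m
        + output_tokens * output_price_per_m
    ) / MILLION


def resolve_total_cost(
    input_tokens: int,
    output_tokens: int,
    cached_tokens: int,
    input_price_per_m: Optional[Decimal],
    output_price_per_m: Optional[Decimal],
    cached_input_price_per_m: Optional[Decimal],
    reported_cost: Optional[Decimal],
) -> Optional[Decimal]:
    """Prefer calculated cost, then provider-reported cost, then None."""
    has_pricing = input_price_per_m is not None and output_price_per_m is not None
    has_tokens = input_tokens > 0 or output_tokens > 0
    if has_pricing and has_tokens:
        return calculated_cost(
            input_tokens, output_tokens, cached_tokens,
            input_price_per_m, output_price_per_m, cached_input_price_per_m,
        )
    return reported_cost  # may itself be None


# 10,000 input tokens (4,000 of them cached) and 2,000 output tokens.
print(calculated_cost(10_000, 2_000, 4_000,
                      Decimal("0.15"), Decimal("0.60"), Decimal("0.075")))
```

With those inputs the terms are 6,000 x 0.15 + 4,000 x 0.075 + 2,000 x 0.60 = 2,400, which divided by 1,000,000 gives 0.0024. Dropping the cached-input price raises the result to 0.0027, because all 10,000 input tokens are then charged at 0.15.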

Harness Token Limits

STATEK_MAX_TOKEN_USAGE is a harness guardrail. It is not calculated from provider billing tokens.

The harness uses job.approx_token_usage, which is byte-based:

approx_token_usage = (total_bytes_sent + total_bytes_received) // 4

Use this limit to stop runaway jobs and oversized interactions. For provider token counts and cost estimates, inspect job.usage.total_input_tokens, job.usage.total_output_tokens, job.usage.total_cached_tokens, and job.usage.total_cost.
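The byte-based estimate is simple enough to reproduce by hand, which can help when choosing a value for STATEK_MAX_TOKEN_USAGE. The sketch below is illustrative only: the helper names are hypothetical, and the strict greater-than comparison is an assumption, since this page does not say how the harness compares usage against the limit.

```python
def approx_token_usage(total_bytes_sent: int, total_bytes_received: int) -> int:
    """Byte-based estimate: total bytes divided by four, rounded down."""
    return (total_bytes_sent + total_bytes_received) // 4


def over_limit(total_bytes_sent: int, total_bytes_received: int,
               max_token_usage: int) -> bool:
    # Assumption: the guardrail trips only when the estimate exceeds the limit.
    return approx_token_usage(total_bytes_sent, total_bytes_received) > max_token_usage


# 120,000 bytes sent plus 40,000 bytes received is about 40,000 tokens.
print(approx_token_usage(120_000, 40_000))
print(over_limit(120_000, 40_000, max_token_usage=30_000))
```

The estimate counts raw bytes in both directions, so it can differ noticeably from the token counts a provider reports for the same job.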

job.tokens_per_sec() is also approximate. It divides recorded input plus output tokens by recorded response timing across job history, excluding steps with unknown duration.
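One way to read that description is sketched below. This is not STATEK's implementation: the step tuples are a hypothetical shape, and the sketch assumes that a step with unknown duration contributes neither tokens nor time, and that the result is None when no step has a known duration.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical step record: (input_tokens, output_tokens, duration_seconds).
Step = Tuple[int, int, Optional[float]]


def tokens_per_sec(steps: Iterable[Step]) -> Optional[float]:
    total_tokens = 0
    total_seconds = 0.0
    for input_tokens, output_tokens, duration in steps:
        # Skip steps whose duration is unknown or unusable.
        if duration is None or duration <= 0:
            continue
        total_tokens += input_tokens + output_tokens
        total_seconds += duration
    if total_seconds == 0:
        return None  # no timed steps, so no meaningful rate
    return total_tokens / total_seconds


history = [(100, 50, 1.5), (200, 100, None), (300, 150, 3.0)]
print(tokens_per_sec(history))  # 600 tokens over 4.5 seconds
```

Because input tokens are included, the figure reflects overall throughput for a job rather than generation speed alone.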

Pricing Tables

Configure a model-info directory before statek.init():

STATEK_MODEL_INFO_DIR=./model-info

During initialization, STATEK scans that directory recursively for files ending in .csv or .txt and loads rows with this header:

PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M

INPUT_PRICE_PER_M and OUTPUT_PRICE_PER_M are required for a row to load. INPUT_PRICE_PER_CACHED_M is optional. Malformed rows and rows missing required prices are skipped.

Example file with illustrative prices:

PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M
openai,,gpt-4o-mini,0.15,0.075,0.60
openrouter,openai,gpt-5-mini,0.20,0.10,0.80
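To check a pricing file before deploying it, a small validator can apply the same loading rules. The sketch below uses only the Python standard library and is not STATEK's loader: load_pricing_rows and parse_price are hypothetical helpers, and treating a row without a PROVIDER or MODEL value as malformed is an assumption. The extra rows in the sample are made up to exercise the skip rules.

```python
import csv
import io
from decimal import Decimal, InvalidOperation
from typing import Dict, Optional, Tuple

Key = Tuple[str, str, str]  # (provider, model_family, model)
Prices = Tuple[Decimal, Optional[Decimal], Decimal]  # (input, cached, output)


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Return a finite Decimal, or None for blank or unparseable cells."""
    text = (text or "").strip()
    if not text:
        return None
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None


def load_pricing_rows(text: str) -> Dict[Key, Prices]:
    table: Dict[Key, Prices] = {}
    for row in csv.DictReader(io.StringIO(text)):
        provider = (row.get("PROVIDER") or "").strip()
        family = (row.get("MODEL_FAMILY") or "").strip()
        model = (row.get("MODEL") or "").strip()
        input_price = parse_price(row.get("INPUT_PRICE_PER_M"))
        output_price = parse_price(row.get("OUTPUT_PRICE_PER_M"))
        # Both required prices must parse; the cached price is optional.
        if not provider or not model or input_price is None or output_price is None:
            continue
        cached_price = parse_price(row.get("INPUT_PRICE_PER_CACHED_M"))
        table[(provider, family, model)] = (input_price, cached_price, output_price)
    return table


SAMPLE = """\
PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M
openai,,gpt-4o-mini,0.15,0.075,0.60
openrouter,openai,gpt-5-mini,0.20,0.10,0.80
openai,,missing-input-price,,0.05,0.40
openai,,no-cached-price,1.00,,2.00
"""

for key, prices in load_pricing_rows(SAMPLE).items():
    print(key, prices)
```

Parsing prices as Decimal rather than float keeps values such as 0.075 exact, which matches the Decimal arguments used when setting pricing in application code.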

For OpenRouter-style routing, MODEL_FAMILY lets a pricing row match on provider, family, and concrete model together. For example, a row with PROVIDER=openrouter, MODEL_FAMILY=openai, and MODEL=gpt-5-mini applies to an OpenRouter model string that names the openai family and the gpt-5-mini model.

Updating Pricing

To update pricing from files, edit the model-info table and restart or reinitialize the worker so statek.init() reloads STATEK_MODEL_INFO_DIR:

PROVIDER,MODEL_FAMILY,MODEL,INPUT_PRICE_PER_M,INPUT_PRICE_PER_CACHED_M,OUTPUT_PRICE_PER_M
openai,,gpt-4o-mini,0.15,0.075,0.60

You can also set pricing in application code:

from decimal import Decimal
 
from statek.model_pricing import set_model_pricing
 
set_model_pricing(
    "openai",
    "gpt-4o-mini",
    Decimal("0.15"),
    Decimal("0.60"),
    input_price_per_cached_M=Decimal("0.075"),
)

For OpenRouter with a model family:

from decimal import Decimal
 
from statek.model_pricing import set_model_pricing
 
set_model_pricing(
    "openrouter",
    "gpt-5-mini",
    Decimal("0.20"),
    Decimal("0.80"),
    input_price_per_cached_M=Decimal("0.10"),
    model_family="openai",
)

These examples show the format and workflow only. Keep your real pricing table versioned and reviewed with the same care as provider configuration.

Inspecting Usage

For operational dashboards or job inspection, start with:

usage = job.usage
 
print(usage.total_input_tokens)
print(usage.total_output_tokens)
print(usage.total_cached_tokens)
print(usage.total_reported_cost)
print(usage.total_cost)
print(job.approx_token_usage)
print(job.tokens_per_sec())

For practical agent design:

  • keep prompts, warmup code, examples, documents, tool outputs, and console output compact
  • avoid returning raw logs or full API payloads to the model when a summary is enough
  • use lower-cost models or difficulty mappings for routine jobs
  • remember that pricing follows the concrete selected model, not just the difficulty label
  • monitor both byte-based harness usage and provider token/cost fields

Do not use STATEK metering as exact billing, provider invoice reconciliation, tenant quota enforcement, or a replacement for provider budget controls.

Related pages: Model Providers, Harness Policies, Configuration, and Operations.