STATEK Overview

STATEK stands for Stateful Agent Execution Kit. It gives an LLM agent a durable Python workspace.

The simplest way to understand STATEK is this:

A job starts with useful Python objects in context. The application can supply them directly, or an agent/dispatcher can pick a queued task and prepare a new job with the right context.
The agent writes and executes normal Python code.
That code can call exposed tools, use connected application objects, and work with the Python codebase you make available to it.
STATEK persists the job's Python variables, console output, chat history, tool calls, status, errors, and continuation state.
Many jobs can exist and move through this loop independently in one process, without their Python locals colliding.
dbzero persists the application objects themselves as normal Python classes.

💡

What developers implement. You provide an agent prompt, task context, allowed tools or objects, dbzero memo classes for durable state, and runtime controls. The agent combines those pieces with normal Python, so a calendar workflow can expose a calendar object and safe methods instead of a graph edge for every scheduling action.

That is the core distinction from many classic agent frameworks. The agent is not only choosing from a list of JSON-shaped tools. It can run Python steps against real Python objects and keep working with the variables it creates. At the same time, STATEK can coordinate large fleets of independent jobs, including many waiting or ready jobs on a single machine, while limiting how many execute at once.

STATEK is licensed under Apache-2.0. Commercial and internal use, modification, and redistribution are allowed, provided the required license and copyright notices are kept.

How it works

A STATEK job is a durable Python session for an agent. Local variables can be put into that session before the model starts working.

There are two common ways that happens:

Application code creates the job and supplies the task context.
A coordinator, dispatcher, or other agent picks a queued task, such as a user message, and creates a new job with the relevant objects in context.

For example, a scheduling agent might start with these variables already visible:

user      # current user object
message   # queued user message being handled
calendar  # user's calendar object
today     # current date
timestamp # message or request time

Those names are not special STATEK APIs. They are just Python variables in the job's local context. They can point to dbzero objects, application service objects, functions, adapters, or other values your application chooses to expose.

The agent can then run ordinary Python:

from datetime import datetime
 
now = datetime.now()
print(now)

It can inspect the objects that were already in context:

print(user.name)
print(calendar.events_for(today))

It can call methods and mutate application objects:

meeting = calendar.find_meeting("Weekly planning", day=today)
empty_slot = calendar.find_empty_slot(
    after=meeting.ends_at,
    duration=meeting.duration,
)
 
meeting.move_to(empty_slot)
print(f"Moved to {empty_slot}")

And later, after another turn or a worker restart, the job can still have the variables from earlier steps:

print(user.name)
print(calendar.events_for(today))
print(meeting.starts_at)
print(empty_slot)

These snippets are conceptual. The calendar object and its methods (events_for, find_meeting, find_empty_slot), as well as meeting.move_to, are application code, not STATEK APIs. STATEK's job is to run the agent's Python durably and keep the execution state inspectable.
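To make the snippets above concrete, here is a minimal in-memory sketch of the kind of application code they assume. Everything in it is hypothetical: the Slot, Meeting, and Calendar classes, their fields, and the slot-finding rule are illustrations only, and it uses plain dataclasses rather than dbzero or STATEK.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta


@dataclass
class Slot:
    starts_at: datetime
    ends_at: datetime


@dataclass
class Meeting:
    title: str
    starts_at: datetime
    ends_at: datetime

    @property
    def duration(self) -> timedelta:
        return self.ends_at - self.starts_at

    def move_to(self, slot: Slot) -> None:
        self.starts_at = slot.starts_at
        self.ends_at = slot.ends_at


class Calendar:
    """Hypothetical application class; not part of STATEK."""

    def __init__(self, meetings):
        self.meetings = list(meetings)

    def events_for(self, day: date):
        return [m for m in self.meetings if m.starts_at.date() == day]

    def find_meeting(self, title: str, day: date) -> Meeting:
        for m in self.events_for(day):
            if m.title == title:
                return m
        raise LookupError(f"no meeting titled {title!r} on {day}")

    def find_empty_slot(self, after: datetime, duration: timedelta) -> Slot:
        # Walk meetings in start order and push the candidate start
        # past every meeting that overlaps the candidate window.
        start = after
        for m in sorted(self.meetings, key=lambda m: m.starts_at):
            if m.starts_at < start + duration and m.ends_at > start:
                start = m.ends_at
        return Slot(start, start + duration)


today = date(2024, 1, 8)
calendar = Calendar([
    Meeting("Weekly planning", datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 11, 0)),
    Meeting("Standup", datetime(2024, 1, 8, 11, 0), datetime(2024, 1, 8, 11, 30)),
])

# The same steps the agent ran in the snippets above.
meeting = calendar.find_meeting("Weekly planning", day=today)
empty_slot = calendar.find_empty_slot(after=meeting.ends_at, duration=meeting.duration)
meeting.move_to(empty_slot)
print(f"Moved to {empty_slot.starts_at:%H:%M}")
```

In a real deployment these classes would more likely be dbzero memo classes, so that the moved meeting is durable application state rather than an in-memory value.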

Why dbzero matters

💡

dbzero lets application code and STATEK agents work against the same durable Python object graph. You define durable classes with @db0.memo, then read and write attributes directly, without building artificial API surfaces or serialization layers just so the agent can see application state.

Application code can inspect STATEK objects such as agents, jobs, queues, local state, console output, and history. In the other direction, an agent job can receive application objects in its local context and use them like ordinary Python objects, with the methods and permissions your application chooses to expose.

import dbzero as db0

# @db0.memo marks Meeting as a durable dbzero class.
@db0.memo
class Meeting:
    def __init__(self, title, starts_at, ends_at):
        self.title = title
        self.starts_at = starts_at
        self.ends_at = ends_at

    def move_to(self, slot):
        # Plain attribute writes update the durable object directly.
        self.starts_at = slot.starts_at
        self.ends_at = slot.ends_at

From the agent's point of view, this is just Python:

meeting.move_to(empty_slot)

From the application's point of view, the mutation is durable application state. There is no separate translation step where the agent has to emit a database command, JSON patch, or workflow event just to update the object. You still decide what objects, methods, tools, credentials, and permissions the agent receives.

What STATEK persists

STATEK persists the agent execution state that matters when work spans multiple steps:

Python locals: variables such as user, calendar, today, meeting, and empty_slot
Console output: what the agent's Python printed
Chat history: user messages, LLM responses, warmup entries, and subtask notifications
Tool history: tool calls, tool results, and tool errors
Job status: READY, WARMING_UP, STARTED, SUSPENDED, or DONE
Continuation state: where execution should resume after waiting for a future, callback, or external event
Usage and errors: model usage, cost information where available, and recorded failures

Because this state is stored through dbzero, it can be inspected as application data instead of reconstructed from logs.

job.status
job.py_env.local_state["meeting"]
job.py_env.local_state["empty_slot"]
job.py_env.console
job.chat_log
job.error
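As a rough illustration of treating this state as application data, the sketch below collects a few of those fields into a plain dictionary. The attribute names come from the lines above. The summarize_job helper, the stand-in job built with SimpleNamespace, and the value types (a dict for local_state, a string for console) are assumptions made so the example runs without STATEK installed.

```python
from types import SimpleNamespace


def summarize_job(job, names):
    """Collect selected job fields into a plain dict (hypothetical helper)."""
    local_state = job.py_env.local_state
    return {
        "status": str(job.status),
        "locals": {name: local_state[name] for name in names},
        "console": job.py_env.console,
        "error": job.error,
    }


# Stand-in object with the same attribute layout as the lines above.
# A real job would come from STATEK, not from SimpleNamespace.
fake_job = SimpleNamespace(
    status="SUSPENDED",
    py_env=SimpleNamespace(
        local_state={"meeting": "Weekly planning", "empty_slot": "11:30"},
        console="Moved to 11:30\n",
    ),
    chat_log=[],
    error=None,
)

report = summarize_job(fake_job, ["meeting", "empty_slot"])
print(report)
```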

Many jobs in one process

STATEK is designed to orchestrate many independent agent jobs from one process. A job is its own durable Python session: it has its own status, PyEnv, local variables, console output, chat log, tool log, continuation state, and errors.

That means one scheduling job can have:

user = alice
calendar = alice.calendar
meeting = alice_weekly_planning

while another job has:

user = sam
calendar = sam.calendar
meeting = sam_customer_call

Those variables do not collide, because they live in different job local states. The runner can scan persisted jobs, unsuspend jobs whose waiting conditions are ready, and process ready jobs concurrently up to the configured concurrency limit.
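The isolation idea can be sketched with plain Python namespaces. This is not how STATEK is implemented: the run_step helper and the in-memory dicts are assumptions for illustration, and there is no persistence here. It only shows that code run against separate namespaces can reuse the same variable names without interfering, and that each namespace keeps what earlier steps created.

```python
def run_step(job_locals: dict, code: str) -> None:
    """Run one step of code inside a job's own namespace (hypothetical helper)."""
    exec(code, job_locals)


# Two independent "jobs", each with its own starting context.
alice_job = {"user": "alice", "meeting": "alice_weekly_planning"}
sam_job = {"user": "sam", "meeting": "sam_customer_call"}

# Both jobs run the same code and create a variable with the same name.
run_step(alice_job, "summary = f'{user}: {meeting}'")
run_step(sam_job, "summary = f'{user}: {meeting}'")

# A later step in one job sees that job's earlier variables only.
run_step(alice_job, "summary = summary.upper()")

print(alice_job["summary"])
print(sam_job["summary"])
```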

In practice, this means a single machine and a single process can coordinate large numbers of active jobs, including thousands of persisted waiting or ready jobs, while executing a controlled number at the same time. Actual throughput depends on model latency, tool latency, CPU, memory, dbzero storage, provider limits, and the max_concurrency setting.
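The general pattern of many pending jobs with a bounded number executing at once can be sketched with asyncio. This is not the STATEK runner. The run_jobs function, the sleep that stands in for model and tool latency, and the counters are all assumptions; only the max_concurrency name is borrowed from the text above.

```python
import asyncio


async def run_jobs(job_ids, max_concurrency):
    """Run every job, but let at most max_concurrency execute at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    running = 0
    peak = 0
    finished = []

    async def run_one(job_id):
        nonlocal running, peak
        async with semaphore:  # waits here while the limit is reached
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for model or tool latency
            running -= 1
            finished.append(job_id)

    await asyncio.gather(*(run_one(job_id) for job_id in job_ids))
    return finished, peak


finished, peak = asyncio.run(run_jobs(range(20), max_concurrency=4))
print(f"finished {len(finished)} jobs, peak concurrency {peak}")
```

All twenty jobs exist from the start, but the semaphore keeps the number executing at any moment at or below the limit.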

Where STATEK fits

Use STATEK when the agent's work is not just a chat turn. It fits systems where the agent needs a durable Python workspace connected to real application state, especially when the work is data-intensive, multi-step, or coordinated across many agents and jobs.

The strongest fit is data-intensive agent work. STATEK is built on dbzero, so the agent can analyze and mutate durable Python object graphs directly. That matters when the agent needs to search, filter, aggregate, compare, update, and explain work over application state instead of only passing text through a model call.

Good fits include:

data-heavy agents that operate on large or evolving dbzero object graphs
complex analytics where intermediate datasets, calculations, selections, and explanations need to persist
agentic fleets where dispatchers, coordinators, specialists, and child jobs all keep durable state
complex context management where each job carries its own user, message, calendar, retrieved objects, derived variables, and follow-up state
systems with many independent durable jobs running, waiting, or resuming at the same time
tools and application methods that return rich Python objects, not only strings
workflows where agents inspect state, decide what to change, make the change, and leave an audit-friendly execution trail
human-in-the-loop or external-event workflows that pause and continue with the same Python variables
operational systems where teams need to inspect what the agent ran, printed, changed, delegated, or waited for

STATEK is not a universal replacement for every agent framework. For a single stateless prompt-response call, a direct model SDK is simpler. STATEK becomes useful when the agent's Python workspace, context, intermediate results, and application state need to survive beyond one request.

Safety boundaries

STATEK agents can execute Python code against the tools, objects, and code paths you expose. That is powerful, so treat it as a controlled runtime.

In production, use sandboxing, permissioning, secrets discipline, resource limits, approval gates for sensitive actions, and idempotency for external effects. Give each agent only the objects, methods, credentials, and data it needs for the job. See Security for the launch safety checklist.
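One way to give a job only the methods it needs is to hand it a thin view over the real object. The sketch below is a hypothetical pattern, not a STATEK feature: ExposedView and the toy Calendar class are invented for illustration. It narrows the visible surface but is not a sandbox, since Python introspection can reach past a wrapper like this, so it complements the controls above rather than replacing them.

```python
class ExposedView:
    """Forward only allow-listed attribute names to the wrapped object."""

    def __init__(self, target, allowed):
        self._target = target
        self._allowed = frozenset(allowed)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for forwarded names.
        if name.startswith("_") or name not in self._allowed:
            raise AttributeError(f"{name!r} is not exposed to this job")
        return getattr(self._target, name)


class Calendar:
    """Toy application object with one safe and one destructive method."""

    def events_for(self, day):
        return [f"events on {day}"]

    def delete_all(self):
        raise RuntimeError("destructive call reached the real object")


# The job would receive the view, not the Calendar itself.
calendar = ExposedView(Calendar(), allowed={"events_for"})
print(calendar.events_for("2024-01-08"))

blocked = None
try:
    calendar.delete_all()
except AttributeError as exc:
    blocked = str(exc)
    print(blocked)
```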

⚠️

Durable history is not deterministic replay. STATEK can persist job state and operational history, but external APIs, files, emails, payments, and other side effects still need application-level recovery rules.
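As one example of an application-level recovery rule, an external effect can be keyed by its action and payload so that a retried step does not repeat it. The sketch below is a minimal illustration under stated assumptions: EffectLedger, run_once, and send_email are hypothetical names, and the ledger lives in memory, whereas a real one would need to be durable to survive a restart.

```python
import hashlib
import json


class EffectLedger:
    """Remember results of external effects by idempotency key (hypothetical)."""

    def __init__(self):
        self._results = {}

    def run_once(self, action, payload, effect):
        # Derive a stable key from the action name and its JSON payload.
        key_source = json.dumps({"action": action, "payload": payload}, sort_keys=True)
        key = hashlib.sha256(key_source.encode("utf-8")).hexdigest()
        if key not in self._results:
            # Recorded only after the effect returns, so a failure can be retried.
            self._results[key] = effect(payload)
        return self._results[key]


sent = []


def send_email(payload):
    # Stand-in for a real external call.
    sent.append(payload["to"])
    return f"sent to {payload['to']}"


ledger = EffectLedger()
request = {"to": "alice@example.com", "subject": "Meeting moved"}

first = ledger.run_once("send_email", request, send_email)
second = ledger.run_once("send_email", request, send_email)  # retried step
print(first, len(sent))
```

The second call returns the recorded result instead of sending again. Whether an identical payload should count as the same effect is an application decision, and many systems add an explicit request identifier to the key.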

Read Quickstart to create a minimal durable job. Read Core Concepts to see how agents, job definitions, jobs, Python locals, chat logs, tools, futures, and subtasks fit together.
