STATEK Callbacks and Interruptions
STATEK jobs do not have to finish in one uninterrupted run.
A job can make progress, persist its Python state, wait for something outside the process, and continue later. That outside event might be a user message, a human approval, a webhook, a completed background operation, or a child job reporting back to its parent.
The important idea is simple: interruption does not mean starting over. The job keeps its durable Python workspace.
user
calendar
meeting
proposed_slot
When external input arrives, the same job can continue with those objects still available:
if approval.accepted:
    meeting.move_to(proposed_slot)
    print(meeting.starts_at)
else:
    print("No change made.")
What Counts as an Interruption
An interruption is any moment where the agent cannot or should not continue immediately.
Common examples:
- the user sends another message into an existing job
- a human reviewer approves or rejects an action
- a webhook reports that an external system finished work
- a child job completes and notifies its parent
- a dispatcher receives an event and creates a new job with relevant context
- a future or external condition is not ready yet
In classic agent frameworks, this often becomes application-specific control flow around model calls. In STATEK, the job itself is durable. The interruption can become part of the job's persisted history and continuation state.
User Messages Into an Existing Job
A completed or active job can receive new input. Conceptually:
# Existing durable job
user
calendar
last_answer
# Later, the user sends another message
message = "Move it one hour later instead."
STATEK can push the message into the job. Depending on chat style, that input is stored in the chat history or in the job's PyEnv.push_log. If the job was DONE, the new message can restart it so the agent can continue from the same durable context.
That means the agent can respond to the follow-up without rediscovering everything:
new_slot = calendar.find_empty_slot(after=meeting.starts_at)
meeting.move_to(new_slot)
The previous meeting variable can still be meaningful because it belongs to the job's persisted Python state.
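The push-and-restart behavior described above can be sketched in plain Python. This is a toy model, not STATEK's API: `ToyJob`, `Status`, and `push_message` are hypothetical names, and the `push_log` attribute only mirrors the `PyEnv.push_log` name mentioned in the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    STARTED = "STARTED"
    DONE = "DONE"


@dataclass
class ToyJob:
    """Hypothetical stand-in for a durable job; not STATEK's real class."""
    status: Status = Status.STARTED
    workspace: dict = field(default_factory=dict)  # persisted Python names
    push_log: list = field(default_factory=list)   # inputs pushed into the job


def push_message(job: ToyJob, text: str) -> None:
    """Record a follow-up message and restart the job if it had finished."""
    job.push_log.append(text)
    if job.status is Status.DONE:
        job.status = Status.STARTED  # continue from the same workspace


# A finished job still holds its earlier state when the follow-up arrives.
job = ToyJob(status=Status.DONE, workspace={"meeting": "Weekly planning"})
push_message(job, "Move it one hour later instead.")
```

The point of the sketch is that the restart touches only the status and the input log; the workspace is left untouched, so earlier names remain usable.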
External Events Into New Jobs
Sometimes an event should not continue an existing job. It should start a new one.
For example, a dispatcher or queue worker might receive an event:
event
Then it prepares a new job with useful local context:
user = event.user
message = event.message
timestamp = event.created_at
calendar = user.calendar
The working agent starts with those objects already visible. This is the other side of the same model: external events can either resume existing durable work or create a fresh durable Python workspace for new work.
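As a minimal sketch of that preparation step, a dispatcher could collect the initial names into a dictionary before handing them to a new job. The `User` and `Event` classes and the `build_job_context` helper are assumptions made for illustration; how STATEK actually receives the context is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class User:
    name: str
    calendar: list  # stand-in for a real calendar object


@dataclass
class Event:
    user: User
    message: str
    created_at: datetime


def build_job_context(event: Event) -> dict:
    """Collect the names a new job should start with (hypothetical helper)."""
    return {
        "user": event.user,
        "message": event.message,
        "timestamp": event.created_at,
        "calendar": event.user.calendar,
    }


event = Event(
    user=User("ada", ["Weekly planning"]),
    message="Reschedule planning",
    created_at=datetime(2024, 1, 8, 9, 0, tzinfo=timezone.utc),
)
context = build_job_context(event)
```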
Human Approval
Human review is a natural callback use case.
Conceptually, the agent can prepare a proposed action:
meeting = calendar.find_meeting("Weekly planning", day=today)
proposed_slot = calendar.find_empty_slot(after=meeting.ends_at)
summary = describe_move(meeting, proposed_slot)
Then the job waits for approval instead of performing the side effect immediately:
approval = wait_for_human_approval(user, summary)
When the approval arrives, the job continues:
if approval.accepted:
    meeting.move_to(proposed_slot)
else:
    print(approval.reason)
The exact approval transport is application-owned. It might be a web UI, an email link, a chat message, or a webhook. STATEK's role is to keep the job state durable and make continuation explicit.
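One way to picture the approval flow is as a durable record that holds the proposed side effect until a decision arrives. The sketch below is an assumption-laden illustration: `PendingMove`, `Approval`, and `resume_with_approval` are hypothetical names, and strings stand in for real meeting and slot objects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Approval:
    accepted: bool
    reason: str = ""


@dataclass
class PendingMove:
    """Durable record of a proposed side effect awaiting review (hypothetical)."""
    meeting: str
    proposed_slot: str
    status: str = "pending"            # pending -> approved | rejected
    applied_slot: Optional[str] = None


def resume_with_approval(move: PendingMove, approval: Approval) -> str:
    """Continue the work once a decision arrives; later decisions are ignored."""
    if move.status != "pending":
        return f"ignored: already {move.status}"
    if approval.accepted:
        move.applied_slot = move.proposed_slot  # the side effect happens only here
        move.status = "approved"
        return f"moved to {move.proposed_slot}"
    move.status = "rejected"
    return approval.reason or "No change made."


move = PendingMove("Weekly planning", "Tue 15:00")
first = resume_with_approval(move, Approval(accepted=True))
second = resume_with_approval(move, Approval(accepted=False, reason="too late"))
```

Keeping the status on the record, rather than in process memory, is what lets a second or late decision be recognized and ignored.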
Webhooks and External Systems
External systems often report completion later:
request = start_export(dataset)
The export service might call back minutes later with:
export_result
Your application can then deliver that result to the relevant job or create a new job to handle it. The key design choice is whether the event belongs to existing work or starts new work.
Use an existing job when the event completes a task already in progress:
report = build_report(export_result)
print(report.url)
Use a new job when the event is a new unit of work:
user = export_result.owner
dataset = export_result.dataset
timestamp = export_result.completed_at
Suspended Jobs
Some waits are represented as suspended jobs.
When Python execution reaches a future-like value that is not ready, STATEK can store:
- awaited_result: what the job is waiting for
- next_instr_num: where execution should continue
- status: SUSPENDED
Later, the job loop checks whether the awaited condition is ready. If it is ready, the job moves back to STARTED and continues from the stored point.
This is lower-level continuation machinery. Use it to understand how durable waiting works; detailed futures behavior is covered separately.
For the advanced API behind these waits, see Temporal Tools. Temporal tools are useful for bounded Python-level waits, but callbacks and events should remain the default shape for product-facing interruptions.
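The suspend-and-resume cycle described in this section can be illustrated with a toy job loop. It reuses the field names from the text (`awaited_result`, `next_instr_num`, and the `SUSPENDED` and `STARTED` statuses), but `ToyFuture`, `ToyJob`, and `run_job` are hypothetical and say nothing about how STATEK implements its loop.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


class ToyFuture:
    """A value that may not be ready yet (hypothetical stand-in)."""
    def __init__(self) -> None:
        self.ready = False
        self.value: Any = None

    def resolve(self, value: Any) -> None:
        self.ready = True
        self.value = value


@dataclass
class ToyJob:
    steps: List[Callable[[dict], Optional[ToyFuture]]]
    env: dict = field(default_factory=dict)
    status: str = "STARTED"
    next_instr_num: int = 0
    awaited_result: Optional[ToyFuture] = None


def run_job(job: ToyJob) -> None:
    """Run steps until the job finishes or hits a future that is not ready."""
    if job.status == "SUSPENDED":
        if not job.awaited_result.ready:
            return                      # still waiting; nothing to do
        job.env["result"] = job.awaited_result.value
        job.awaited_result = None
        job.status = "STARTED"
    while job.status == "STARTED":
        if job.next_instr_num >= len(job.steps):
            job.status = "DONE"
            break
        step = job.steps[job.next_instr_num]
        job.next_instr_num += 1         # remember where to continue
        pending = step(job.env)
        if pending is None:
            continue
        if pending.ready:
            job.env["result"] = pending.value
        else:
            job.awaited_result = pending
            job.status = "SUSPENDED"    # loop condition ends the run


export = ToyFuture()

def start_export(env: dict) -> ToyFuture:
    env["started"] = True
    return export                       # not ready yet, so the job suspends

def build_report(env: dict) -> None:
    env["report"] = f"report for {env['result']}"

job = ToyJob(steps=[start_export, build_report])
run_job(job)                            # runs step 0, then suspends
state_while_waiting = (job.status, job.next_instr_num)
run_job(job)                            # still suspended, no progress
export.resolve("export-42")
run_job(job)                            # resumes at step 1 and finishes
```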
Parent and Child Jobs
Delegated work is another kind of interruption.
A parent job can hand work to a child job and continue later when the child reports completion. Conceptually:
analysis = ask_specialist(dataset)
The parent should not block the whole process while the child works. The child job has its own durable Python workspace and history. When it finishes, the parent receives a notification and can continue:
if analysis.ok:
    report.add_section(analysis.result)
else:
    report.add_note(analysis.error)
The full delegation API belongs on the Subtasks page. The callback idea is that child completion is just another durable event that can resume parent work.
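A rough sketch of child completion acting as an event that resumes the parent follows. `ChildResult`, `ParentJob`, and `notify_parent` are hypothetical names chosen for illustration; the real delegation API is documented on the Subtasks page and may look quite different.

```python
from dataclasses import dataclass, field


@dataclass
class ChildResult:
    ok: bool
    result: str = ""
    error: str = ""


@dataclass
class ParentJob:
    """Toy parent that waits on a child instead of blocking (hypothetical)."""
    status: str = "SUSPENDED"
    sections: list = field(default_factory=list)
    notes: list = field(default_factory=list)


def notify_parent(parent: ParentJob, analysis: ChildResult) -> None:
    """Deliver the child's outcome and let the parent continue."""
    if analysis.ok:
        parent.sections.append(analysis.result)
    else:
        parent.notes.append(analysis.error)
    parent.status = "STARTED"


parent = ParentJob()
notify_parent(parent, ChildResult(ok=False, error="dataset was empty"))
```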
Do Not Block the Process
Do not model long waits as sleeping Python code:
# Avoid this shape for long external waits
time.sleep(3600)
That ties up runtime resources and does not express what the job is waiting for.
Instead, make the wait explicit:
approval = wait_for_human_approval(user, summary)
or handle the next event when it arrives:
message = incoming_user_message
The practical rule: if a wait depends on a person, external service, child job, webhook, or queue, turn it into durable state plus a later event.
Failure and Timeout Policy
Callbacks and interruptions need ordinary distributed-system discipline.
Design for:
- duplicate events
- late events
- missing events
- deleted or completed jobs
- stale approvals
- external systems retrying webhooks
- side effects that may already have happened
Make callback handlers idempotent. Store external event IDs when possible. Check current application state before writing. Prefer durable status fields such as pending, approved, rejected, completed, or expired over relying only on transient process memory.
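A minimal sketch of an idempotent callback handler, assuming each external event carries a unique ID. `ApprovalRecord` and `handle_callback` are hypothetical names; in a real system the record would live in durable storage rather than in a plain object.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalRecord:
    """Durable status plus the external event IDs already processed."""
    status: str = "pending"   # pending | approved | rejected
    seen_event_ids: set = field(default_factory=set)


def handle_callback(record: ApprovalRecord, event_id: str, decision: str) -> str:
    """Apply a decision at most once, even if the sender retries."""
    if decision not in {"approved", "rejected"}:
        raise ValueError(f"unknown decision: {decision!r}")
    if event_id in record.seen_event_ids:
        return "duplicate"                 # a retry of an event already handled
    record.seen_event_ids.add(event_id)
    if record.status != "pending":
        return "stale"                     # state already moved on
    record.status = decision
    return "applied"


record = ApprovalRecord()
outcomes = [
    handle_callback(record, "evt-1", "approved"),
    handle_callback(record, "evt-1", "approved"),   # webhook retry
    handle_callback(record, "evt-2", "rejected"),   # late, conflicting event
]
```

Checking the stored status before writing is what keeps the late `evt-2` from overturning a decision that was already applied.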
Timeouts are application policy. A job can persist that it is waiting, but your application should decide what happens if the answer never arrives: remind a user, escalate, cancel, or mark the work expired.
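Because the policy is application-owned, it can be written as an ordinary function over the persisted wait start time. The thresholds and the `timeout_action` name below are illustrative assumptions, not STATEK defaults.

```python
from datetime import datetime, timedelta, timezone


def timeout_action(
    waiting_since: datetime,
    now: datetime,
    remind_after: timedelta = timedelta(hours=24),
    expire_after: timedelta = timedelta(days=7),
) -> str:
    """Decide what to do about a wait that has not been answered yet."""
    waited = now - waiting_since
    if waited >= expire_after:
        return "expire"
    if waited >= remind_after:
        return "remind"
    return "keep_waiting"


started = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
after_one_hour = timeout_action(started, started + timedelta(hours=1))
after_two_days = timeout_action(started, started + timedelta(days=2))
after_eight_days = timeout_action(started, started + timedelta(days=8))
```

A periodic worker could call a function like this for every waiting job and then send the reminder, escalate, or mark the work expired.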
Callbacks resume durable state, but they do not make external side effects deterministic or automatically safe. Production systems need authentication, authorization, idempotency, audit logs, secrets discipline, resource limits, and explicit approval rules. See Security for the runtime boundary.
How to Think About the API Boundary
STATEK provides the durable job model, push/event processing paths, suspended-job continuation fields, and parent notification mechanics. Your application owns the public product surface around those mechanisms: web routes, queue consumers, approval screens, webhook validation, and user permissions.
At the conceptual level, all interruption paths do one of two things:
- deliver new information to an existing job
- create a new job with the right Python objects in context
Both paths preserve the core STATEK model: agents execute Python, jobs keep durable state, and dbzero-backed objects are ordinary Python application state.
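The two paths can be summarized as a small routing function. This is a sketch under stated assumptions: jobs are plain dictionaries, events carry an optional `job_id`, and `route_event` is a hypothetical name. An event that names an unknown job is rejected rather than silently turned into new work, which leaves the deleted-job case to application policy.

```python
def route_event(jobs: dict, event: dict) -> str:
    """Deliver an event to an existing job or create a new one."""
    job_id = event.get("job_id")
    if job_id is None:
        # New unit of work: start a fresh workspace from the event payload.
        new_id = f"job-{len(jobs) + 1}"
        jobs[new_id] = {"inbox": [], "context": dict(event["payload"])}
        return new_id
    if job_id not in jobs:
        # The job may have been deleted; let the caller decide what to do.
        raise KeyError(f"unknown job: {job_id}")
    jobs[job_id]["inbox"].append(event["payload"])
    return job_id


jobs = {}
created = route_event(jobs, {"payload": {"user": "ada"}})
resumed = route_event(jobs, {"job_id": created, "payload": {"approved": True}})
```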
Where to go next
Read Temporal Tools for advanced Python-level waits, Futures for lower-level durable waiting concepts, Subtasks for child-job notifications, Operations for worker loops, and Replay and Recovery for side-effect recovery policy.