STATEK Security & Sandboxing
STATEK gives agents a durable Python workspace that is sandboxed by default. That workspace can call tools, use application objects, import a limited set of modules, and mutate dbzero-backed state when your application exposes those capabilities.
Treat that as a controlled runtime, not as a complete security boundary by itself. statek.init() defaults to restricted=True, which keeps python_sandbox_mode="restricted" unless you explicitly disable it. STATEK persists execution state and history, and restricted mode limits model-written Python. Your application and deployment still own authorization, secrets, resource limits, tenant boundaries, and external side-effect policy.
Do not give an agent broad Python access to production data, credentials, networks, files, or side-effecting tools without application-level controls. STATEK restricted mode is a defense layer for executed Python; it is not tenant isolation, secrets management, process isolation, or exactly-once external side effects by itself.
What STATEK Records
STATEK can persist the execution record that makes agent work inspectable:
- job status and continuation fields
- Python locals in the job's PyEnv
- console output
- chat history, tool calls, and tool results
- errors and usage metadata where available
- references to dbzero-backed application objects
This is useful for audits and recovery decisions. It is not a substitute for application authorization logs, security monitoring, provider-side logs, backups, or external system receipts.
Restricted Mode
STATEK runs model-written Python through restricted mode by default. Restricted mode uses RestrictedPython plus STATEK policy checks before code is executed by exec_step(...) and related worker paths.
Restricted mode blocks common host escapes such as unrestricted imports, open(...), eval(...), exec(...), compile(...), globals(...), locals(...), getattr(...), type(...), dunder access, unsafe builtins, and hidden or internal tool calls that are not explicitly allowed. It also limits accepted source size and AST size before execution.
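To make the idea of pre-execution checks concrete, here is a minimal sketch of a size and name check built on the standard-library ast module. It is not STATEK's implementation and is far weaker than RestrictedPython; the function name precheck, the constants, and the reduced set of blocked names are assumptions chosen for illustration.

```python
import ast

# Hypothetical limits for illustration only.
MAX_SOURCE_BYTES = 200_000
MAX_AST_NODES = 20_000

# A subset of the names mentioned above; a real policy covers much more.
BLOCKED_NAMES = {"open", "eval", "exec", "compile", "globals", "locals", "getattr", "type"}


def precheck(source: str) -> list[str]:
    """Return policy violations found in `source`; an empty list means none were found."""
    if len(source.encode("utf-8")) > MAX_SOURCE_BYTES:
        return ["source too large"]  # reject before parsing oversized input
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return ["syntax error"]
    nodes = list(ast.walk(tree))
    problems = []
    if len(nodes) > MAX_AST_NODES:
        problems.append("too many AST nodes")
    for node in nodes:
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            problems.append(f"blocked name: {node.id}")
        elif isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            problems.append(f"dunder access: {node.attr}")
    return problems
```

A check like this runs before any code executes, so a rejected snippet never reaches the interpreter. It only illustrates the shape of the policy; it does not replace the checks that restricted mode performs.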
The default sandbox import surface is intentionally small:
- datetime
- calendar
- re
- math
- decimal
- fractions
- statistics
- collections
- itertools
- functools
- operator
- json
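An import allowlist of this kind can be pictured as a static scan of import statements. The sketch below uses only the standard-library ast module; the function name disallowed_imports is hypothetical, and the real enforcement inside STATEK may work differently.

```python
import ast

# The default import surface listed above.
ALLOWED_IMPORTS = frozenset({
    "datetime", "calendar", "re", "math", "decimal", "fractions",
    "statistics", "collections", "itertools", "functools", "operator", "json",
})


def disallowed_imports(source: str, allowed: frozenset = ALLOWED_IMPORTS) -> list[str]:
    """Return imported module names in `source` whose top-level package is not allowed."""
    rejected = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports get a leading dot, so they never match the allowlist.
            names = ["." * node.level + (node.module or "")]
        else:
            continue
        for name in names:
            if name.split(".")[0] not in allowed:
                rejected.append(name)
    return rejected
```

A static scan only sees import statements. Dynamic routes such as __import__ have to be closed separately, which is why the name and dunder restrictions described above matter alongside the allowlist.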
Initialize STATEK normally to keep restricted mode on:
import dbzero as db0
import statek
db0.init(dbzero_root="./statek-data", restricted=True)
statek.init() # restricted=True by default
statek.open_prefix("/acme/triage/dev/jobs", "rw")
When restricted mode is active, STATEK validates that initialized dbzero is also restricted. statek.open_prefix(...) forces opened prefixes to restricted=True; passing restricted=False while STATEK is restricted raises an error. Standard dbzero restricted mode limits reflection-style access to memo objects, so agent-visible objects can expose their public API without also exposing Python or dbzero metadata.
For stronger application data controls, dbzero-pro can add protected fields and data filtering predicates. Use those when different tenants, accounts, roles, or execution contexts need different field-level or row-level access to persisted objects. See dbzero-pro and dbzero Security.
Disable restricted mode only for trusted local development, or when the worker already runs inside another isolation layer that you operate yourself:
import statek
statek.init(restricted=False)
or with configuration:
STATEK_PYTHON_SANDBOX_MODE=off
Tune the built-in sandbox with:
STATEK_PYTHON_SANDBOX_MAX_SOURCE_BYTES=200000
STATEK_PYTHON_SANDBOX_MAX_AST_NODES=20000
STATEK_PYTHON_SANDBOX_ALLOWED_IMPORTS=datetime,calendar,re,math,json
STATEK_PYTHON_SANDBOX_ALLOWED_TOOLS=
STATEK_PYTHON_SANDBOX_ALLOWED_TOOLS is for explicitly allowing hidden or internal tool names in the restricted execution policy. Use it sparingly; a tool that reaches files, networks, credentials, or external systems is still a capability that needs application-level authorization.
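Because these settings can silently weaken the sandbox, a deployment may want to check them at startup. The following is a sketch of such a guard in application code, not a STATEK feature. The function name sandbox_config_problems and the approved_hidden_tools parameter are hypothetical, and the sketch assumes STATEK_PYTHON_SANDBOX_ALLOWED_TOOLS is comma-separated like the import list.

```python
import os


def sandbox_config_problems(env=os.environ, approved_hidden_tools=frozenset()) -> list[str]:
    """Return reasons this environment should not start a production worker."""
    problems = []
    if env.get("STATEK_PYTHON_SANDBOX_MODE", "").strip().lower() == "off":
        problems.append("sandbox mode is off")
    # Assumption: tool names are comma-separated, mirroring the import list format.
    raw_tools = env.get("STATEK_PYTHON_SANDBOX_ALLOWED_TOOLS", "")
    for tool in (item.strip() for item in raw_tools.split(",")):
        if tool and tool not in approved_hidden_tools:
            problems.append(f"unapproved hidden tool: {tool}")
    return problems
```

A production entry point could call this before initializing STATEK and refuse to start when the list is non-empty.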
Code Execution Boundary
Agents execute Python against the objects, functions, packages, files, and network routes available in the worker process. Restricted mode narrows the Python language, imports, and tool exposure, but it does not replace process, filesystem, network, CPU, memory, or tenant isolation.
Use process-level isolation appropriate to your deployment. At minimum, decide:
- which Python modules and application packages are available to the worker
- which files and directories are readable or writable
- whether network access is allowed, and to which hosts
- CPU, memory, disk, process, and wall-clock limits
- how runaway jobs are stopped and inspected
- how dependency installation and generated code are controlled
STATEK does not currently document a production tenant-isolation guarantee. If multiple tenants share infrastructure, design the isolation boundary outside the agent job: separate prefixes, workers, credentials, process sandboxes, network policy, and authorization checks as needed.
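As one illustration of CPU and wall-clock limits, the sketch below runs code in a child interpreter using only the standard library. It is POSIX-only, the function name run_limited is hypothetical, and it says nothing about how STATEK workers are actually launched. It also does not restrict memory, files, or network access, which need their own controls.

```python
import resource
import subprocess
import sys


def run_limited(code: str, cpu_seconds: int = 2, wall_seconds: float = 5.0) -> dict:
    """Run `code` in a child interpreter with CPU and wall-clock limits (POSIX only)."""

    def apply_limits():
        # Runs in the child before exec, so the cap applies to that process only.
        _, hard = resource.getrlimit(resource.RLIMIT_CPU)
        limit = cpu_seconds if hard == resource.RLIM_INFINITY else min(cpu_seconds, hard)
        resource.setrlimit(resource.RLIMIT_CPU, (limit, limit))

    try:
        done = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I: isolated mode
            capture_output=True,
            text=True,
            timeout=wall_seconds,  # wall-clock limit; the child is killed on expiry
            preexec_fn=apply_limits,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "stdout": "", "returncode": None}
    return {"status": "exited", "stdout": done.stdout, "returncode": done.returncode}
```

A busy loop is stopped by the CPU limit and a sleeping process by the wall-clock timeout, so the two limits cover different failure modes. Container or operating-system sandboxes usually provide stronger and broader guarantees than per-process limits like these.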
Tools and Permissions
Treat tools as capability boundaries. A visible tool should mean: this agent may consider calling this function for this job.
Prefer narrow tools:
submit_refund_request(order, reason, approved_by)
over broad tools:
run_admin_operation(anything)
For side-effecting tools, enforce permission checks in application code. Do not rely on prompt wording alone. The tool should validate current user, tenant, object ownership, approval state, and idempotency keys before writing durable state or touching an external system.
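The checks described above can be sketched as a factory that binds the caller's identity in application code and hands the agent only the narrow function. Everything here except the submit_refund_request signature is hypothetical: the Order and Caller types, the "support" role, the approvers set, and the dict standing in for durable storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    tenant_id: str
    status: str


@dataclass(frozen=True)
class Caller:
    user_id: str
    tenant_id: str
    roles: frozenset


class PermissionDenied(Exception):
    """Raised when the bound caller may not perform the action."""


def make_refund_tool(caller: Caller, approvers: frozenset, requests: dict):
    """Build a tool bound to one caller; the agent cannot change who is calling."""

    def submit_refund_request(order: Order, reason: str, approved_by: str) -> str:
        if order.tenant_id != caller.tenant_id:
            raise PermissionDenied("order belongs to another tenant")
        if "support" not in caller.roles:
            raise PermissionDenied("caller may not request refunds")
        if approved_by not in approvers:
            raise PermissionDenied("approver is not authorized")
        if order.status != "paid":
            raise ValueError("only paid orders can be refunded")
        key = f"refund:{order.order_id}"  # idempotency key: one request per order
        if key not in requests:
            requests[key] = {"reason": reason, "approved_by": approved_by, "by": caller.user_id}
        return key

    return submit_refund_request
```

Only the returned function would be exposed in the agent workspace. Calling it twice for the same order returns the same key without writing a second request, and the tenant, role, and approval checks run on every call regardless of what the prompt says.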
Hidden and internal tools are useful for adapters and plumbing. Restricted mode blocks hidden tool calls unless explicitly allowed, but hiding a tool is still not an authorization system. If a tool is reachable in the execution environment, treat it as a capability that needs its own permission checks.
STATEK-ACL can filter class attributes, properties, and methods from generated docs shown through docstr(...) and related formatting paths. Use it to reduce accidental model exposure in documentation, not as an authorization control. A hidden doc entry can still be reachable from Python if the object itself is in the runtime.
Secrets and Provider Credentials
Keep provider keys, service credentials, tokens, and broad database handles out of agent-visible locals, prompt text, warmup output, console output, examples, and documents.
Prefer environment variables or a managed secret system for provider credentials. Pass only narrow adapters or tools into the agent workspace, and design those adapters so they do not expose raw secrets through repr(...), exception messages, or printed debug output.
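A narrow adapter of this kind might look like the sketch below. The SecretStr and BillingAdapter classes and the injected send callable are hypothetical names for illustration, not part of STATEK or any provider SDK.

```python
class SecretStr:
    """Holds a secret and keeps it out of repr(), str(), and f-strings."""

    __slots__ = ("_value",)

    def __init__(self, value: str):
        self._value = value

    def reveal(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "SecretStr('***')"

    __str__ = __repr__


class BillingAdapter:
    """Hypothetical adapter: exposes one narrow operation, never the key itself."""

    def __init__(self, api_key: SecretStr, send):
        self._api_key = api_key
        self._send = send  # injected transport: callable(headers, payload)

    def __repr__(self) -> str:
        return "BillingAdapter(api_key=***)"

    def create_invoice(self, customer_id: str, amount_cents: int):
        headers = {"Authorization": "Bearer " + self._api_key.reveal()}
        payload = {"customer_id": customer_id, "amount_cents": amount_cents}
        try:
            return self._send(headers, payload)
        except Exception:
            # Raise a generic error and hide the original from the default
            # traceback, since its message may contain the headers.
            raise RuntimeError("billing request failed") from None
```

This reduces accidental exposure through printed objects and error messages. It does not make the key unreachable from unrestricted Python, so it complements restricted mode and process isolation but replaces neither.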
Provider configuration does not sandbox agent code. Model-provider keys control model calls; they do not limit what the Python process can do with application objects or external systems.
Side Effects
External side effects need application policy:
- durable request or receipt IDs
- idempotency keys
- current-state checks before writes
- explicit approval gates for sensitive actions
- retries that can tolerate duplicate delivery
- compensating actions when reversal is possible
- audit logs that connect user intent, approval, tool call, and external receipt
STATEK can show what the agent ran and where it paused. It does not guarantee that an email, payment, API request, file write, or external calendar mutation is reversible, exactly-once, or deterministic.
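The idempotency and retry items above can be combined in a small wrapper. This is a sketch under stated assumptions: send_once is a hypothetical helper, the receipts dict stands in for durable storage, and ConnectionError stands in for whatever transient failure the external system raises.

```python
def send_once(receipts: dict, idempotency_key: str, action, *, max_attempts: int = 3):
    """Run `action` for a key unless a receipt for that key is already recorded."""
    if idempotency_key in receipts:
        return receipts[idempotency_key]  # already done; do not repeat the side effect
    last_error = None
    for _ in range(max_attempts):
        try:
            # Forward the key so the external system can deduplicate retries too.
            receipt = action(idempotency_key)
        except ConnectionError as exc:
            last_error = exc
            continue
        receipts[idempotency_key] = receipt
        return receipt
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error
```

This is not exactly-once delivery. A crash after the action succeeds but before the receipt is stored can still cause a duplicate on the next run, which is why the key is forwarded to the external system and why retries must tolerate duplicates.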
Prompt and Warmup Safety
Prompts, examples, documents, and warmup code change the code an agent is likely to write. Review them like executable configuration.
Keep secrets out of prompt files. Keep warmup output focused on the task. Avoid dumping raw user profiles, credential-bearing objects, large private records, or unrelated tenant data into the job history.
If a dispatcher or parent job creates child jobs, pass only the context and tools the child needs. A child job has its own durable history and should not receive broad parent state by default.
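Passing only what a child needs can be as simple as an explicit selection step. The helper below is a hypothetical sketch in plain Python; it does not use or describe any STATEK job-creation API, and the dict shapes are assumptions.

```python
def child_context(parent_locals: dict, parent_tools: dict, *, needs: set, tools: set) -> dict:
    """Select only the named locals and tools for a child job."""
    missing = (needs - set(parent_locals)) | (tools - set(parent_tools))
    if missing:
        raise KeyError(f"not available in parent: {sorted(missing)}")
    return {
        "locals": {name: parent_locals[name] for name in sorted(needs)},
        "tools": {name: parent_tools[name] for name in sorted(tools)},
    }
```

Naming each item explicitly makes the child's inputs reviewable and keeps unrelated parent state out of the child's durable history.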
What STATEK Does Not Provide By Itself
STATEK does not, by itself, provide:
- complete process, filesystem, network, CPU, or memory isolation
- tenant isolation guarantees
- secrets management
- authorization policy
- data-loss prevention
- exactly-once external side effects
- deterministic replay of third-party systems
- provider quota or spend controls
- backups and retention policy
Those controls belong in your application, infrastructure, and operational process.
Where to go next
Read Tools for capability design, Configuration for provider keys and limits, Operations for worker deployment boundaries, and Replay and Recovery for side-effect recovery patterns.