Roadmap¶
Note: this is not a confused-with
pydantic-ai-harnessproject —loopletis a framework-agnostic loop library. See README.md for the full positioning.
This document describes what looplet will and will not become.
Dates are aspirational; the only firm commitment is the v1.0 API
contract.
Guiding principles¶
- One thing well. The core product is the iterator-first tool-calling loop. Anything that dilutes that focus is out of scope.
- Composition over configuration. New behaviour ships as hooks or
protocols, not as flags on
LoopConfig. - Boring dependencies. Core runtime has zero third-party packages — the standard library only. New features land in optional extras or separate packages.
- Frozen public surface, fluid internals. Once a symbol is in
looplet/__init__.py, breaking it requires a major bump.
Current status — 0.1.x (Beta)¶
- Composable sync + async loop, hooks as
Protocolobjects - Tool registry with JSON-schema rendering and concurrent batching
- Fail-closed permission engine with ALLOW/DENY/ASK rules
- Checkpoint + resume, cooperative cancellation, multi-block messages
- Anthropic + OpenAI backends (sync, async, streaming)
- Provenance capture (LLM prompts + trajectories)
- pytest-style eval framework with CLI runner
- MCP tool adapter + skills bundles
- Decorator-first tool construction via
@toolandtools_from() - Native-tool protocol probing for OpenAI-compatible proxy mismatches
looplet doctordiagnostics for local setup and backend checks
A+ polish track¶
The next product goal is to make looplet feel obvious from the first GitHub page through the first custom agent. The short pitch should stay consistent everywhere:
looplet exposes the agent loop as an iterator, makes every step observable, and lets users compose behavior with hooks.
Custom-agent example to lead with¶
Leads with Dependency Doctor: an agent that audits a repository's
dependency files for security, license, and maintenance risk, then
produces a report card. It is more memorable than hello-world, useful
to most developers, and demonstrates looplet's differentiation: every
lookup, warning, and conclusion is visible as a Step that users can
log, gate, replay, or evaluate.
Second-line demos:
- Git Detective for repository-health analysis from commit history.
- Threat Intel Briefing for local-first security analysis.
- Coder as a reference implementation for tool-heavy agents.
API consolidation¶
Keep the low-level modules, but make the common path feel smaller:
- One front-door import story:
loopletfor essentials, submodules for advanced internals. - Group production features into opinionated bundles:
debugging,safety,coding, andresearchpresets that assemble hooks, memory, compaction, provenance, and permissions with sane defaults.
Tool construction¶
Manual ToolSpec(...) construction should remain supported, but the
happy path should be decorator-first:
from looplet import tool, tools_from
@tool(description="Search the docs by keyword.", concurrent_safe=True)
def search_docs(query: str, limit: int = 5) -> dict:
return {"results": search(query, limit)}
tools = tools_from([search_docs])
The decorator should infer JSON Schema from type hints, mark parameters
with defaults as optional, use docstrings as descriptions when no
description is provided, preserve ctx injection, and still return a
plain ToolSpec so advanced users can inspect or mutate it.
Near-term (0.2 — ~1 month out)¶
- Production bundles — opinionated
debugging,safety,coding, andresearchpreset bundles that assemble hooks and memory defaults. - Gemini + Bedrock backends (community contributions welcome — see good-first-issues)
- Structured-output helper — optional
response_schemasupport that threads through to providers that have it natively - Cost accounting hook built on top of the provenance sink
- Documentation site on GitHub Pages (mkdocs-material)
Mid-term (0.3 — ~2 months out)¶
- Loop-level retry policies as composable objects (not config flags)
- Deterministic replay — given a saved trajectory + a deterministic LLM cassette, re-run the loop bit-for-bit for regression testing
- Expanded eval library — reusable
eval_*recipes shipped aslooplet.evals.recipes(efficiency, parse-quality, IOC coverage, tool-error rate) - OpenTelemetry exporter as a first-party optional extra
Path to 1.0 (~3 months out)¶
1.0 is shipped when:
- The v1.0 API contract (below) has been in production for at least a quarter across at least three independent codebases.
- No open issue is tagged
api-designorbreaking. - Coverage ≥ 90 % and full pyright strict passes.
- Documentation site is feature-complete.
Explicitly not on the roadmap¶
These belong in other projects, not in looplet:
- A graph DSL / branching orchestrator. Use
langgraphorburr. - Multi-agent handoff protocols. Use
openai-agentsorcrewai. - A prompt-templating DSL. Use
dspyor plain f-strings. - A vector DB / memory store. Memory is a tool; plug in your own.
- A web UI / dashboard.
loopletemits events; wire any UI you want on top. - A CLI agent-in-a-box. Use
claude-agent-sdk. - Fine-tuning tooling, data pipelines, synthetic-data generation. Out of scope.
v1.0 API contract¶
These symbols and signatures are frozen from 1.0 onward. Breaking
any of them requires a major-version bump.
Loop entry points¶
composable_loop(
llm: LLMBackend,
*,
tools: BaseToolRegistry,
task: dict[str, Any],
state: DefaultState | None = None,
config: LoopConfig | None = None,
hooks: Sequence[LoopHook] | None = None,
) -> Iterator[Step]
async_composable_loop(...) # same signature, async iterator
The Step record¶
The first four fields (number, tool_call, tool_result, elapsed_ms)
are frozen. Additional fields may be added in minor versions.
The hook protocol¶
Six method names are frozen:
pre_loop(state, session_log, context)pre_prompt(state, session_log, context, step_num) -> str | Nonepre_dispatch(state, session_log, tool_call, step_num) -> ToolResult | Nonepost_dispatch(state, session_log, tool_call, tool_result, step_num) -> str | Nonecheck_done(state, session_log, context, step_num) -> str | Noneshould_stop(state, step_num, new_entities) -> boolon_loop_end(state, session_log, context, llm) -> int
All methods remain optional (duck-typed). Minor versions may add optional keyword arguments with defaults, never new required ones.
The LLMBackend protocol¶
class LLMBackend(Protocol):
def generate(self, messages: list[Message], *, tools: list[dict] | None = None,
cancel_token: CancelToken | None = None) -> LLMResponse: ...
Tool surface¶
ToolSpec, ToolCall, ToolResult, BaseToolRegistry — field names
and the register / dispatch / catalog method signatures are frozen.
Error classification¶
ToolError categories are frozen: TIMEOUT, VALIDATION,
PERMISSION_DENIED, RATE_LIMIT, CONTEXT_OVERFLOW, CANCELLED,
UNKNOWN. New categories require a major bump.
Release cadence¶
- Patch (
0.1.x): as soon as bug fixes accumulate, weekly at most. - Minor (
0.2,0.3, …): roughly monthly, with a two-week release candidate on PyPI (pip install looplet==0.2.0rc1). - Major: only when the v1.0 contract above changes, or every 12+
months after
1.0.
How to influence the roadmap¶
- File an issue tagged
roadmapwith a concrete use case. - Open a discussion under the Ideas category.
- Send a PR. The fastest way to move something forward is a working implementation behind an optional extra.