Work graph for coding agents

Your agent stops losing the plot.

Continuity gives every run a living graph: the goal, the branch it opened, the evidence it produced, the debt it found, and the next task it can safely pick up.

continuity://outcome clear
before: read the thread again
after: resume the next runnable task
kept: decisions, blockers, evidence, debt
result: less babysitting, more finished work

Orientation beats drift

Stop asking humans to keep the thread alive. Give agents direction.

regular agentic coding: drift

The agent follows whatever signal is loudest. It can self-steer into side quests until a human re-enters the loop.

with Continuity: oriented graph

Continuity gives every run provenance and direction: where it came from, what changed, and the next safe move.

Compaction survival

When the thread compresses, the graph keeps the route.

chat compaction: nuance lost

Compaction keeps the headline, but the small reasons vanish: why the agent moved, what edge case mattered, and where the next run should resume.

Continuity graph: survives compaction

The graph is not just prose in the chat window. A future agent can inherit provenance, blockers, evidence, and next direction after the thread is compacted.
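One way to picture this: the graph lives in a file outside the chat transcript, so a fresh agent can load it after the thread is compacted. The sketch below is a minimal Python illustration. The `GraphSnapshot` class, its field names, and the JSON file layout are assumptions made for this example, not Continuity's actual schema or API.

```python
import json
import tempfile
from dataclasses import dataclass, asdict, field
from pathlib import Path


@dataclass
class GraphSnapshot:
    """Hypothetical state a future agent inherits (illustrative, not the real schema)."""
    provenance: str                                   # where this run came from
    blockers: list[str] = field(default_factory=list)
    evidence: list[str] = field(default_factory=list)
    next_direction: str = ""                          # the next safe move


def save_snapshot(snapshot: GraphSnapshot, path: Path) -> None:
    # Persist outside the chat window so compaction cannot drop it.
    path.write_text(json.dumps(asdict(snapshot), indent=2))


def load_snapshot(path: Path) -> GraphSnapshot:
    # A fresh agent rebuilds its orientation from the file, not the transcript.
    return GraphSnapshot(**json.loads(path.read_text()))


def demo_roundtrip() -> GraphSnapshot:
    """Save a snapshot, then reload it the way a later run would."""
    original = GraphSnapshot(
        provenance="previous run: schema migration",
        blockers=["deploy credentials missing"],
        evidence=["schema tests passed"],
        next_direction="verify deploy",
    )
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "graph.json"
        save_snapshot(original, path)
        return load_snapshot(path)
```

Whatever the chat summary keeps or drops, the reloaded snapshot still carries the blockers, the evidence, and the next direction.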

Why Linear and Jira are not enough

Traditional task managers track tickets. Agents need orientation.

What changes: Linear / Jira / tickets vs. Continuity

state
  • Linear / Jira / tickets: Human-updated status fields drift behind the actual work.
  • Continuity: The graph is reconciled from runs, deltas, evidence, blockers, and decisions.

context
  • Linear / Jira / tickets: Important reasoning sits in comments, Slack, or a forgotten agent transcript.
  • Continuity: Context is typed: goals, tasks, questions, branches, debt, and proof stay connected.

next action
  • Linear / Jira / tickets: A human still decides what an agent should read, trust, ignore, and do next.
  • Continuity: The graph selects the next agent-safe slice and explains why other work is blocked.
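To make "selects the next agent-safe slice and explains why other work is blocked" concrete, here is a minimal Python sketch. The `Task` fields and the `select_next` function are hypothetical names chosen for illustration. Continuity's real selection logic is not described here and may differ.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    """Hypothetical task node; field names are illustrative only."""
    name: str
    done: bool = False
    depends_on: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)  # open questions, missing access


def select_next(tasks: list[Task]) -> tuple[Optional[Task], dict[str, str]]:
    """Return the first runnable task plus a reason for every blocked one."""
    by_name = {t.name: t for t in tasks}
    runnable: Optional[Task] = None
    blocked: dict[str, str] = {}
    for task in tasks:
        if task.done:
            continue
        # A dependency is unmet if it is unknown or not finished yet.
        unmet = [d for d in task.depends_on
                 if d not in by_name or not by_name[d].done]
        if task.blockers:
            blocked[task.name] = "blocked by: " + ", ".join(task.blockers)
        elif unmet:
            blocked[task.name] = "waiting on: " + ", ".join(unmet)
        elif runnable is None:
            runnable = task  # first task with nothing in its way
    return runnable, blocked


tasks = [
    Task("merge schema", done=True),
    Task("verify deploy", depends_on=["merge schema"]),
    Task("pricing page", blockers=["pricing decision pending"]),
    Task("announce", depends_on=["verify deploy"]),
]
next_task, blocked = select_next(tasks)
# next_task.name is "verify deploy"; "pricing page" and "announce" carry reasons.
```

Returning the blocked reasons alongside the pick is the design point: the agent sees not only what to do next, but why the other tasks are off limits for now.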

Hard10 terminal-agent eval

10/10 with Continuity. 0/10 without it, even on GPT-5.5. Now the bar gets higher.

Harbor / Terminal-Bench-style run · May 8, 2026
protocol lift: +100 percentage points

Same task suite. Same terminal setting. The operating protocol changed the outcome.

baseline Codex · GPT-5.2: 0/10

Completed every run. Failed every hidden verifier.

baseline Codex · GPT-5.5: 0/10

The newer model ran faster and cheaper. It still solved zero.

Codex + Continuity · GPT-5.2: 10/10

Graph orientation, stop conditions, and reconciliation held across the full suite.

Solve rate: GPT-5.2 baseline 0% · GPT-5.5 baseline 0% · with Continuity 100%

What changed

Not a model upgrade. A runtime advantage.

Model upgrades alone did not move the baseline: GPT-5.2 and GPT-5.5 both finished 10 clean trials and solved 0. With Continuity, GPT-5.2 solved all 10 by carrying graph state, blockers, and verification rules through the run.

tasks: 10 · exceptions: 0 · baseline models: 2

Harbor provides the eval substrate. Terminal-Bench sets the standard for credible terminal-agent tasks. The full packet includes the Harbor runs, errors, costs, latency, and an interactive Codex TUI sidecar. This is a sharp signal, not the final product proof.


Outcome

Every run should leave the project easier to continue.

01 Goal

The agent knows the outcome it is trying to create, not just the latest prompt.

02 Branch

Side quests become named future work instead of vanishing into chat history.

03 Trace

Each run records what changed, what was verified, and what still needs judgment.

04 Next

The graph points the next agent at runnable work instead of another archaeology session.
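The four steps above can be sketched as a small data structure: each run hands back a trace, and reconciling it into the graph turns side quests into named follow-up work. This is a minimal Python illustration under stated assumptions. `RunTrace`, `WorkGraph`, and `reconcile` are hypothetical names, not Continuity's API.

```python
from dataclasses import dataclass, field


@dataclass
class RunTrace:
    """Hypothetical record one agent run leaves behind."""
    changed: list[str] = field(default_factory=list)         # what changed
    verified: list[str] = field(default_factory=list)        # what was verified
    needs_judgment: list[str] = field(default_factory=list)  # what a human must decide
    side_quests: list[str] = field(default_factory=list)     # work noticed but not done


@dataclass
class WorkGraph:
    """Hypothetical graph: one goal, the traces so far, and named future work."""
    goal: str
    traces: list[RunTrace] = field(default_factory=list)
    follow_ups: list[str] = field(default_factory=list)

    def reconcile(self, trace: RunTrace) -> None:
        # Keep the full trace so the work delta is preserved.
        self.traces.append(trace)
        # Side quests become named follow-up tasks, without duplicates.
        for quest in trace.side_quests:
            if quest not in self.follow_ups:
                self.follow_ups.append(quest)


graph = WorkGraph(goal="finish the integration")
graph.reconcile(RunTrace(
    changed=["schema merged"],
    verified=["agent run reconciled"],
    needs_judgment=["edge case logged"],
    side_quests=["pricing"],
))
```

After this call the graph holds one trace and one follow-up task, so the next run starts from recorded state rather than from chat history.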

active goal: finish the integration
  • schema merged
  • agent run reconciled
  • edge case logged
  • next: verify deploy
  • pricing waits

trace: exact work delta preserved

branch: follow-up task created

exit: clear next action

Pricing

$9/mo hosted.

Built for solo operators who want agents to keep shipping without rebuilding context every session.

Early hosted plan: $9 / month
  • Hosted work graph for coding agents
  • Goal, branch, trace, blocker, and proof tracking
  • Agent-safe next-slice selection
  • Work-delta reconciliation after each run

Early access

Join before the hosted beta.

Tell us where your agents lose the thread. The first cohort is for indie and solo developers who already feel the handoff problem.

Launch copy

Agent work should leave a graph behind.

"I'm building Continuity: a $9/mo hosted work graph for coding agents. Every run leaves the goal, branch, evidence, debt, and next task clear enough for the next agent to continue."
