Durable Desktop Voice Path + Topic-Aware Thread Router
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.
Context
Goal: Make the desktop voice path survive multi-turn conversations (it currently crashes on the second thread=continue turn) and add a desktop-router agent that routes each voice turn to continue / resume-older / new thread using a digest of the last 10 threads.
Architecture: Layer A strips control-plane system messages out of checkpointed channels, defensively filters mid-history system messages before every model call, repairs unpaired tool blocks across turns, and opens drafts in the browser instead of crashing. Layer B adds a Haiku desktop-router called from the gateway dispatch step before runDesktopChat, backed by a denormalized chat_thread.summary column.
Tech Stack: TypeScript (strict), Bun, LangGraph 1.3, @langchain/anthropic, Drizzle + Postgres (RLS, 4-role), Hono (clicky-gateway), bun test.
Spec: docs/superpowers/specs/2026-06-12-desktop-durable-thread-router-design.md
Proven root cause: the desktop graph's ack node writes a SystemMessage (the priorTurnSummary note) into the checkpointed ackMessages channel. On a continue-turn the checkpointer restores it at position > 0 and Anthropic's converter throws "System messages are only permitted as the first passed message." (cosmic gateway trace, 2026-06-11, reproduced 3×).
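The failure mode can be reproduced without LangChain at all. The sketch below uses plain objects as stand-ins for messages (the `ChatMsg` shape and both function names are hypothetical, not part of the codebase) and shows why a system note persisted into a checkpointed channel ends up at position > 0 once the node prepends its own prompt, and why filtering the restored history removes the problem.

```typescript
// Plain-object stand-ins for LangChain messages (hypothetical shape, not the real classes).
type ChatRole = "system" | "human" | "ai" | "tool"
interface ChatMsg {
  role: ChatRole
  text: string
}

/** Index of the first system message found after position 0, or -1 if there is none. */
function straySystemIndex(msgs: ChatMsg[]): number {
  return msgs.findIndex((m, i) => i > 0 && m.role === "system")
}

/** What a model call receives: one fresh system prompt followed by the restored history. */
function modelInput(prompt: string, restored: ChatMsg[]): ChatMsg[] {
  return [{ role: "system", text: prompt }, ...restored]
}

// Turn one persisted the priorTurnSummary note as a system message,
// so the checkpointer hands it back on turn two.
const restoredChannel: ChatMsg[] = [
  { role: "human", text: "turn one" },
  { role: "system", text: "Last turn: discussed the lease." },
  { role: "human", text: "turn two" },
]

// Broken: the persisted note sits behind the prompt and the first human turn.
const brokenInput = modelInput("ACK PROMPT", restoredChannel)
const brokenAt = straySystemIndex(brokenInput)

// Repaired: drop every system message from the restored history first.
const cleanInput = modelInput(
  "ACK PROMPT",
  restoredChannel.filter((m) => m.role !== "system")
)
const cleanAt = straySystemIndex(cleanInput)
```

Here `brokenAt` is 2 (the note lands after the prompt and "turn one"), while `cleanAt` is -1. Layer A applies the same idea twice: once at write time (Task 3) and once defensively at read time (Task 2).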
File Structure
| File | Responsibility | State |
|---|---|---|
| packages/agent-runtime/src/agents/desktop/graph.ts | ack node: strip system msgs from ackMessages; pass priorTurnSummary as a HumanMessage. Companion wrapper: repair unpaired tool blocks. | Changed |
| packages/agent-runtime/src/agents/desktop-ack/graph.ts | llmCall: filter SystemMessage from state.messages before model call. | Changed |
| packages/agent-runtime/src/agents/screen-companion/graph.ts | llmCall: same defensive filter. | Changed |
| packages/agent-runtime/src/messages/pair-tool-calls.ts | pairToolCalls(messages): drop a trailing unpaired tool_use. | New |
| packages/agent-runtime/src/agents/desktop-router/{prompt,route,index}.ts | Router prompt + one Haiku classification → RouteDecision. | New |
| packages/chat-runtime/src/threads.ts | listThreadDigest, updateThreadSummary. | Changed |
| packages/db/src/schema/agent.ts + migrations/0037_*.sql | chat_thread.summary column. | Changed / generated |
packages/chat-runtime/src/run.ts | Write summary on turn end; write-interrupt → open-in-browser; error-path config. | Changed |
apps/clicky-gateway/src/dispatch.ts | Call the router, obey the decision. | Changed |
Task 1: pairToolCalls helper (Layer A3)
A trailing tool_use with no matching tool_result makes the next Anthropic turn invalid. This pure helper drops it.
Files: Create packages/agent-runtime/src/messages/pair-tool-calls.ts + .test.ts.
Step 1: Write the failing test
import { describe, expect, it } from "bun:test"
import { AIMessage, HumanMessage, ToolMessage } from "@langchain/core/messages"
import { pairToolCalls } from "./pair-tool-calls"
describe("pairToolCalls", () => {
it("keeps a tool_use that has its matching tool_result", () => {
const msgs = [
new HumanMessage("hi"),
new AIMessage({ content: "", tool_calls: [{ id: "t1", name: "x", args: {} }] }),
new ToolMessage({ content: "ok", tool_call_id: "t1", name: "x" }),
]
expect(pairToolCalls(msgs)).toHaveLength(3)
})
it("drops a trailing AIMessage whose tool_use has no tool_result", () => {
const msgs = [
new HumanMessage("hi"),
new AIMessage({ content: "", tool_calls: [{ id: "t1", name: "x", args: {} }] }),
]
const out = pairToolCalls(msgs)
expect(out).toHaveLength(1)
expect(out[0]).toBeInstanceOf(HumanMessage)
})
it("leaves a plain conversation untouched", () => {
const msgs = [new HumanMessage("hi"), new AIMessage("hello")]
expect(pairToolCalls(msgs)).toHaveLength(2)
})
})
Step 2: Run to verify it fails — cd packages/agent-runtime && bun test src/messages/pair-tool-calls.test.ts. Expected: FAIL, Cannot find module './pair-tool-calls'.
Step 3: Implement
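import { AIMessage, type BaseMessage, ToolMessage } from "@langchain/core/messages"

/**
 * Returns `messages` with any UNPAIRED `tool_use` removed. Anthropic rejects a
 * turn whose `tool_use` block has no corresponding `tool_result`; a checkpoint
 * restored mid-tool-loop (e.g. an interrupted turn) can leave exactly that.
 * Collect every `tool_call_id` answered by a ToolMessage, then drop any
 * AIMessage whose tool_calls are not all answered, plus any ToolMessage that
 * answered a dropped AIMessage (otherwise a partially answered parallel call
 * would leave an orphaned `tool_result`). Order preserved.
 */
export function pairToolCalls(messages: BaseMessage[]): BaseMessage[] {
  const answeredIds = new Set<string>()
  for (const message of messages) {
    if (ToolMessage.isInstance(message) && message.tool_call_id) {
      answeredIds.add(message.tool_call_id)
    }
  }
  // Ids of every tool call that belonged to a dropped AIMessage.
  const droppedIds = new Set<string>()
  const kept = messages.filter((message) => {
    if (!AIMessage.isInstance(message)) return true
    const calls = message.tool_calls ?? []
    if (calls.every((call) => call.id != null && answeredIds.has(call.id))) return true
    for (const call of calls) {
      if (call.id != null) droppedIds.add(call.id)
    }
    return false
  })
  // A tool_result whose tool_use was dropped is just as invalid as the reverse.
  return kept.filter(
    (message) => !(ToolMessage.isInstance(message) && droppedIds.has(message.tool_call_id))
  )
}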
Step 4: Run to verify it passes — same command, 3 tests PASS.
Step 5: Export — in packages/agent-runtime/src/index.ts add export { pairToolCalls } from "./messages/pair-tool-calls".
Step 6: Commit
git add packages/agent-runtime/src/messages/pair-tool-calls.ts packages/agent-runtime/src/messages/pair-tool-calls.test.ts packages/agent-runtime/src/index.ts
git commit -m "feat(agent): pairToolCalls helper to drop unpaired tool_use blocks"
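One case the three tests in Task 1 do not exercise is a parallel tool call where only some of the tool_use blocks were answered before the interruption. The sketch below models that case with plain objects (the `PlainMsg` shape and `pairPlainToolCalls` are hypothetical stand-ins, not the real LangChain classes or the real helper). It drops the AI message and also any tool result that answered it, so no orphaned tool_result survives.

```typescript
// Hypothetical plain-object message shape; the real helper works on LangChain messages.
interface PlainToolCall {
  id: string
  name: string
}
type PlainMsg =
  | { kind: "human"; text: string }
  | { kind: "ai"; text: string; toolCalls: PlainToolCall[] }
  | { kind: "tool"; toolCallId: string; text: string }

function pairPlainToolCalls(msgs: PlainMsg[]): PlainMsg[] {
  // Pass 1: every tool call id that has a result somewhere in the history.
  const answered = new Set<string>()
  for (const m of msgs) {
    if (m.kind === "tool") answered.add(m.toolCallId)
  }

  // Pass 2: drop AI messages with at least one unanswered call, remembering their ids.
  const dropped = new Set<string>()
  const kept = msgs.filter((m) => {
    if (m.kind !== "ai") return true
    if (m.toolCalls.every((c) => answered.has(c.id))) return true
    for (const c of m.toolCalls) dropped.add(c.id)
    return false
  })

  // Pass 3: drop tool results that answered a dropped AI message.
  return kept.filter((m) => !(m.kind === "tool" && dropped.has(m.toolCallId)))
}

const interruptedHistory: PlainMsg[] = [
  { kind: "human", text: "look up both leases" },
  { kind: "ai", text: "", toolCalls: [{ id: "t1", name: "x" }, { id: "t2", name: "x" }] },
  { kind: "tool", toolCallId: "t1", text: "ok" }, // t2 was never answered
]
const repairedHistory = pairPlainToolCalls(interruptedHistory)
```

With this input only the human message remains. A matching case in pair-tool-calls.test.ts would pin the same behavior for the real helper.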
Task 2: Filter system messages in ack + companion llmCall (Layer A2)
Defense for the whole class: a restored checkpoint can never resurface a mid-history system message.
Files: Modify desktop-ack/graph.ts llmCall + screen-companion/graph.ts llmCall; add a case to desktop-ack/graph.test.ts.
Step 1: Write the failing test (ack) — add to desktop-ack/graph.test.ts (ensure BaseMessage, BaseChatModel, AIMessage, SystemMessage, HumanMessage are imported):
it("does not forward a stray mid-history SystemMessage to the model", async () => {
const seen: BaseMessage[][] = []
const fakeModel = {
bindTools: () => fakeModel,
invoke: async (messages: BaseMessage[]) => {
seen.push(messages)
return new AIMessage({
content: "",
tool_calls: [{ id: "a1", name: "finalize_ack", args: { spokenAck: "ok", intent: "chitchat", openInBrowser: false } }],
})
},
} as unknown as BaseChatModel
const graph = buildDesktopAckGraph({ model: fakeModel })
await graph.invoke({
messages: [new HumanMessage("first"), new SystemMessage("STRAY"), new HumanMessage("second")],
})
const firstCall = seen[0]!
expect(firstCall.filter((m) => m._getType() === "system").length).toBe(1)
expect(firstCall[0]!._getType()).toBe("system")
})
Step 2: Run to verify it fails — cd packages/agent-runtime && bun test src/agents/desktop-ack/graph.test.ts. Expected: FAIL, system count is 2.
Step 3: Implement the filter (ack llmCall)
const llmCall: GraphNode<typeof AgentState> = async (state) => {
const modelWithTools = model.bindTools?.(tools) ?? model
// Defensive: a restored checkpoint must never resurface a SystemMessage at
// position > 0 — Anthropic's converter rejects it. We prepend exactly one
// system prompt, so drop any system message from the prior history.
const priorMessages = state.messages.filter((m) => m._getType() !== "system")
const response = await modelWithTools.invoke([
new SystemMessage(DESKTOP_ACK_PROMPT),
...priorMessages,
])
return { messages: [response] }
}
Step 4: Implement the same filter (companion llmCall)
const llmCall: GraphNode<typeof AgentState> = async (state) => {
const modelWithTools = model.bindTools?.(tools) ?? model
const priorMessages = state.messages.filter((m) => m._getType() !== "system")
const response = await modelWithTools.invoke([
new SystemMessage(SCREEN_COMPANION_PROMPT),
...priorMessages,
])
return { messages: [response] }
}
Step 5: Run tests — bun test src/agents/desktop-ack/graph.test.ts src/agents/screen-companion/graph.test.ts. Expected: PASS.
Step 6: Commit
git add packages/agent-runtime/src/agents/desktop-ack/graph.ts packages/agent-runtime/src/agents/desktop-ack/graph.test.ts packages/agent-runtime/src/agents/screen-companion/graph.ts
git commit -m "fix(agent): filter mid-history system messages before the model call"
Task 3: Stop persisting system messages in ackMessages + summary as Human (Layer A1)
The actual production crash source: summaryNote is a SystemMessage and res.messages (system included) is written to the checkpointed ackMessages.
Files: Modify desktop/graph.ts (ack + companion nodes, imports); add a two-turn case to desktop/graph.test.ts.
Step 1: Write the failing test (two-turn continue, real MemorySaver)
import { MemorySaver } from "@langchain/langgraph"
import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { SystemMessage, HumanMessage, AIMessage, BaseMessage } from "@langchain/core/messages"
it("survives a second continue-turn on the same thread (no stray system message)", async () => {
const checkpointer = new MemorySaver()
const ackModel = {
bindTools: () => ackModel,
invoke: async (messages: BaseMessage[]) => {
const strayCount = messages.slice(1).filter((m) => m._getType() === "system").length
if (strayCount > 0) throw new Error("System messages are only permitted as the first passed message.")
return new AIMessage({ content: "", tool_calls: [{ id: "k", name: "finalize_ack", args: { spokenAck: "hi", intent: "new_request", openInBrowser: true } }] })
},
} as unknown as BaseChatModel
const companionModel = {
bindTools: () => companionModel,
invoke: async () => new AIMessage({ content: "", tool_calls: [{ id: "f", name: "finalize_answer", args: { spoken: "done", openInBrowser: true, matter: { kind: "general" } } }] }),
} as unknown as BaseChatModel
const graph = buildDesktopGraph({ checkpointer, ackModel, companionModel })
const cfg = { configurable: { thread_id: "t-continue" } }
await graph.invoke({ messages: [new HumanMessage("turn one")], priorTurnSummary: null }, cfg)
await expect(
graph.invoke({ messages: [new HumanMessage("turn two")], priorTurnSummary: "Last turn: discussed the lease." }, cfg)
).resolves.toBeDefined()
})
Step 2: Run to verify it fails — bun test src/agents/desktop/graph.test.ts. Expected: FAIL, turn two throws the converter error.
Step 3: Implement — summary as HumanMessage + strip system from ackMessages
// ack node: prefix the human turn with the optional prior-turn summary as a
// HUMAN message (never a SystemMessage — a system message persisted into the
// checkpointed ackMessages channel resurfaces at position > 0 on a continue-
// turn and Anthropic rejects it). Strip any system message from the output.
const ack: GraphNode<typeof DesktopState> = async (state) => {
const summaryNote = state.priorTurnSummary
? [new HumanMessage(`Context on the previous turn: ${state.priorTurnSummary}`)]
: []
const res = await ackGraph.invoke({
messages: [...summaryNote, ...state.messages],
})
return {
ackMessages: res.messages.filter((m) => m._getType() !== "system"),
}
}
Update the import: the file currently imports SystemMessage. Add HumanMessage, and remove SystemMessage if it is now unused (lint).
Step 4: Repair unpaired tool blocks in the companion wrapper
const companion: GraphNode<typeof DesktopState> = async (state) => {
// Restored history may carry a trailing unpaired tool_use (an interrupted
// prior turn). Drop it so the companion's first model call is valid.
const safeMessages = pairToolCalls(state.messages)
const res = await companionGraph.invoke({ messages: safeMessages })
return { messages: res.messages.slice(safeMessages.length) }
}
Add import: import { pairToolCalls } from "../../messages/pair-tool-calls".
Step 5: Run to verify it passes — both turns resolve.
Step 6: Commit
git add packages/agent-runtime/src/agents/desktop/graph.ts packages/agent-runtime/src/agents/desktop/graph.test.ts
git commit -m "fix(agent): desktop path survives continue-turns (no persisted system message)"
Task 4: chat_thread.summary column + migration (Layer B6)
Files: Modify packages/db/src/schema/agent.ts; generate migrations/0037_*.sql (+ snapshot + journal).
Step 1: Add the column — in chatThread's columns, after pendingDeepLink:
// Denormalized one-line summary of the thread's last substantive answer,
// written from finalize_answer.spoken at desktop turn end. Powers the
// desktop-router's last-10-threads digest and the history list. Nullable.
summary: text("summary"),
Step 2: Generate the migration — bun db:generate. Expected: 0037_*.sql with ALTER TABLE "chat_thread" ADD COLUMN "summary" text; + new snapshot + journal entry.
Step 3: Verify the journal "when" watermark. The head entry is idx: 36, when: 1781395811000. If the "when" of 0037 is not strictly greater, bump ONLY that "when" to 1781395812000 (metadata only, never the SQL). Drizzle-kit skips any migration whose "when" is not above the DB watermark (CLAUDE.md note).
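The watermark rule is simple enough to state as a function. This is a minimal sketch, assuming a journal entry carries at least idx, when and tag; the `JournalEntry` interface, the function name and the tag strings are illustrative, not taken from the repo.

```typescript
// Hypothetical subset of a migration journal entry (only the fields used here).
interface JournalEntry {
  idx: number
  when: number // epoch milliseconds
  tag: string
}

/**
 * The `when` the newly generated entry should carry: unchanged if it is
 * already strictly past the head, otherwise head + 1000 ms.
 */
function watermarkSafeWhen(head: JournalEntry, added: JournalEntry): number {
  return added.when > head.when ? added.when : head.when + 1000
}

// Tags are placeholders; only the numbers come from the plan.
const journalHead: JournalEntry = { idx: 36, when: 1781395811000, tag: "0036_head" }
const staleEntry: JournalEntry = { idx: 37, when: 1781395811000, tag: "0037_new" }
const bumpedWhen = watermarkSafeWhen(journalHead, staleEntry)
```

For the stale entry above, `bumpedWhen` is 1781395812000, the value Step 3 names. An entry that is already newer than the head is returned unchanged.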
Step 4: Review the SQL — it must be exactly the ADD COLUMN; no RLS, no GRANT, no other table. If not, fix the schema TS and re-run bun db:generate (never hand-edit SQL).
Step 5: Commit schema + migration + snapshot together
git add packages/db/src/schema/agent.ts packages/db/src/migrations/0037_*.sql packages/db/src/migrations/meta/
git commit -m "feat(db): add chat_thread.summary column for the thread-router digest"
Task 5: listThreadDigest + updateThreadSummary (Layer B6)
Files: Modify packages/chat-runtime/src/threads.ts (+ index export); add tests to threads.test.ts.
Step 1: Write the failing test. Start with the contract test for the type (below). If threads.test.ts already has a DB harness, ALSO add a real test: insert 11 threads for the user and expect 10 back, newest first, with a thread owned by another user or org excluded. Use the harness's insert helpers; don't invent DB setup.
import { describe, expect, it } from "bun:test"
import type { ThreadDigestEntry } from "./threads"
describe("ThreadDigestEntry shape", () => {
it("carries id, title, summary, lastActivityAt", () => {
const entry: ThreadDigestEntry = {
id: "thr_1", title: "Lease review",
summary: "Discussed the rent escalation clause.",
lastActivityAt: new Date("2026-06-12T10:00:00Z"),
}
expect(entry.summary).toBe("Discussed the rent escalation clause.")
})
})
Step 2: Run to verify it fails — cd packages/chat-runtime && bun test src/threads.test.ts. Expected: FAIL, ThreadDigestEntry not exported.
Step 3: Implement (near listThreads; reuses withRlsTransaction, and, eq, desc, chatThread already imported)
/** One entry in the desktop-router's recent-threads digest. */
export interface ThreadDigestEntry {
id: string
title: string
summary: string | null
lastActivityAt: Date
}
/**
* The caller's most-recent threads (newest first), capped, for the router.
* One row per thread — no message-table fan-out (summary is denormalized).
* Firm + created_by scoped, exactly like listThreads.
*/
export async function listThreadDigest(
organizationId: string,
userId: string,
limit = 10
): Promise<ThreadDigestEntry[]> {
return withRlsTransaction(db, organizationId, async (tx) => {
const rows = await tx
.select({
id: chatThread.id,
title: chatThread.title,
summary: chatThread.summary,
updatedAt: chatThread.updatedAt,
})
.from(chatThread)
.where(
and(
eq(chatThread.organizationId, organizationId),
eq(chatThread.createdBy, userId)
)
)
.orderBy(desc(chatThread.updatedAt))
.limit(limit)
return rows.map((r) => ({
id: r.id,
title: r.title,
summary: r.summary,
lastActivityAt: r.updatedAt,
}))
})
}
/**
* Writes the denormalized one-line summary for a thread the caller owns.
* Silent no-op if not the caller's, mirroring updateThreadModel.
*/
export async function updateThreadSummary(
organizationId: string,
userId: string,
threadId: string,
summary: string
): Promise<void> {
await withRlsTransaction(db, organizationId, async (tx) => {
await tx
.update(chatThread)
.set({ summary })
.where(
and(
eq(chatThread.id, threadId),
eq(chatThread.organizationId, organizationId),
eq(chatThread.createdBy, userId)
)
)
})
}
Step 4: Export — in packages/chat-runtime/src/index.ts, export listThreadDigest, updateThreadSummary, and the ThreadDigestEntry type (match the file's existing export style; if it uses export * from "./threads", no edit needed — verify).
Step 5: Run tests — PASS.
Step 6: Commit
git add packages/chat-runtime/src/threads.ts packages/chat-runtime/src/threads.test.ts packages/chat-runtime/src/index.ts
git commit -m "feat(chat-runtime): listThreadDigest + updateThreadSummary for the router"
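The semantics listThreadDigest is meant to have (owner-scoped, newest first, capped at 10) can be pinned down with an in-memory reference model. The sketch below is not the Drizzle query; `ThreadRow` and `digestInMemory` are hypothetical names, and the data mirrors the 11-threads scenario described in Step 1.

```typescript
// Hypothetical in-memory row; field names follow the columns used in the query above.
interface ThreadRow {
  id: string
  organizationId: string
  createdBy: string
  title: string
  summary: string | null
  updatedAt: Date
}

function digestInMemory(rows: ThreadRow[], organizationId: string, userId: string, limit = 10) {
  return rows
    .filter((r) => r.organizationId === organizationId && r.createdBy === userId)
    .sort((a, b) => b.updatedAt.getTime() - a.updatedAt.getTime()) // newest first
    .slice(0, limit)
    .map((r) => ({ id: r.id, title: r.title, summary: r.summary, lastActivityAt: r.updatedAt }))
}

// 11 threads for user_1, one minute apart, so thr_11 is the newest.
const ownRows: ThreadRow[] = Array.from({ length: 11 }, (_, i) => ({
  id: `thr_${i + 1}`,
  organizationId: "org_1",
  createdBy: "user_1",
  title: `Thread ${i + 1}`,
  summary: null,
  updatedAt: new Date(Date.UTC(2026, 5, 12, 10, i)),
}))

// A more recent thread that belongs to someone else and must not leak in.
const foreignRow: ThreadRow = {
  id: "thr_other",
  organizationId: "org_1",
  createdBy: "user_2",
  title: "Not mine",
  summary: null,
  updatedAt: new Date(Date.UTC(2026, 5, 12, 12, 0)),
}

const digestResult = digestInMemory([...ownRows, foreignRow], "org_1", "user_1")
```

The result holds thr_11 down to thr_2: the oldest own thread falls off the cap and the foreign thread never appears. A DB-backed test can assert the same three properties against the real function.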
Task 6: desktop-router agent — prompt + route (Layer B1–B5, B8)
Files: Create desktop-router/{prompt,route,index}.ts + route.test.ts; export from the package index.
Step 1: Write the prompt — desktop-router/prompt.ts
export const DESKTOP_ROUTER_PROMPT = `You route a lawyer's spoken request to the right conversation thread on their desktop assistant. You do NOT answer the request — you only pick the thread.
You receive: the new spoken request, an optional note about the thread they were just on (the "active thread"), and a digest of their last ~10 threads (id, title, one-line summary).
Call route EXACTLY ONCE with one of:
- kind "continue": the request is a follow-up to the ACTIVE thread's topic. Default when there is an active thread and the new request clearly continues it.
- kind "resume": the request unambiguously refers to an OLDER thread in the digest (e.g. "pull up that Brooklyn Gardens lease from yesterday"). Set threadId to that thread's id.
- kind "new": a new topic, OR there is no active thread, OR you are at all unsure. When in doubt, choose "new" — mis-routing a conversation into the wrong thread is worse than starting a fresh one.
Be conservative. Only "resume" on a clear, specific match to a digest entry. Only "continue" when the topic genuinely carries over. Otherwise "new".
Respond ONLY by calling route. No prose.`
Step 2: Write the failing test — desktop-router/route.test.ts
import { describe, expect, it } from "bun:test"
import { AIMessage } from "@langchain/core/messages"
import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { routeDesktopTurn } from "./route"
function modelReturning(args: unknown): BaseChatModel {
const m = {
bindTools: () => m,
invoke: async () =>
new AIMessage({ content: "", tool_calls: [{ id: "r", name: "route", args: args as Record<string, unknown> }] }),
} as unknown as BaseChatModel
return m
}
const digest = [
{ id: "thr_active", title: "145 W 42nd deal", summary: "Reviewed the PSA.", lastActivityAt: new Date() },
{ id: "thr_old", title: "Brooklyn Gardens lease", summary: "Rent escalation.", lastActivityAt: new Date() },
]
describe("routeDesktopTurn", () => {
it("returns continue with the active thread id", async () => {
expect(await routeDesktopTurn({ request: "and the closing date?", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "continue" }) }))
.toEqual({ kind: "continue", threadId: "thr_active" })
})
it("returns resume with the referenced older thread id", async () => {
expect(await routeDesktopTurn({ request: "pull up that Brooklyn Gardens lease", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "resume", threadId: "thr_old" }) }))
.toEqual({ kind: "resume", threadId: "thr_old" })
})
it("returns new on a fresh topic", async () => {
expect(await routeDesktopTurn({ request: "draft an NDA", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "new" }) }))
.toEqual({ kind: "new" })
})
it("continue with no active thread is incoherent -> new", async () => {
expect(await routeDesktopTurn({ request: "hello", activeThreadId: undefined, digest, model: modelReturning({ kind: "continue" }) }))
.toEqual({ kind: "new" })
})
it("resume to a thread not in the digest -> safe default", async () => {
expect(await routeDesktopTurn({ request: "x", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "resume", threadId: "thr_ghost" }) }))
.toEqual({ kind: "continue", threadId: "thr_active" })
})
it("falls back to the safe default (continue active) on malformed output", async () => {
expect(await routeDesktopTurn({ request: "x", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "garbage" }) }))
.toEqual({ kind: "continue", threadId: "thr_active" })
})
it("falls back (continue active) when the model throws", async () => {
const throwing = { bindTools: () => throwing, invoke: async () => { throw new Error("timeout") } } as unknown as BaseChatModel
expect(await routeDesktopTurn({ request: "x", activeThreadId: "thr_active", digest, model: throwing }))
.toEqual({ kind: "continue", threadId: "thr_active" })
})
})
Note: the safe default for a malformed/missing-route output is "continue active if present, else new" — that is why the "garbage" and "ghost" cases resolve to continue thr_active, and the no-active cases resolve to new.
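The note above reads most easily as a pure function from (raw model output, active thread, digest ids) to a decision. The sketch below is an illustration of that mapping only; `resolveRoute` and `RawRouteArgs` are hypothetical names, and it leaves out the model call and schema validation.

```typescript
type RouteDecision =
  | { kind: "continue"; threadId: string }
  | { kind: "resume"; threadId: string }
  | { kind: "new" }

// Raw, unvalidated tool arguments as they might come back from the model.
interface RawRouteArgs {
  kind?: string
  threadId?: string
}

function resolveRoute(
  raw: RawRouteArgs | undefined,
  activeThreadId: string | undefined,
  digestIds: string[]
): RouteDecision {
  // Safe default: continue the active thread if there is one, else start new.
  const safeDefault: RouteDecision = activeThreadId
    ? { kind: "continue", threadId: activeThreadId }
    : { kind: "new" }
  if (!raw) return safeDefault
  switch (raw.kind) {
    case "new":
      return { kind: "new" }
    case "continue":
      // "continue" without an active thread is incoherent, which is the safe default anyway.
      return safeDefault
    case "resume":
      // Only resume a thread the router was actually shown.
      return raw.threadId && digestIds.includes(raw.threadId)
        ? { kind: "resume", threadId: raw.threadId }
        : safeDefault
    default:
      return safeDefault // malformed kind
  }
}

const knownIds = ["thr_active", "thr_old"]
const ghostRoute = resolveRoute({ kind: "resume", threadId: "thr_ghost" }, "thr_active", knownIds)
const orphanRoute = resolveRoute({ kind: "continue" }, undefined, knownIds)
```

`ghostRoute` resolves to continue on thr_active and `orphanRoute` resolves to new, matching the "ghost" and "no active thread" test cases.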
Step 3: Run to verify it fails — bun test src/agents/desktop-router/route.test.ts. Expected: FAIL, module not found.
Step 4: Implement route.ts
import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { AIMessage, HumanMessage, SystemMessage } from "@langchain/core/messages"
import { tool } from "@langchain/core/tools"
import { z } from "zod"
import { createModel } from "../../model"
import { DESKTOP_ROUTER_PROMPT } from "./prompt"
export interface RouterDigestEntry {
id: string
title: string
summary: string | null
lastActivityAt: Date
}
export type RouteDecision =
| { kind: "continue"; threadId: string }
| { kind: "resume"; threadId: string }
| { kind: "new" }
const routeArgsSchema = z.object({
kind: z.enum(["continue", "resume", "new"]),
threadId: z.string().min(1).optional(),
})
const routeTool = tool(async () => "routed", {
name: "route",
description:
"Pick the thread for this turn. 'continue' = follow-up to the active thread; " +
"'resume' = an older thread via threadId; 'new' = a new topic or when unsure.",
schema: routeArgsSchema,
})
export interface RouteDesktopTurnInput {
request: string
activeThreadId?: string
digest: RouterDigestEntry[]
screenshot?: string
model?: BaseChatModel
}
/**
* One Haiku classification → a thread RouteDecision. NEVER throws: any model
* error / timeout / malformed output degrades to the safe default (continue
* the active thread if there is one, else new) so a routing outage is exactly
* today's continuity, never a crash.
*/
export async function routeDesktopTurn(
input: RouteDesktopTurnInput
): Promise<RouteDecision> {
const safeDefault = (): RouteDecision =>
input.activeThreadId
? { kind: "continue", threadId: input.activeThreadId }
: { kind: "new" }
try {
const model = input.model ?? createModel("claude-haiku-4-5")
const modelWithTools = model.bindTools?.([routeTool]) ?? model
const digestLines = input.digest
.map((entry, i) => `${i + 1}. id=${entry.id} | title="${entry.title}" | summary="${entry.summary ?? "(none)"}"`)
.join("\n")
const activeNote = input.activeThreadId
? `Active thread id (the one they were just on): ${input.activeThreadId}`
: "There is no active thread."
const humanText =
`${activeNote}\n\nRecent threads:\n${digestLines || "(none)"}\n\n` +
`New spoken request: ${input.request}`
const humanContent: HumanMessage["content"] = input.screenshot
? [
{ type: "text", text: humanText },
{ type: "image_url", image_url: { url: `data:image/jpeg;base64,${input.screenshot}` } },
]
: humanText
const response = await modelWithTools.invoke([
new SystemMessage(DESKTOP_ROUTER_PROMPT),
new HumanMessage({ content: humanContent }),
])
const call = AIMessage.isInstance(response)
? response.tool_calls?.find((c) => c.name === "route")
: undefined
if (!call) return safeDefault()
const parsed = routeArgsSchema.safeParse(call.args)
if (!parsed.success) return safeDefault()
const { kind, threadId } = parsed.data
if (kind === "new") return { kind: "new" }
if (kind === "continue") {
return input.activeThreadId
? { kind: "continue", threadId: input.activeThreadId }
: { kind: "new" }
}
// resume: require a threadId that exists in the digest.
if (threadId && input.digest.some((entry) => entry.id === threadId)) {
return { kind: "resume", threadId }
}
return safeDefault()
} catch {
return safeDefault()
}
}
Step 5: Create the index — desktop-router/index.ts
export { routeDesktopTurn } from "./route"
export type { RouteDecision, RouteDesktopTurnInput, RouterDigestEntry } from "./route"
export { DESKTOP_ROUTER_PROMPT } from "./prompt"
Step 6: Export from the package index — in packages/agent-runtime/src/index.ts:
export {
routeDesktopTurn,
type RouteDecision,
type RouteDesktopTurnInput,
type RouterDigestEntry,
} from "./agents/desktop-router"
Step 7: Run tests — 7 tests PASS.
Step 8: Commit
git add packages/agent-runtime/src/agents/desktop-router/ packages/agent-runtime/src/index.ts
git commit -m "feat(agent): desktop-router agent — routes a voice turn to continue/resume/new"
Task 7: Wire the router into the gateway dispatch (Layer B2, B7)
Files: Modify apps/clicky-gateway/src/dispatch.ts; add cases to dispatch.test.ts (mirror the existing seam-based tests).
Step 1: Write the failing tests — add a routeDesktopTurnImpl + listThreadDigestImpl seam to the test calls:
it("obeys a resume decision: passes the older thread id to runDesktopChat", async () => {
let seenThreadId: string | undefined
await dispatch({
organizationId: "org_1", userId: "user_1",
input: { request: "pull up Brooklyn Gardens", threadId: "thr_active" },
routeDesktopTurnImpl: async () => ({ kind: "resume", threadId: "thr_old" }),
listThreadDigestImpl: async () => [],
runDesktopChatImpl: async (args) => { seenThreadId = args.threadId; return new Response("ok") },
})
expect(seenThreadId).toBe("thr_old")
})
it("obeys a new decision: creates a thread and passes the new id", async () => {
let seenThreadId: string | undefined
await dispatch({
organizationId: "org_1", userId: "user_1",
input: { request: "draft an NDA", threadId: "thr_active" },
routeDesktopTurnImpl: async () => ({ kind: "new" }),
listThreadDigestImpl: async () => [],
createThreadImpl: async () => "thr_created",
runDesktopChatImpl: async (args) => { seenThreadId = args.threadId; return new Response("ok") },
})
expect(seenThreadId).toBe("thr_created")
})
it("obeys a continue decision: keeps the active thread", async () => {
let seenThreadId: string | undefined
await dispatch({
organizationId: "org_1", userId: "user_1",
input: { request: "and the closing date?", threadId: "thr_active" },
routeDesktopTurnImpl: async () => ({ kind: "continue", threadId: "thr_active" }),
listThreadDigestImpl: async () => [],
runDesktopChatImpl: async (args) => { seenThreadId = args.threadId; return new Response("ok") },
})
expect(seenThreadId).toBe("thr_active")
})
Step 2: Run to verify it fails — cd apps/clicky-gateway && bun test src/dispatch.test.ts. Expected: FAIL, seams not accepted / router not wired.
Step 3: Implement the wiring — imports:
import { createThread, runDesktopChat, listThreadDigest } from "@workspace/chat-runtime"
import { routeDesktopTurn } from "@workspace/agent-runtime"
import type { RouteDecision } from "@workspace/agent-runtime"
Add seams to DispatchInput:
routeDesktopTurnImpl?: typeof routeDesktopTurn
listThreadDigestImpl?: typeof listThreadDigest
Replace the thread-resolution block in dispatch(...):
const createThreadFn = req.createThreadImpl ?? createThread
const routeFn = req.routeDesktopTurnImpl ?? routeDesktopTurn
const digestFn = req.listThreadDigestImpl ?? listThreadDigest
// Router pre-step (spec Layer B): decide which thread this turn belongs to.
// The router NEVER throws — a failure degrades to continue-active-or-new.
const digest = await digestFn(req.organizationId, req.userId, 10)
const decision: RouteDecision = await routeFn({
request: req.input.request,
activeThreadId: req.input.threadId,
digest: digest.map((entry) => ({
id: entry.id, title: entry.title,
summary: entry.summary, lastActivityAt: entry.lastActivityAt,
})),
screenshot: req.input.screenshot,
})
const threadId =
decision.kind === "new"
? await createThreadFn({
organizationId: req.organizationId,
userId: req.userId,
agentId: DESKTOP_SUPERVISOR_AGENT_ID,
title: req.input.request.slice(0, 80) || "Clicky",
})
: decision.threadId
console.log(
`[clicky-gateway] dispatch routed: mode=interactive agent=${DESKTOP_SUPERVISOR_AGENT_ID}` +
` route=${decision.kind} thread=${threadId}` +
` screenshot=${req.input.screenshot ? "present" : "none"}`
)
Leave the runDesktopChatFn call below unchanged — it already receives threadId.
Step 4: Run tests — existing + 3 new PASS.
Step 5: Commit
git add apps/clicky-gateway/src/dispatch.ts apps/clicky-gateway/src/dispatch.test.ts
git commit -m "feat(gateway): route each voice turn via desktop-router before dispatch"
Task 8: Write thread summary on turn end + open drafts in the browser (Layer A4, B6)
Files: Modify packages/chat-runtime/src/run.ts (the runDesktopChat companion-completion block); add cases to run.test.ts via the existing test seams.
Step 1: Write the failing test (summary written) — add an updateThreadSummaryImpl seam; use the file's existing fake-graph + drain helpers:
it("writes the thread summary from finalize_answer.spoken on completion", async () => {
let savedSummary: string | undefined
const res = await runDesktopChat({
organizationId: "org_1", userId: "user_1", threadId: "thr_1", agentId: "desktop",
request: "give me context",
getAgentImpl: makeFakeDesktopGraph({ spoken: "The PSA is signed.", intent: "new_request" }),
updateThreadSummaryImpl: async (_o, _u, _t, summary) => { savedSummary = summary },
...baseFakes,
})
await drain(res)
expect(savedSummary).toBe("The PSA is signed.")
})
Step 2: Write the failing test (draft interrupt → open in browser)
it("opens the draft in the browser when the companion interrupts on a write", async () => {
const res = await runDesktopChat({
organizationId: "org_1", userId: "user_1", threadId: "thr_1", agentId: "desktop",
request: "draft an NDA",
getAgentImpl: makeFakeDesktopGraphWithPendingInterrupt(),
...baseFakes,
})
const final = await readFinalChunk(res)
expect(final.openInBrowser).toBe(true)
expect(final.deepLink).toContain("/tasks/chats/thr_1")
expect(final.spoken.toLowerCase()).toContain("draft")
})
Helpers: if makeFakeDesktopGraph / makeFakeDesktopGraphWithPendingInterrupt / drain / readFinalChunk don't exist, build them by mirroring the existing desktop fake-graph + stream-reading already in run.test.ts. The pending-interrupt fake's getState returns { values: { messages }, tasks: [{ interrupts: [{ value: { kind: "document_write_confirm" } }] }] } and its stream emits a companion message then ends WITHOUT a finalize_answer.
Step 3: Run to verify they fail — cd packages/chat-runtime && bun test src/run.test.ts.
Step 4: Add the seam + summary write — import updateThreadSummary from ./threads; add to RunDesktopChatInput: updateThreadSummaryImpl?: typeof updateThreadSummary; resolve const updateThreadSummaryFn = input.updateThreadSummaryImpl ?? updateThreadSummary. In the completion block, after harvestFinalizeAnswer + prose persistence:
// Denormalize the spoken summary onto the thread for the router's
// digest (best-effort — a summary write must never fail the turn).
const summaryText = finalize?.spoken ?? companionProse.slice(0, 280)
if (summaryText.length > 0) {
try {
await updateThreadSummaryFn(
input.organizationId, input.userId, input.threadId,
summaryText.slice(0, 280)
)
} catch (error) {
console.error(`Failed to write thread summary for ${input.threadId}`, error)
}
}
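One detail worth checking when wiring this in: `??` only falls back on null or undefined, so a finalize_answer with an empty spoken string would skip the prose fallback and write nothing. The helper below is a possible variant, not the plan's code; `pickThreadSummary` is a hypothetical name and the blank-string handling is an assumption about the desired behavior.

```typescript
/**
 * Picks the text to denormalize onto the thread: the spoken answer if it has
 * content, else the companion prose, capped at maxLen. Returns null when
 * neither has content, so the caller can skip the write.
 */
function pickThreadSummary(
  spoken: string | undefined,
  prose: string,
  maxLen = 280
): string | null {
  const candidate = (spoken ?? "").trim() || prose.trim()
  return candidate.length > 0 ? candidate.slice(0, maxLen) : null
}

const fromSpoken = pickThreadSummary("The PSA is signed.", "Longer companion prose.")
const fromProse = pickThreadSummary("   ", "Longer companion prose.")
const nothingToWrite = pickThreadSummary(undefined, "")
```

`fromSpoken` is the spoken line, `fromProse` falls through to the prose, and `nothingToWrite` is null. Whether a blank spoken string should fall back at all is a product decision; the original `??` form is equally valid if finalize_answer never emits an empty string.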
Step 5: Detect the pending write-interrupt → open in browser — after const state = await graph.getState(...), BEFORE the normal-final emit:
// Layer A4: a draft / write tool pauses the graph on an in-tool
// interrupt (document_write_confirm / document_edit_review). Voice
// has no confirm card, so we DON'T commit the write — we open the
// thread in the browser (where the confirm/edit UI lives) and speak a
// short line. Detected by a pending task interrupt with no
// finalize_answer harvested.
const hasPendingInterrupt =
!finalize &&
Array.isArray((state as { tasks?: unknown[] }).tasks) &&
(state as { tasks: { interrupts?: unknown[] }[] }).tasks.some(
(task) => (task.interrupts?.length ?? 0) > 0
)
if (hasPendingInterrupt) {
const draftDeepLink = buildDeepLink(webBase, input.threadId, null)
await setPendingDeepLinkFn(
input.organizationId, input.userId, input.threadId, draftDeepLink
)
emit(buildClickyFinalChunk({
spoken: "I've drafted that — opening North so you can review and save it.",
openInBrowser: true,
deepLink: draftDeepLink,
threadId: input.threadId,
}))
finalSeen = true
emit({ type: "finish" })
controller.close()
return
}
emit, finalSeen, controller, setPendingDeepLinkFn, buildDeepLink, webBase are all already in scope in this block.
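The double cast in the detection above can be swapped for a small structural check that accepts `unknown`. This is a sketch under the assumption that the state snapshot exposes `tasks[].interrupts` as arrays, as the fake in Step 2 does; both function names are hypothetical.

```typescript
/** Counts pending interrupts across all tasks of a state snapshot of unknown shape. */
function pendingInterruptCount(state: unknown): number {
  if (typeof state !== "object" || state === null) return 0
  const tasks = (state as { tasks?: unknown }).tasks
  if (!Array.isArray(tasks)) return 0
  let count = 0
  for (const task of tasks) {
    // Optional chaining keeps null or primitive entries from throwing.
    const interrupts = (task as { interrupts?: unknown } | null)?.interrupts
    if (Array.isArray(interrupts)) count += interrupts.length
  }
  return count
}

/** True when the graph paused on an interrupt and no finalize_answer was harvested. */
function hasPendingInterrupt(state: unknown, finalizeHarvested: boolean): boolean {
  return !finalizeHarvested && pendingInterruptCount(state) > 0
}

// Same shape as the pending-interrupt fake described in the test helpers.
const pausedState = {
  values: { messages: [] },
  tasks: [{ interrupts: [{ value: { kind: "document_write_confirm" } }] }],
}
const shouldOpenBrowser = hasPendingInterrupt(pausedState, false)
```

`shouldOpenBrowser` is true here. The same state with a harvested finalize_answer, or a state whose tasks carry no interrupts, takes the normal-final path.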
Step 6: Run tests — PASS (summary written; draft opens in browser).
Step 7: Commit
git add packages/chat-runtime/src/run.ts packages/chat-runtime/src/run.test.ts
git commit -m "feat(chat-runtime): write thread summary on turn end; open drafts in the browser"
Task 9: Harden the error-path checkpoint config (Layer A5)
Files: Modify packages/chat-runtime/src/run.ts (catch + any getState).
Step 1: Audit — confirm BOTH graph.stream(...) (≈ line 777) and the completion graph.getState(...) (≈ line 944) pass configurable.thread_id: input.threadId (they do). There must be NO bare getState() in the error handling.
Step 2: Guard test — a desktop run whose graph throws still closes with a parseable fallback final and never reads state config-less:
it("a throwing desktop graph emits a fallback final", async () => {
const res = await runDesktopChat({
organizationId: "org_1", userId: "user_1", threadId: "thr_1", agentId: "desktop",
request: "boom",
getAgentImpl: makeThrowingDesktopGraph(), // stream() throws after start
...baseFakes,
})
const final = await readFinalChunk(res)
expect(final.spoken).toContain("something went wrong")
})
Step 3: Run the test — PASS (existing 64f47bf fallback preserved; no NULL-thread_id state read).
Step 4: Commit
git add packages/chat-runtime/src/run.ts packages/chat-runtime/src/run.test.ts
git commit -m "fix(chat-runtime): desktop error path never reads checkpoint state without thread_id"
Task 10: Full verification + docs
Step 1: Typecheck + lint from root — bun run typecheck && bun run lint. Expected: PASS.
Step 2: Touched-package test suites (export surfaces changed → full suites per CLAUDE.md):
cd packages/agent-runtime && bun test
cd ../chat-runtime && bun test
cd ../../apps/clicky-gateway && bun test
Step 3: Apply the migration locally
bun run db:up
bun run db:migrate
bun --filter=@workspace/db provision
Step 4: Update CLAUDE.md — under "Agent runtime + skills", add desktop-router to the agent list (one line: routes voice turns to continue/resume/new).
Step 5: Commit docs — git add CLAUDE.md && git commit -m "docs: note the desktop-router agent in the runtime overview".
Step 6: Push — git push.
Notes for the executor
- Server-side caveat: every change is server-side (gateway + runtimes). To take effect on the live voice path, the cosmic clicky-gateway must be redeployed (ssh cosmic && cd ~/workspace/north-os && git pull && <restart clicky-gateway>). Polaris needs no change in this plan.
- Migrations: Task 4 is the only schema change. NEVER hand-write; use bun db:generate only. Watch the journal when watermark (head 1781395811000).
- Model id: the router and ack use createModel("claude-haiku-4-5"); match the existing ack graph's call exactly.
- Bun only. All test commands are bun test.