north_os · DESIGN.md · 12 June 2026, 01:06

Durable Desktop Voice Path + Topic-Aware Thread Router

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Context

Goal: Make the desktop voice path survive multi-turn conversations (it currently crashes on the second thread=continue turn) and add a desktop-router agent that routes each voice turn to continue / resume-older / new thread using a digest of the last 10 threads.

Architecture: Layer A strips control-plane system messages out of checkpointed channels, defensively filters mid-history system messages before every model call, repairs unpaired tool blocks across turns, and opens drafts in the browser instead of crashing. Layer B adds a Haiku desktop-router called from the gateway dispatch step before runDesktopChat, backed by a denormalized chat_thread.summary column.

Tech Stack: TypeScript (strict), Bun, LangGraph 1.3, @langchain/anthropic, Drizzle + Postgres (RLS, 4-role), Hono (clicky-gateway), bun test.

Spec: docs/superpowers/specs/2026-06-12-desktop-durable-thread-router-design.md

Proven root cause: the desktop graph's ack node writes a SystemMessage (the priorTurnSummary note) into the checkpointed ackMessages channel. On a continue-turn the checkpointer restores it at position > 0 and Anthropic's converter throws "System messages are only permitted as the first passed message." (cosmic gateway trace, 2026-06-11, reproduced 3×).
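
The failure mode can be reproduced without LangGraph. The sketch below is a simplified model, not the real checkpointer, ack node, or Anthropic converter: Msg, assertSystemFirst, runTurn, and simulateTwoTurns are hypothetical stand-ins. It assumes only that a node persists a system message into a restored channel and that the converter rejects any system message after index 0.

```typescript
// Hypothetical stand-ins: none of these are LangGraph or @langchain/anthropic APIs.
type Role = "system" | "human" | "ai"
interface Msg { role: Role; text: string }

// Models the converter rule quoted above: a system message is valid only at index 0.
function assertSystemFirst(messages: Msg[]): void {
  messages.forEach((m, i) => {
    if (m.role === "system" && i > 0) {
      throw new Error("System messages are only permitted as the first passed message.")
    }
  })
}

// One simulated turn: restore the channel, prepend a fresh prompt, "call the
// model", then return what gets persisted for the next turn.
function runTurn(channel: Msg[], text: string, stripSystem: boolean): Msg[] {
  const input: Msg[] = [
    { role: "system", text: "PROMPT" },
    ...channel,
    { role: "human", text },
  ]
  assertSystemFirst(input)
  const output: Msg[] = [...input, { role: "ai", text: `ack: ${text}` }]
  // The fix: never let a system message reach the persisted channel.
  return stripSystem ? output.filter((m) => m.role !== "system") : output
}

function simulateTwoTurns(stripSystem: boolean): "ok" | "crash" {
  try {
    const afterTurnOne = runTurn([], "turn one", stripSystem)
    runTurn(afterTurnOne, "turn two", stripSystem)
    return "ok"
  } catch {
    return "crash"
  }
}
```

In this model the first turn always succeeds. Only the second turn sees the persisted system message behind the fresh prompt, which matches the "crashes on the second continue turn" symptom.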

File Structure

| File | Responsibility | State |
| --- | --- | --- |
| packages/agent-runtime/src/agents/desktop/graph.ts | ack node: strip system msgs from ackMessages; pass priorTurnSummary as a HumanMessage. Companion wrapper: repair unpaired tool blocks. | Changed |
| packages/agent-runtime/src/agents/desktop-ack/graph.ts | llmCall: filter SystemMessage from state.messages before model call. | Changed |
| packages/agent-runtime/src/agents/screen-companion/graph.ts | llmCall: same defensive filter. | Changed |
| packages/agent-runtime/src/messages/pair-tool-calls.ts | pairToolCalls(messages): drop a trailing unpaired tool_use. | New |
| packages/agent-runtime/src/agents/desktop-router/{prompt,route,index}.ts | Router prompt + one Haiku classification → RouteDecision. | New |
| packages/chat-runtime/src/threads.ts | listThreadDigest, updateThreadSummary. | Changed |
| packages/db/src/schema/agent.ts + migrations/0037_*.sql | chat_thread.summary column. | Changed / generated |
| packages/chat-runtime/src/run.ts | Write summary on turn end; write-interrupt → open-in-browser; error-path config. | Changed |
| apps/clicky-gateway/src/dispatch.ts | Call the router, obey the decision. | Changed |

Task 1: pairToolCalls helper (Layer A3)

A trailing tool_use with no matching tool_result makes the next Anthropic turn invalid. This pure helper drops it.

Files: Create packages/agent-runtime/src/messages/pair-tool-calls.ts + .test.ts.

Step 1: Write the failing test

import { describe, expect, it } from "bun:test"
import { AIMessage, HumanMessage, ToolMessage } from "@langchain/core/messages"
import { pairToolCalls } from "./pair-tool-calls"

describe("pairToolCalls", () => {
  it("keeps a tool_use that has its matching tool_result", () => {
    const msgs = [
      new HumanMessage("hi"),
      new AIMessage({ content: "", tool_calls: [{ id: "t1", name: "x", args: {} }] }),
      new ToolMessage({ content: "ok", tool_call_id: "t1", name: "x" }),
    ]
    expect(pairToolCalls(msgs)).toHaveLength(3)
  })

  it("drops a trailing AIMessage whose tool_use has no tool_result", () => {
    const msgs = [
      new HumanMessage("hi"),
      new AIMessage({ content: "", tool_calls: [{ id: "t1", name: "x", args: {} }] }),
    ]
    const out = pairToolCalls(msgs)
    expect(out).toHaveLength(1)
    expect(out[0]).toBeInstanceOf(HumanMessage)
  })

  it("leaves a plain conversation untouched", () => {
    const msgs = [new HumanMessage("hi"), new AIMessage("hello")]
    expect(pairToolCalls(msgs)).toHaveLength(2)
  })
})

Step 2: Run to verify it fails: cd packages/agent-runtime && bun test src/messages/pair-tool-calls.test.ts. Expected: FAIL, Cannot find module './pair-tool-calls'.

Step 3: Implement

import { AIMessage, type BaseMessage, ToolMessage } from "@langchain/core/messages"

/**
 * Returns `messages` with any UNPAIRED `tool_use` removed. Anthropic rejects a
 * turn whose `tool_use` block has no corresponding `tool_result`; a checkpoint
 * restored mid-tool-loop (e.g. an interrupted turn) can leave exactly that.
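 * Collect every `tool_call_id` answered by a ToolMessage, then drop any
 * AIMessage whose tool_calls are not all answered, plus any ToolMessage that
 * answered a dropped AIMessage (an orphaned `tool_result` is just as invalid).
 * Order preserved.
 */
export function pairToolCalls(messages: BaseMessage[]): BaseMessage[] {
  const answeredIds = new Set<string>()
  for (const message of messages) {
    if (ToolMessage.isInstance(message) && message.tool_call_id) {
      answeredIds.add(message.tool_call_id)
    }
  }
  // Ids belonging to dropped AIMessages; their partial results must go too.
  const droppedIds = new Set<string>()
  const kept = messages.filter((message) => {
    if (!AIMessage.isInstance(message)) return true
    const calls = message.tool_calls ?? []
    if (calls.every((call) => call.id != null && answeredIds.has(call.id))) return true
    for (const call of calls) if (call.id != null) droppedIds.add(call.id)
    return false
  })
  return kept.filter(
    (message) => !(ToolMessage.isInstance(message) && droppedIds.has(message.tool_call_id))
  )
}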

Step 4: Run to verify it passes — same command, 3 tests PASS.

Step 5: Export — in packages/agent-runtime/src/index.ts add export { pairToolCalls } from "./messages/pair-tool-calls".

Step 6: Commit

git add packages/agent-runtime/src/messages/pair-tool-calls.ts packages/agent-runtime/src/messages/pair-tool-calls.test.ts packages/agent-runtime/src/index.ts
git commit -m "feat(agent): pairToolCalls helper to drop unpaired tool_use blocks"

Task 2: Filter system messages in ack + companion llmCall (Layer A2)

Defense for the whole class: a restored checkpoint can never resurface a mid-history system message.

Files: Modify desktop-ack/graph.ts llmCall + screen-companion/graph.ts llmCall; add a case to desktop-ack/graph.test.ts.

Step 1: Write the failing test (ack) — add to desktop-ack/graph.test.ts (ensure BaseMessage, BaseChatModel, AIMessage, SystemMessage, HumanMessage are imported):

it("does not forward a stray mid-history SystemMessage to the model", async () => {
  const seen: BaseMessage[][] = []
  const fakeModel = {
    bindTools: () => fakeModel,
    invoke: async (messages: BaseMessage[]) => {
      seen.push(messages)
      return new AIMessage({
        content: "",
        tool_calls: [{ id: "a1", name: "finalize_ack", args: { spokenAck: "ok", intent: "chitchat", openInBrowser: false } }],
      })
    },
  } as unknown as BaseChatModel
  const graph = buildDesktopAckGraph({ model: fakeModel })
  await graph.invoke({
    messages: [new HumanMessage("first"), new SystemMessage("STRAY"), new HumanMessage("second")],
  })
  const firstCall = seen[0]!
  expect(firstCall.filter((m) => m._getType() === "system").length).toBe(1)
  expect(firstCall[0]!._getType()).toBe("system")
})

Step 2: Run to verify it fails: cd packages/agent-runtime && bun test src/agents/desktop-ack/graph.test.ts. Expected: FAIL, system count is 2.

Step 3: Implement the filter (ack llmCall)

  const llmCall: GraphNode<typeof AgentState> = async (state) => {
    const modelWithTools = model.bindTools?.(tools) ?? model
    // Defensive: a restored checkpoint must never resurface a SystemMessage at
    // position > 0 — Anthropic's converter rejects it. We prepend exactly one
    // system prompt, so drop any system message from the prior history.
    const priorMessages = state.messages.filter((m) => m._getType() !== "system")
    const response = await modelWithTools.invoke([
      new SystemMessage(DESKTOP_ACK_PROMPT),
      ...priorMessages,
    ])
    return { messages: [response] }
  }

Step 4: Implement the same filter (companion llmCall)

  const llmCall: GraphNode<typeof AgentState> = async (state) => {
    const modelWithTools = model.bindTools?.(tools) ?? model
    const priorMessages = state.messages.filter((m) => m._getType() !== "system")
    const response = await modelWithTools.invoke([
      new SystemMessage(SCREEN_COMPANION_PROMPT),
      ...priorMessages,
    ])
    return { messages: [response] }
  }

Step 5: Run tests: bun test src/agents/desktop-ack/graph.test.ts src/agents/screen-companion/graph.test.ts. Expected: PASS.

Step 6: Commit

git add packages/agent-runtime/src/agents/desktop-ack/graph.ts packages/agent-runtime/src/agents/desktop-ack/graph.test.ts packages/agent-runtime/src/agents/screen-companion/graph.ts
git commit -m "fix(agent): filter mid-history system messages before the model call"

Task 3: Stop persisting system messages in ackMessages + summary as Human (Layer A1)

The actual production crash source: summaryNote is a SystemMessage and res.messages (system included) is written to the checkpointed ackMessages.

Files: Modify desktop/graph.ts (ack + companion nodes, imports); add a two-turn case to desktop/graph.test.ts.

Step 1: Write the failing test (two-turn continue, real MemorySaver)

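import { MemorySaver } from "@langchain/langgraph"
// BaseChatModel is used by the fakes below; skip this line if the test file already imports it.
import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { SystemMessage, HumanMessage, AIMessage, BaseMessage } from "@langchain/core/messages"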

it("survives a second continue-turn on the same thread (no stray system message)", async () => {
  const checkpointer = new MemorySaver()
  const ackModel = {
    bindTools: () => ackModel,
    invoke: async (messages: BaseMessage[]) => {
      const strayCount = messages.slice(1).filter((m) => m._getType() === "system").length
      if (strayCount > 0) throw new Error("System messages are only permitted as the first passed message.")
      return new AIMessage({ content: "", tool_calls: [{ id: "k", name: "finalize_ack", args: { spokenAck: "hi", intent: "new_request", openInBrowser: true } }] })
    },
  } as unknown as BaseChatModel
  const companionModel = {
    bindTools: () => companionModel,
    invoke: async () => new AIMessage({ content: "", tool_calls: [{ id: "f", name: "finalize_answer", args: { spoken: "done", openInBrowser: true, matter: { kind: "general" } } }] }),
  } as unknown as BaseChatModel

  const graph = buildDesktopGraph({ checkpointer, ackModel, companionModel })
  const cfg = { configurable: { thread_id: "t-continue" } }
  await graph.invoke({ messages: [new HumanMessage("turn one")], priorTurnSummary: null }, cfg)
  await expect(
    graph.invoke({ messages: [new HumanMessage("turn two")], priorTurnSummary: "Last turn: discussed the lease." }, cfg)
  ).resolves.toBeDefined()
})

Step 2: Run to verify it fails: bun test src/agents/desktop/graph.test.ts. Expected: FAIL, turn two throws the converter error.

Step 3: Implement — summary as HumanMessage + strip system from ackMessages

  // ack node: prefix the human turn with the optional prior-turn summary as a
  // HUMAN message (never a SystemMessage — a system message persisted into the
  // checkpointed ackMessages channel resurfaces at position > 0 on a continue-
  // turn and Anthropic rejects it). Strip any system message from the output.
  const ack: GraphNode<typeof DesktopState> = async (state) => {
    const summaryNote = state.priorTurnSummary
      ? [new HumanMessage(`Context on the previous turn: ${state.priorTurnSummary}`)]
      : []
    const res = await ackGraph.invoke({
      messages: [...summaryNote, ...state.messages],
    })
    return {
      ackMessages: res.messages.filter((m) => m._getType() !== "system"),
    }
  }

Update the import: the file imports SystemMessage — replace with HumanMessage (remove SystemMessage if now unused, for lint).

Step 4: Repair unpaired tool blocks in the companion wrapper

  const companion: GraphNode<typeof DesktopState> = async (state) => {
    // Restored history may carry a trailing unpaired tool_use (an interrupted
    // prior turn). Drop it so the companion's first model call is valid.
    const safeMessages = pairToolCalls(state.messages)
    const res = await companionGraph.invoke({ messages: safeMessages })
    return { messages: res.messages.slice(safeMessages.length) }
  }

Add import: import { pairToolCalls } from "../../messages/pair-tool-calls".

Step 5: Run to verify it passes — both turns resolve.

Step 6: Commit

git add packages/agent-runtime/src/agents/desktop/graph.ts packages/agent-runtime/src/agents/desktop/graph.test.ts
git commit -m "fix(agent): desktop path survives continue-turns (no persisted system message)"

Task 4: chat_thread.summary column + migration (Layer B6)

Files: Modify packages/db/src/schema/agent.ts; generate migrations/0037_*.sql (+ snapshot + journal).

Step 1: Add the column — in chatThread's columns, after pendingDeepLink:

    // Denormalized one-line summary of the thread's last substantive answer,
    // written from finalize_answer.spoken at desktop turn end. Powers the
    // desktop-router's last-10-threads digest and the history list. Nullable.
    summary: text("summary"),

Step 2: Generate the migration: bun db:generate. Expected: 0037_*.sql with ALTER TABLE "chat_thread" ADD COLUMN "summary" text; + new snapshot + journal entry.

Step 3: Verify the journal "when" watermark. The head entry is idx: 36, when: 1781395811000. If 0037's "when" is not strictly greater, bump ONLY that "when" to 1781395812000 (journal metadata only, never the SQL). Drizzle-kit skips migrations whose "when" falls below the DB watermark (see the CLAUDE.md note).
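
The watermark check can be done mechanically. The sketch below is one possible helper, not part of drizzle-kit. JournalEntry covers only the idx, when, and tag fields of the entries in meta/_journal.json. findStaleEntries and the sample tags are hypothetical names chosen for illustration.

```typescript
// Only the journal fields this check needs; real entries carry more.
interface JournalEntry { idx: number; when: number; tag: string }

// Returns every entry whose `when` is not strictly greater than the highest
// `when` seen among earlier entries (ordered by idx).
function findStaleEntries(entries: JournalEntry[]): JournalEntry[] {
  const stale: JournalEntry[] = []
  let watermark = Number.NEGATIVE_INFINITY
  for (const entry of [...entries].sort((a, b) => a.idx - b.idx)) {
    if (entry.when <= watermark) {
      stale.push(entry)
    } else {
      watermark = entry.when
    }
  }
  return stale
}

// Hypothetical sample: the head values come from the step above, the tags do not.
const sampleHead: JournalEntry = { idx: 36, when: 1781395811000, tag: "0036_example" }
const staleNew: JournalEntry = { idx: 37, when: 1781395000000, tag: "0037_example" }
const bumpedNew: JournalEntry = { ...staleNew, when: 1781395812000 }

const staleBeforeBump = findStaleEntries([sampleHead, staleNew])
const staleAfterBump = findStaleEntries([sampleHead, bumpedNew])
```

With these sample values, staleBeforeBump flags entry 37, and staleAfterBump is empty once the "when" is bumped past the head.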

Step 4: Review the SQL — it must be exactly the ADD COLUMN; no RLS, no GRANT, no other table. If not, fix the schema TS and re-run bun db:generate (never hand-edit SQL).
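
The review can also be backed by a small check. This is a sketch under assumptions: isExactAddColumn is a hypothetical helper, and it assumes the generated file puts each statement on one line, optionally followed by a "--> statement-breakpoint" marker. Anything it cannot recognize is treated as a failed review.

```typescript
const EXPECTED_STATEMENT = 'ALTER TABLE "chat_thread" ADD COLUMN "summary" text;'

// True only when the migration contains exactly the expected ADD COLUMN and
// nothing else (no RLS, no GRANT, no other table).
function isExactAddColumn(sql: string): boolean {
  const statements = sql
    .split("\n")
    // Drop breakpoint markers, whether on their own line or trailing a statement.
    .map((line) => line.replace(/-->.*$/, "").trim())
    // Drop blank lines and plain SQL comments.
    .filter((line) => line.length > 0 && !line.startsWith("--"))
  return statements.length === 1 && statements[0] === EXPECTED_STATEMENT
}

const cleanMigration = 'ALTER TABLE "chat_thread" ADD COLUMN "summary" text;\n'
const pollutedMigration =
  'ALTER TABLE "chat_thread" ADD COLUMN "summary" text;--> statement-breakpoint\n' +
  'GRANT SELECT ON "chat_thread" TO some_role;\n'
```

A false result is the signal to fix the schema TS and re-run the generator, not to edit the SQL by hand.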

Step 5: Commit schema + migration + snapshot together

git add packages/db/src/schema/agent.ts packages/db/src/migrations/0037_*.sql packages/db/src/migrations/meta/
git commit -m "feat(db): add chat_thread.summary column for the thread-router digest"

Task 5: listThreadDigest + updateThreadSummary (Layer B6)

Files: Modify packages/chat-runtime/src/threads.ts (+ index export); add tests to threads.test.ts.

Step 1: Write the failing test — contract test for the type; if the file has a DB harness, ADD a real test (insert 11 threads for the user → expect length 10, newest-first; a thread by another user/org is excluded — use the harness's insert helpers, don't invent DB setup):

import { describe, expect, it } from "bun:test"
import type { ThreadDigestEntry } from "./threads"

describe("ThreadDigestEntry shape", () => {
  it("carries id, title, summary, lastActivityAt", () => {
    const entry: ThreadDigestEntry = {
      id: "thr_1", title: "Lease review",
      summary: "Discussed the rent escalation clause.",
      lastActivityAt: new Date("2026-06-12T10:00:00Z"),
    }
    expect(entry.summary).toBe("Discussed the rent escalation clause.")
  })
})

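Step 2: Run to verify it fails: cd packages/chat-runtime && bun test src/threads.test.ts. Expected: FAIL, ThreadDigestEntry not exported. This surfaces in the typecheck, or in the DB-harness test if you added one; bun test erases type-only imports without typechecking, so the contract test alone still passes at runtime.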

Step 3: Implement (near listThreads; reuses withRlsTransaction, and, eq, desc, chatThread already imported)

/** One entry in the desktop-router's recent-threads digest. */
export interface ThreadDigestEntry {
  id: string
  title: string
  summary: string | null
  lastActivityAt: Date
}

/**
 * The caller's most-recent threads (newest first), capped, for the router.
 * One row per thread — no message-table fan-out (summary is denormalized).
 * Firm + created_by scoped, exactly like listThreads.
 */
export async function listThreadDigest(
  organizationId: string,
  userId: string,
  limit = 10
): Promise<ThreadDigestEntry[]> {
  return withRlsTransaction(db, organizationId, async (tx) => {
    const rows = await tx
      .select({
        id: chatThread.id,
        title: chatThread.title,
        summary: chatThread.summary,
        updatedAt: chatThread.updatedAt,
      })
      .from(chatThread)
      .where(
        and(
          eq(chatThread.organizationId, organizationId),
          eq(chatThread.createdBy, userId)
        )
      )
      .orderBy(desc(chatThread.updatedAt))
      .limit(limit)
    return rows.map((r) => ({
      id: r.id,
      title: r.title,
      summary: r.summary,
      lastActivityAt: r.updatedAt,
    }))
  })
}

/**
 * Writes the denormalized one-line summary for a thread the caller owns.
 * Silent no-op if not the caller's, mirroring updateThreadModel.
 */
export async function updateThreadSummary(
  organizationId: string,
  userId: string,
  threadId: string,
  summary: string
): Promise<void> {
  await withRlsTransaction(db, organizationId, async (tx) => {
    await tx
      .update(chatThread)
      .set({ summary })
      .where(
        and(
          eq(chatThread.id, threadId),
          eq(chatThread.organizationId, organizationId),
          eq(chatThread.createdBy, userId)
        )
      )
  })
}

Step 4: Export — in packages/chat-runtime/src/index.ts, export listThreadDigest, updateThreadSummary, and the ThreadDigestEntry type (match the file's existing export style; if it uses export * from "./threads", no edit needed — verify).

Step 5: Run tests — PASS.

Step 6: Commit

git add packages/chat-runtime/src/threads.ts packages/chat-runtime/src/threads.test.ts packages/chat-runtime/src/index.ts
git commit -m "feat(chat-runtime): listThreadDigest + updateThreadSummary for the router"

Task 6: desktop-router agent — prompt + route (Layer B1–B5, B8)

Files: Create desktop-router/{prompt,route,index}.ts + route.test.ts; export from the package index.

Step 1: Write the prompt: desktop-router/prompt.ts

export const DESKTOP_ROUTER_PROMPT = `You route a lawyer's spoken request to the right conversation thread on their desktop assistant. You do NOT answer the request — you only pick the thread.

You receive: the new spoken request, an optional note about the thread they were just on (the "active thread"), and a digest of their last ~10 threads (id, title, one-line summary).

Call route EXACTLY ONCE with one of:
- kind "continue": the request is a follow-up to the ACTIVE thread's topic. Default when there is an active thread and the new request clearly continues it.
- kind "resume": the request unambiguously refers to an OLDER thread in the digest (e.g. "pull up that Brooklyn Gardens lease from yesterday"). Set threadId to that thread's id.
- kind "new": a new topic, OR there is no active thread, OR you are at all unsure. When in doubt, choose "new" — mis-routing a conversation into the wrong thread is worse than starting a fresh one.

Be conservative. Only "resume" on a clear, specific match to a digest entry. Only "continue" when the topic genuinely carries over. Otherwise "new".

Respond ONLY by calling route. No prose.`

Step 2: Write the failing test: desktop-router/route.test.ts

import { describe, expect, it } from "bun:test"
import { AIMessage } from "@langchain/core/messages"
import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { routeDesktopTurn } from "./route"

function modelReturning(args: unknown): BaseChatModel {
  const m = {
    bindTools: () => m,
    invoke: async () =>
      new AIMessage({ content: "", tool_calls: [{ id: "r", name: "route", args: args as Record<string, unknown> }] }),
  } as unknown as BaseChatModel
  return m
}

const digest = [
  { id: "thr_active", title: "145 W 42nd deal", summary: "Reviewed the PSA.", lastActivityAt: new Date() },
  { id: "thr_old", title: "Brooklyn Gardens lease", summary: "Rent escalation.", lastActivityAt: new Date() },
]

describe("routeDesktopTurn", () => {
  it("returns continue with the active thread id", async () => {
    expect(await routeDesktopTurn({ request: "and the closing date?", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "continue" }) }))
      .toEqual({ kind: "continue", threadId: "thr_active" })
  })
  it("returns resume with the referenced older thread id", async () => {
    expect(await routeDesktopTurn({ request: "pull up that Brooklyn Gardens lease", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "resume", threadId: "thr_old" }) }))
      .toEqual({ kind: "resume", threadId: "thr_old" })
  })
  it("returns new on a fresh topic", async () => {
    expect(await routeDesktopTurn({ request: "draft an NDA", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "new" }) }))
      .toEqual({ kind: "new" })
  })
  it("continue with no active thread is incoherent -> new", async () => {
    expect(await routeDesktopTurn({ request: "hello", activeThreadId: undefined, digest, model: modelReturning({ kind: "continue" }) }))
      .toEqual({ kind: "new" })
  })
  it("resume to a thread not in the digest -> safe default", async () => {
    expect(await routeDesktopTurn({ request: "x", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "resume", threadId: "thr_ghost" }) }))
      .toEqual({ kind: "continue", threadId: "thr_active" })
  })
  it("falls back to the safe default (continue active) on malformed output", async () => {
    expect(await routeDesktopTurn({ request: "x", activeThreadId: "thr_active", digest, model: modelReturning({ kind: "garbage" }) }))
      .toEqual({ kind: "continue", threadId: "thr_active" })
  })
  it("falls back (continue active) when the model throws", async () => {
    const throwing = { bindTools: () => throwing, invoke: async () => { throw new Error("timeout") } } as unknown as BaseChatModel
    expect(await routeDesktopTurn({ request: "x", activeThreadId: "thr_active", digest, model: throwing }))
      .toEqual({ kind: "continue", threadId: "thr_active" })
  })
})

Note: the safe default for a malformed/missing-route output is "continue active if present, else new" — that is why the "garbage" and "ghost" cases resolve to continue thr_active, and the no-active cases resolve to new.

Step 3: Run to verify it fails: bun test src/agents/desktop-router/route.test.ts. Expected: FAIL, module not found.

Step 4: Implement route.ts

import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { AIMessage, HumanMessage, SystemMessage } from "@langchain/core/messages"
import { tool } from "@langchain/core/tools"
import { z } from "zod"

import { createModel } from "../../model"
import { DESKTOP_ROUTER_PROMPT } from "./prompt"

export interface RouterDigestEntry {
  id: string
  title: string
  summary: string | null
  lastActivityAt: Date
}

export type RouteDecision =
  | { kind: "continue"; threadId: string }
  | { kind: "resume"; threadId: string }
  | { kind: "new" }

const routeArgsSchema = z.object({
  kind: z.enum(["continue", "resume", "new"]),
  threadId: z.string().min(1).optional(),
})

const routeTool = tool(async () => "routed", {
  name: "route",
  description:
    "Pick the thread for this turn. 'continue' = follow-up to the active thread; " +
    "'resume' = an older thread via threadId; 'new' = a new topic or when unsure.",
  schema: routeArgsSchema,
})

export interface RouteDesktopTurnInput {
  request: string
  activeThreadId?: string
  digest: RouterDigestEntry[]
  screenshot?: string
  model?: BaseChatModel
}

/**
 * One Haiku classification → a thread RouteDecision. NEVER throws: any model
 * error / timeout / malformed output degrades to the safe default (continue
 * the active thread if there is one, else new) so a routing outage is exactly
 * today's continuity, never a crash.
 */
export async function routeDesktopTurn(
  input: RouteDesktopTurnInput
): Promise<RouteDecision> {
  const safeDefault = (): RouteDecision =>
    input.activeThreadId
      ? { kind: "continue", threadId: input.activeThreadId }
      : { kind: "new" }

  try {
    const model = input.model ?? createModel("claude-haiku-4-5")
    const modelWithTools = model.bindTools?.([routeTool]) ?? model

    const digestLines = input.digest
      .map((entry, i) => `${i + 1}. id=${entry.id} | title="${entry.title}" | summary="${entry.summary ?? "(none)"}"`)
      .join("\n")
    const activeNote = input.activeThreadId
      ? `Active thread id (the one they were just on): ${input.activeThreadId}`
      : "There is no active thread."
    const humanText =
      `${activeNote}\n\nRecent threads:\n${digestLines || "(none)"}\n\n` +
      `New spoken request: ${input.request}`

    const humanContent: HumanMessage["content"] = input.screenshot
      ? [
          { type: "text", text: humanText },
          { type: "image_url", image_url: { url: `data:image/jpeg;base64,${input.screenshot}` } },
        ]
      : humanText

    const response = await modelWithTools.invoke([
      new SystemMessage(DESKTOP_ROUTER_PROMPT),
      new HumanMessage({ content: humanContent }),
    ])

    const call = AIMessage.isInstance(response)
      ? response.tool_calls?.find((c) => c.name === "route")
      : undefined
    if (!call) return safeDefault()

    const parsed = routeArgsSchema.safeParse(call.args)
    if (!parsed.success) return safeDefault()

    const { kind, threadId } = parsed.data
    if (kind === "new") return { kind: "new" }
    if (kind === "continue") {
      return input.activeThreadId
        ? { kind: "continue", threadId: input.activeThreadId }
        : { kind: "new" }
    }
    // resume: require a threadId that exists in the digest.
    if (threadId && input.digest.some((entry) => entry.id === threadId)) {
      return { kind: "resume", threadId }
    }
    return safeDefault()
  } catch {
    return safeDefault()
  }
}

Step 5: Create the index: desktop-router/index.ts

export { routeDesktopTurn } from "./route"
export type { RouteDecision, RouteDesktopTurnInput, RouterDigestEntry } from "./route"
export { DESKTOP_ROUTER_PROMPT } from "./prompt"

Step 6: Export from the package index — in packages/agent-runtime/src/index.ts:

export {
  routeDesktopTurn,
  type RouteDecision,
  type RouteDesktopTurnInput,
  type RouterDigestEntry,
} from "./agents/desktop-router"

Step 7: Run tests — 7 tests PASS.

Step 8: Commit

git add packages/agent-runtime/src/agents/desktop-router/ packages/agent-runtime/src/index.ts
git commit -m "feat(agent): desktop-router agent — routes a voice turn to continue/resume/new"

Task 7: Wire the router into the gateway dispatch (Layer B2, B7)

Files: Modify apps/clicky-gateway/src/dispatch.ts; add cases to dispatch.test.ts (mirror the existing seam-based tests).

Step 1: Write the failing tests — add a routeDesktopTurnImpl + listThreadDigestImpl seam to the test calls:

it("obeys a resume decision: passes the older thread id to runDesktopChat", async () => {
  let seenThreadId: string | undefined
  await dispatch({
    organizationId: "org_1", userId: "user_1",
    input: { request: "pull up Brooklyn Gardens", threadId: "thr_active" },
    routeDesktopTurnImpl: async () => ({ kind: "resume", threadId: "thr_old" }),
    listThreadDigestImpl: async () => [],
    runDesktopChatImpl: async (args) => { seenThreadId = args.threadId; return new Response("ok") },
  })
  expect(seenThreadId).toBe("thr_old")
})

it("obeys a new decision: creates a thread and passes the new id", async () => {
  let seenThreadId: string | undefined
  await dispatch({
    organizationId: "org_1", userId: "user_1",
    input: { request: "draft an NDA", threadId: "thr_active" },
    routeDesktopTurnImpl: async () => ({ kind: "new" }),
    listThreadDigestImpl: async () => [],
    createThreadImpl: async () => "thr_created",
    runDesktopChatImpl: async (args) => { seenThreadId = args.threadId; return new Response("ok") },
  })
  expect(seenThreadId).toBe("thr_created")
})

it("obeys a continue decision: keeps the active thread", async () => {
  let seenThreadId: string | undefined
  await dispatch({
    organizationId: "org_1", userId: "user_1",
    input: { request: "and the closing date?", threadId: "thr_active" },
    routeDesktopTurnImpl: async () => ({ kind: "continue", threadId: "thr_active" }),
    listThreadDigestImpl: async () => [],
    runDesktopChatImpl: async (args) => { seenThreadId = args.threadId; return new Response("ok") },
  })
  expect(seenThreadId).toBe("thr_active")
})

Step 2: Run to verify it fails: cd apps/clicky-gateway && bun test src/dispatch.test.ts. Expected: FAIL, seams not accepted / router not wired.

Step 3: Implement the wiring — imports:

import { createThread, runDesktopChat, listThreadDigest } from "@workspace/chat-runtime"
import { routeDesktopTurn } from "@workspace/agent-runtime"
import type { RouteDecision } from "@workspace/agent-runtime"

Add seams to DispatchInput:

  routeDesktopTurnImpl?: typeof routeDesktopTurn
  listThreadDigestImpl?: typeof listThreadDigest

Replace the thread-resolution block in dispatch(...):

  const createThreadFn = req.createThreadImpl ?? createThread
  const routeFn = req.routeDesktopTurnImpl ?? routeDesktopTurn
  const digestFn = req.listThreadDigestImpl ?? listThreadDigest

  // Router pre-step (spec Layer B): decide which thread this turn belongs to.
  // The router NEVER throws — a failure degrades to continue-active-or-new.
  const digest = await digestFn(req.organizationId, req.userId, 10)
  const decision: RouteDecision = await routeFn({
    request: req.input.request,
    activeThreadId: req.input.threadId,
    digest: digest.map((entry) => ({
      id: entry.id, title: entry.title,
      summary: entry.summary, lastActivityAt: entry.lastActivityAt,
    })),
    screenshot: req.input.screenshot,
  })

  const threadId =
    decision.kind === "new"
      ? await createThreadFn({
          organizationId: req.organizationId,
          userId: req.userId,
          agentId: DESKTOP_SUPERVISOR_AGENT_ID,
          title: req.input.request.slice(0, 80) || "Clicky",
        })
      : decision.threadId

  console.log(
    `[clicky-gateway] dispatch routed: mode=interactive agent=${DESKTOP_SUPERVISOR_AGENT_ID}` +
      ` route=${decision.kind} thread=${threadId}` +
      ` screenshot=${req.input.screenshot ? "present" : "none"}`
  )

Leave the runDesktopChatFn call below unchanged — it already receives threadId.

Step 4: Run tests — existing + 3 new PASS.

Step 5: Commit

git add apps/clicky-gateway/src/dispatch.ts apps/clicky-gateway/src/dispatch.test.ts
git commit -m "feat(gateway): route each voice turn via desktop-router before dispatch"

Task 8: Write thread summary on turn end + open drafts in the browser (Layer A4, B6)

Files: Modify packages/chat-runtime/src/run.ts (the runDesktopChat companion-completion block); add cases to run.test.ts via the existing test seams.

Step 1: Write the failing test (summary written) — add an updateThreadSummaryImpl seam; use the file's existing fake-graph + drain helpers:

it("writes the thread summary from finalize_answer.spoken on completion", async () => {
  let savedSummary: string | undefined
  const res = await runDesktopChat({
    organizationId: "org_1", userId: "user_1", threadId: "thr_1", agentId: "desktop",
    request: "give me context",
    getAgentImpl: makeFakeDesktopGraph({ spoken: "The PSA is signed.", intent: "new_request" }),
    updateThreadSummaryImpl: async (_o, _u, _t, summary) => { savedSummary = summary },
    ...baseFakes,
  })
  await drain(res)
  expect(savedSummary).toBe("The PSA is signed.")
})

Step 2: Write the failing test (draft interrupt → open in browser)

it("opens the draft in the browser when the companion interrupts on a write", async () => {
  const res = await runDesktopChat({
    organizationId: "org_1", userId: "user_1", threadId: "thr_1", agentId: "desktop",
    request: "draft an NDA",
    getAgentImpl: makeFakeDesktopGraphWithPendingInterrupt(),
    ...baseFakes,
  })
  const final = await readFinalChunk(res)
  expect(final.openInBrowser).toBe(true)
  expect(final.deepLink).toContain("/tasks/chats/thr_1")
  expect(final.spoken.toLowerCase()).toContain("draft")
})

Helpers: if makeFakeDesktopGraph / makeFakeDesktopGraphWithPendingInterrupt / drain / readFinalChunk don't exist, build them by mirroring the existing desktop fake-graph + stream-reading already in run.test.ts. The pending-interrupt fake's getState returns { values: { messages }, tasks: [{ interrupts: [{ value: { kind: "document_write_confirm" } }] }] } and its stream emits a companion message then ends WITHOUT a finalize_answer.
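
The state shape above can be pinned down with a few types before wiring it into a fake graph. This is a minimal sketch, not the real LangGraph StateSnapshot type. FakeState, makePendingInterruptState, makeSettledState, and hasPendingInterrupt are hypothetical names, and the predicate follows the detection rule from Step 5 (a pending task interrupt with no finalize_answer harvested).

```typescript
// Hypothetical minimal types: only the fields the helper description names.
interface FakeInterrupt { value: { kind: string } }
interface FakeTask { interrupts?: FakeInterrupt[] }
interface FakeState<M> { values: { messages: M[] }; tasks: FakeTask[] }

// What the pending-interrupt fake's getState would return.
function makePendingInterruptState<M>(
  messages: M[],
  kind = "document_write_confirm"
): FakeState<M> {
  return { values: { messages }, tasks: [{ interrupts: [{ value: { kind } }] }] }
}

// A normally completed turn: no tasks left pending.
function makeSettledState<M>(messages: M[]): FakeState<M> {
  return { values: { messages }, tasks: [] }
}

// Pending interrupt = some task has at least one interrupt AND no
// finalize_answer was harvested for the turn.
function hasPendingInterrupt(
  state: { tasks?: FakeTask[] },
  finalizeHarvested: boolean
): boolean {
  if (finalizeHarvested) return false
  return (state.tasks ?? []).some((task) => (task.interrupts?.length ?? 0) > 0)
}

const pendingState = makePendingInterruptState(["companion message"])
const settledState = makeSettledState(["companion message"])
```

Keeping the predicate separate from the fake makes it easy to assert both branches: the interrupted draft turn and the ordinary completed turn.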

Step 3: Run to verify they fail: cd packages/chat-runtime && bun test src/run.test.ts.

Step 4: Add the seam + summary write — import updateThreadSummary from ./threads; add to RunDesktopChatInput: updateThreadSummaryImpl?: typeof updateThreadSummary; resolve const updateThreadSummaryFn = input.updateThreadSummaryImpl ?? updateThreadSummary. In the completion block, after harvestFinalizeAnswer + prose persistence:

          // Denormalize the spoken summary onto the thread for the router's
          // digest (best-effort — a summary write must never fail the turn).
          const summaryText = (finalize?.spoken ?? companionProse).slice(0, 280)
          if (summaryText.length > 0) {
            try {
              await updateThreadSummaryFn(
                input.organizationId, input.userId, input.threadId,
                summaryText
              )
            } catch (error) {
              console.error(`Failed to write thread summary for ${input.threadId}`, error)
            }
          }
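
The seam and the best-effort write can also be read as two small functions, which may make the truncation and the swallow-on-failure behavior easier to test in isolation. This is a sketch under assumptions: `UpdateThreadSummary`, `pickSummary`, `writeSummaryBestEffort`, and `SUMMARY_MAX_CHARS` are hypothetical names, and the four-argument signature is inferred from the call in this step rather than taken from ./threads.

```typescript
// Hypothetical signature, inferred from the call site in this step.
type UpdateThreadSummary = (
  organizationId: string,
  userId: string,
  threadId: string,
  summary: string
) => Promise<void>

const SUMMARY_MAX_CHARS = 280

// Prefer the spoken finalize_answer; fall back to the companion prose.
// Either way the result is capped at SUMMARY_MAX_CHARS.
function pickSummary(spoken: string | undefined, prose: string): string {
  return (spoken ?? prose).slice(0, SUMMARY_MAX_CHARS)
}

// Best-effort write: resolves to false instead of throwing, so a failed
// summary write can never fail the turn.
async function writeSummaryBestEffort(
  update: UpdateThreadSummary,
  ids: { organizationId: string; userId: string; threadId: string },
  summary: string
): Promise<boolean> {
  if (summary.length === 0) return false
  try {
    await update(ids.organizationId, ids.userId, ids.threadId, summary)
    return true
  } catch (error) {
    console.error(`Failed to write thread summary for ${ids.threadId}`, error)
    return false
  }
}
```

In a test, the `update` argument plays the same role as updateThreadSummaryImpl: pass a recording stub and assert on what it received.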

Step 5: Detect the pending write-interrupt → open in browser — after const state = await graph.getState(...), BEFORE the normal-final emit:

          // Layer A4: a draft / write tool pauses the graph on an in-tool
          // interrupt (document_write_confirm / document_edit_review). Voice
          // has no confirm card, so we DON'T commit the write — we open the
          // thread in the browser (where the confirm/edit UI lives) and speak a
          // short line. Detected by a pending task interrupt with no
          // finalize_answer harvested.
          const hasPendingInterrupt =
            !finalize &&
            Array.isArray((state as { tasks?: unknown[] }).tasks) &&
            (state as { tasks: { interrupts?: unknown[] }[] }).tasks.some(
              (task) => (task.interrupts?.length ?? 0) > 0
            )
          if (hasPendingInterrupt) {
            const draftDeepLink = buildDeepLink(webBase, input.threadId, null)
            await setPendingDeepLinkFn(
              input.organizationId, input.userId, input.threadId, draftDeepLink
            )
            emit(buildClickyFinalChunk({
              spoken: "I've drafted that — opening North so you can review and save it.",
              openInBrowser: true,
              deepLink: draftDeepLink,
              threadId: input.threadId,
            }))
            finalSeen = true
            emit({ type: "finish" })
            controller.close()
            return
          }

emit, finalSeen, controller, setPendingDeepLinkFn, buildDeepLink, webBase are all already in scope in this block.
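
The inline casts in the detection can be replaced by a small narrowing helper if that reads better in run.ts. The version below is a sketch, not the plan's required form: the helper signature and `TaskLike` are hypothetical, and the snapshot shape covers only the fields the check reads.

```typescript
// Assumed minimal task shape: only the field the check inspects.
type TaskLike = { interrupts?: unknown[] }

// True when no finalize_answer was harvested AND at least one task in the
// state snapshot carries a non-empty interrupts array.
function hasPendingInterrupt(state: unknown, finalizeHarvested: boolean): boolean {
  if (finalizeHarvested) return false
  if (typeof state !== "object" || state === null) return false
  const tasks = (state as { tasks?: unknown }).tasks
  if (!Array.isArray(tasks)) return false
  return tasks.some((task: unknown) => {
    if (typeof task !== "object" || task === null) return false
    const interrupts = (task as TaskLike).interrupts
    return Array.isArray(interrupts) && interrupts.length > 0
  })
}
```

Taking `state` as `unknown` keeps the casts in one place and makes the malformed-snapshot cases (missing tasks, non-array interrupts) explicit rather than implied by optional chaining.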

Step 6: Run tests — PASS (summary written; draft opens in browser).

Step 7: Commit

git add packages/chat-runtime/src/run.ts packages/chat-runtime/src/run.test.ts
git commit -m "feat(chat-runtime): write thread summary on turn end; open drafts in the browser"

Task 9: Harden the error-path checkpoint config (Layer A5)

Files: Modify packages/chat-runtime/src/run.ts (catch + any getState).

Step 1: Audit — confirm BOTH graph.stream(...) (≈ line 777) and the completion graph.getState(...) (≈ line 944) pass configurable.thread_id: input.threadId (they do). There must be NO bare getState() in the error handling.

Step 2: Guard test — a desktop run whose graph throws still closes with a parseable fallback final and never reads state config-less:

it("a throwing desktop graph emits a fallback final", async () => {
  const res = await runDesktopChat({
    organizationId: "org_1", userId: "user_1", threadId: "thr_1", agentId: "desktop",
    request: "boom",
    getAgentImpl: makeThrowingDesktopGraph(), // stream() throws after start
    ...baseFakes,
  })
  const final = await readFinalChunk(res)
  expect(final.spoken.toLowerCase()).toContain("something went wrong")
})
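
The test above asserts only the fallback final. To also cover the "never reads state config-less" half of the guard, the throwing fake could record the config of every getState call. The sketch below shows one way to do that; `RunConfig`, `recordGetStateCalls`, and `allCallsScopedTo` are hypothetical helpers, and the config shape is limited to the configurable.thread_id field that Step 1 audits.

```typescript
// Hypothetical config shape: just the field the Step 1 audit checks.
type RunConfig = { configurable?: { thread_id?: string } }

// Wraps a getState function and records the config of every call.
function recordGetStateCalls<T>(getState: (config?: RunConfig) => Promise<T>) {
  const calls: (RunConfig | undefined)[] = []
  const wrapped = (config?: RunConfig): Promise<T> => {
    calls.push(config)
    return getState(config)
  }
  return { wrapped, calls }
}

// True only if every recorded call carried the expected thread_id.
// An empty call list passes, since no config-less read happened.
function allCallsScopedTo(
  calls: (RunConfig | undefined)[],
  threadId: string
): boolean {
  return calls.every((c) => c?.configurable?.thread_id === threadId)
}
```

With the fake's getState routed through `wrapped`, the guard test could end with `expect(allCallsScopedTo(calls, "thr_1")).toBe(true)` after reading the final chunk.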

Step 3: Run the test — PASS (existing 64f47bf fallback preserved; no NULL-thread_id state read).

Step 4: Commit

git add packages/chat-runtime/src/run.ts packages/chat-runtime/src/run.test.ts
git commit -m "fix(chat-runtime): desktop error path never reads checkpoint state without thread_id"

Task 10: Full verification + docs

Step 1: Typecheck + lint from root: bun run typecheck && bun run lint. Expected: PASS.

Step 2: Touched-package test suites (export surfaces changed → full suites per CLAUDE.md):

cd packages/agent-runtime && bun test
cd ../chat-runtime && bun test
cd ../../apps/clicky-gateway && bun test

Step 3: Apply the migration locally

bun run db:up
bun run db:migrate
bun --filter=@workspace/db provision

Step 4: Update CLAUDE.md — under "Agent runtime + skills", add desktop-router to the agent list (one line: routes voice turns to continue/resume/new).

Step 5: Commit docs: git add CLAUDE.md && git commit -m "docs: note the desktop-router agent in the runtime overview".

Step 6: Push: git push.

Notes for the executor