DESIGN.md · 18 June 2026, 15:13

Smart Proactive Glance Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox ([ ]) syntax for tracking.

Goal: Make the proactive glance read the real Outlook thread before proposing, so it surfaces the right action (or stays silent) instead of proposing a draft on every email — v1 = email path only.

Architecture: Two backend stages. Stage 1 (existing proactive-glance Haiku agent, recalibrated) gates on the screenshot and extracts search anchors (subject/sender/query) via a new finalize_glance_triage tool. Stage 2 (new glance-analyst Sonnet agent, un-checkpointed, read-only search_email+read_email_thread tools) uses those anchors to find and read the thread, then decides surface:false or a precise nudge via the unchanged finalize_glance. runProactiveGlance orchestrates the two stages, passing {configurable:{org,user}} (no thread_id), a low recursion limit, and a wall-clock timeout. Polaris and the /proactive-glance wire contract are unchanged.
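
The control flow described above can be sketched without any LangGraph machinery. This is a minimal illustration only: `runTwoStage`, `Triage`, `Nudge`, and the two stage function types are hypothetical names, and both stages are injected as plain async functions rather than the real agents.

```typescript
// Hypothetical shapes for illustration; the real types live in the packages named above.
interface Triage {
  digIn: boolean
  candidate?: { subject?: string; sender?: string; query?: string }
}
interface Nudge {
  surface: boolean
  message?: string
  actionPrompt?: string
}

type TriageStage = (screenshot: string) => Promise<Triage | null>
type AnalystStage = (query: string) => Promise<Nudge | null>

// Stage 1 gates; stage 2 runs only when stage 1 produced a usable query.
// Any failure in either stage collapses to silence.
async function runTwoStage(
  screenshot: string,
  triage: TriageStage,
  analyst: AnalystStage,
): Promise<Nudge> {
  try {
    const t = await triage(screenshot)
    const query = t?.candidate?.query?.trim()
    if (!t || !t.digIn || !query) return { surface: false }
    const n = await analyst(query)
    if (!n || !n.surface || !n.message) return { surface: false }
    return { surface: true, message: n.message, actionPrompt: n.actionPrompt ?? n.message }
  } catch {
    return { surface: false }
  }
}
```

The point of the shape is that the expensive stage is unreachable unless the cheap stage both says "dig in" and supplies a non-empty query.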

Tech Stack: TypeScript, LangGraph 1.3 (StateGraph, getAgent/registry), Zod, Vitest, Bun. Spec: docs/superpowers/specs/2026-06-18-smart-proactive-glance-design.md.

File Structure

New files:

Modified files:

Reused as-is (read, do NOT modify): tools/search-email.ts (needs org+user, thread_id optional, takes query), tools/read-email-thread.ts (same, takes conversation_id), agents/screen-companion/graph.ts (loop shape), agents/proactive-glance/graph.ts:79-93 (no-checkpointer compile), agents/desktop-router/route.ts:115-121 (Promise.race timeout), chat-runtime/src/threads.ts:132 (listThreadDigest), tools/finalize-glance.ts (unchanged stage-2 output).

Task 1: Add the analyst recursion limit constant

Files: Modify packages/agent-runtime/src/recursion.ts, packages/agent-runtime/src/index.ts.

[ ] Step 1: Add the constant. In recursion.ts, append after AGENT_RECURSION_LIMIT:

/**
 * Recursion budget for the glance-analyst (the proactive glance's stage-2
 * email reader). It needs only a handful of tool round-trips —
 * search_email → read_email_thread → finalize — so this is deliberately LOW
 * (unlike the desktop's 64). A small cap means a confused analyst that loops
 * fails fast to GraphRecursionError, which runProactiveGlance treats as
 * "surface:false" — silence beats a runaway Sonnet+Graph loop on the nudge path.
 */
export const GLANCE_ANALYST_RECURSION_LIMIT = 8
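
To make the fail-fast argument concrete, here is a dependency-free sketch. `StepLimitError`, `runLoop`, and `decide` are hypothetical stand-ins for LangGraph's GraphRecursionError, the graph loop, and the orchestrator's catch; only the value 8 comes from the plan.

```typescript
const GLANCE_ANALYST_RECURSION_LIMIT = 8

// Hypothetical stand-in for LangGraph's GraphRecursionError.
class StepLimitError extends Error {}

// Runs `step` until it reports done, throwing once the budget is spent.
// Returns the number of steps actually used.
function runLoop(step: (n: number) => boolean, limit: number): number {
  for (let n = 1; n <= limit; n++) {
    if (step(n)) return n
  }
  throw new StepLimitError(`step limit of ${limit} reached`)
}

// Caller-side policy: any failure, including a blown budget, means silence.
function decide(step: (n: number) => boolean): { surface: boolean } {
  try {
    runLoop(step, GLANCE_ANALYST_RECURSION_LIMIT)
    return { surface: true }
  } catch {
    return { surface: false }
  }
}
```

Assuming each node execution counts as one step, the happy path (search, read, finalize) takes about six steps: three model calls, two tool-node runs, and the finalize node. That leaves room for roughly one extra tool round-trip before the cap trips.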

[ ] Step 2: Export it. In index.ts, extend the existing recursion export:

export { AGENT_RECURSION_LIMIT, GLANCE_ANALYST_RECURSION_LIMIT } from "./recursion"

[ ] Step 3: Typecheck. Run cd packages/agent-runtime && bun run typecheck → PASS.

[ ] Step 4: Commit.

git add packages/agent-runtime/src/recursion.ts packages/agent-runtime/src/index.ts
git commit -m "feat(glance): add GLANCE_ANALYST_RECURSION_LIMIT constant"

Task 2: The finalize_glance_triage tool + schema

Files: Create packages/agent-runtime/src/tools/finalize-glance-triage.ts + .test.ts; Modify src/index.ts.

[ ] Step 1: Write the failing test (finalize-glance-triage.test.ts):

import { AIMessage } from "@langchain/core/messages"
import { describe, expect, it } from "vitest"
import {
  finalizeGlanceTriageArgsSchema,
  harvestFinalizeGlanceTriage,
} from "./finalize-glance-triage"

describe("finalize_glance_triage", () => {
  it("accepts digIn with a candidate", () => {
    const parsed = finalizeGlanceTriageArgsSchema.safeParse({
      digIn: true,
      candidate: { subject: "RE: SNDA", sender: "Marcy Howard", query: "SNDA Marcy" },
    })
    expect(parsed.success).toBe(true)
  })
  it("accepts digIn:false with nothing else", () => {
    expect(finalizeGlanceTriageArgsSchema.safeParse({ digIn: false }).success).toBe(true)
  })
  it("harvests the triage args from the last AI message's tool call", () => {
    const messages = [
      new AIMessage({
        content: "",
        tool_calls: [
          { type: "tool_call", name: "finalize_glance_triage", id: "t1",
            args: { digIn: true, candidate: { query: "Brooklyn Gardens" } } },
        ],
      }),
    ]
    const out = harvestFinalizeGlanceTriage(messages)
    expect(out?.digIn).toBe(true)
    expect(out?.candidate?.query).toBe("Brooklyn Gardens")
  })
  it("returns null when no triage tool call is present", () => {
    expect(harvestFinalizeGlanceTriage([new AIMessage({ content: "hi" })])).toBeNull()
  })
})

[ ] Step 2: Run to verify it fails. cd packages/agent-runtime && bun run test -- src/tools/finalize-glance-triage.test.ts → FAIL (cannot resolve module).

[ ] Step 3: Write the tool (finalize-glance-triage.ts):

import { AIMessage, type BaseMessage } from "@langchain/core/messages"
import { tool } from "@langchain/core/tools"
import { z } from "zod"

export const FINALIZE_GLANCE_TRIAGE_TOOL_NAME = "finalize_glance_triage"

/**
 * Stage-1 (triage) output. The triage looks at ONE screenshot and decides
 * whether it is an inbound email worth a deeper look, and if so extracts the
 * search anchors the stage-2 analyst needs to find the thread. It does NOT
 * compose the final nudge — that is the analyst's job (finalize_glance).
 */
export const finalizeGlanceTriageArgsSchema = z.object({
  digIn: z.boolean(),
  reason: z.string().optional(),
  candidate: z
    .object({
      subject: z.string().optional(),
      sender: z.string().optional(),
      query: z.string().optional(),
    })
    .optional(),
})

export type FinalizeGlanceTriageArgs = z.infer<typeof finalizeGlanceTriageArgsSchema>

export const finalizeGlanceTriageTool = tool(
  async (args): Promise<string> => (args.digIn ? "Dig in." : "Nothing to dig into."),
  {
    name: FINALIZE_GLANCE_TRIAGE_TOOL_NAME,
    description:
      "Call EXACTLY ONCE as your only action. `digIn`: true only when the " +
      "screen shows an inbound EMAIL that plausibly needs the lawyer's " +
      "attention. When digIn=true, fill `candidate` with what you can read off " +
      "the screen: the email `subject`, the `sender`, and a 3-5 word `query` " +
      "for a mailbox search. If you cannot read a confident subject/sender, " +
      "set digIn=false — we stay silent rather than search blind.",
    schema: finalizeGlanceTriageArgsSchema,
  },
)

export function harvestFinalizeGlanceTriage(
  messages: BaseMessage[],
): FinalizeGlanceTriageArgs | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i]
    if (message == null || !AIMessage.isInstance(message)) continue
    const call = message.tool_calls?.find((c) => c.name === FINALIZE_GLANCE_TRIAGE_TOOL_NAME)
    if (!call) continue
    const parsed = finalizeGlanceTriageArgsSchema.safeParse(call.args)
    return parsed.success ? parsed.data : null
  }
  return null
}
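
The newest-first scan in the harvest function has one subtle property: the most recent triage call decides, even if its payload is invalid. A dependency-free sketch of that behavior follows. `Msg`, `ToolCall`, `TriageArgs`, `isTriageArgs`, and `harvestTriage` are hypothetical names; the real code uses LangChain's AIMessage and the Zod schema.

```typescript
// Hypothetical minimal message shape; the real code uses LangChain's AIMessage.
interface ToolCall { name: string; args: unknown }
interface Msg { role: "ai" | "human" | "tool"; toolCalls?: ToolCall[] }

interface TriageArgs { digIn: boolean; candidate?: { query?: string } }

// Plain type guard standing in for finalizeGlanceTriageArgsSchema.safeParse.
function isTriageArgs(v: unknown): v is TriageArgs {
  if (typeof v !== "object" || v === null) return false
  return typeof (v as { digIn?: unknown }).digIn === "boolean"
}

// Walk newest-first. The first AI message carrying the triage call decides,
// and an invalid payload yields null instead of falling back to older calls.
function harvestTriage(messages: Msg[]): TriageArgs | null {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i]
    if (m === undefined || m.role !== "ai") continue
    const call = m.toolCalls?.find((c) => c.name === "finalize_glance_triage")
    if (!call) continue
    return isTriageArgs(call.args) ? call.args : null
  }
  return null
}
```

Returning null on a malformed newest call, rather than reaching back to an older valid one, keeps the orchestrator on the silent path whenever the model's last word was unusable.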

[ ] Step 4: Run to verify it passes → PASS (4 tests).

[ ] Step 5: Export. In src/index.ts, after the existing finalize-glance export block:

export {
  finalizeGlanceTriageTool,
  harvestFinalizeGlanceTriage,
  finalizeGlanceTriageArgsSchema,
  FINALIZE_GLANCE_TRIAGE_TOOL_NAME,
  type FinalizeGlanceTriageArgs,
} from "./tools/finalize-glance-triage"

[ ] Step 6: Typecheck + commit.

git add packages/agent-runtime/src/tools/finalize-glance-triage.ts packages/agent-runtime/src/tools/finalize-glance-triage.test.ts packages/agent-runtime/src/index.ts
git commit -m "feat(glance): finalize_glance_triage tool — stage-1 dig/silent + search anchors"

Task 3: Recalibrate the stage-1 triage (prompt + graph)

Files: Modify agents/proactive-glance/prompt.ts, graph.ts, graph.test.ts.

[ ] Step 1: Rewrite the triage prompt. Replace the entire prompt.ts contents:

export const PROACTIVE_GLANCE_PROMPT = `You are a fast triage step glancing — UNINVITED — at ONE screenshot of a lawyer's screen the moment they switched into an app. You do NOT decide what to do; a deeper analyst does that next. Your ONE job: decide whether the screen is an inbound EMAIL worth reading the thread for, and if so, extract what the analyst needs to find it.

Call finalize_glance_triage EXACTLY ONCE.

SET digIn:true ONLY when the screen clearly shows an inbound email (an open message or a focused message in a mail client) that plausibly needs the lawyer's attention. When digIn:true, fill candidate with what you can READ off the screen:
- subject: the email's subject line (the area at the top of the open message).
- sender: who the email is FROM (the From: line — NEVER a name from a "Hi ," salutation, which addresses the recipient/lawyer).
- query: 3-5 keywords from the subject + sender for a mailbox search (e.g. "Marcy Howard SNDA Brooklyn Gardens").

SET digIn:false for everything else — settings, a blank document, a document/PDF/contract being viewed (v1 handles email only), a terminal or IDE, a dashboard, personal browsing, an obvious newsletter/FYI, OR a screen that is already a conversation / chat / AI-assistant UI (the lawyer is already in a conversation there; North runs in a browser tab, so judge from what is ON SCREEN, not the app name).

If you cannot confidently read the subject AND sender of the email, set digIn:false — we stay SILENT rather than search the mailbox blind. A false "silent" is fine; a wrong search is worse.

Always answer in English. Respond ONLY by calling finalize_glance_triage.`

[ ] Step 2: Bind the new tool in the graph. In graph.ts, change the import from finalizeGlanceTool from "../../tools/finalize-glance" to finalizeGlanceTriageTool from "../../tools/finalize-glance-triage", and change the default tools line from deps.tools ?? [finalizeGlanceTool] to deps.tools ?? [finalizeGlanceTriageTool]. No other change (the one-shot structure and the no-checkpointer compile stay).

[ ] Step 3: Update the triage graph test. Wherever the fake model emits a finalize_glance call with {surface, message, actionPrompt}, replace with a finalize_glance_triage call and assert on harvested triage args:

// fake model emits:
tool_calls: [
  { type: "tool_call", name: "finalize_glance_triage", id: "t1",
    args: { digIn: true, candidate: { subject: "RE: SNDA", sender: "Marcy", query: "SNDA Marcy" } } },
]
// assertion (import harvestFinalizeGlanceTriage):
const out = harvestFinalizeGlanceTriage(result.messages)
expect(out?.digIn).toBe(true)
expect(out?.candidate?.query).toBe("SNDA Marcy")
// "stays silent" test emits { digIn: false } and asserts out?.digIn === false.

[ ] Step 4: Run the triage tests → PASS.

[ ] Step 5: Typecheck + commit.

git add packages/agent-runtime/src/agents/proactive-glance/prompt.ts packages/agent-runtime/src/agents/proactive-glance/graph.ts packages/agent-runtime/src/agents/proactive-glance/graph.test.ts
git commit -m "feat(glance): recalibrate stage-1 triage — dig/silent + email anchor extraction"

Task 4: The glance-analyst agent — prompt + graph

Files: Create agents/glance-analyst/prompt.ts, graph.ts, graph.test.ts.

[ ] Step 1: Write the analyst prompt (prompt.ts):

export const GLANCE_ANALYST_PROMPT = `You are North's proactive analyst. The lawyer did NOT ask you anything — a fast triage step noticed an inbound email on their screen and handed you its details. Your job: READ the real email thread, judge whether the lawyer actually needs to act, and either propose the RIGHT next action or stay silent.

You are given (in the human message): the email candidate (subject, sender, a search query) read off the screen, and a digest of the lawyer's recent North conversations.

HOW TO WORK
- First call search_email with the candidate's query to find the thread. Then call read_email_thread with the matching conversation_id to read the FULL thread.
- CONFIDENT-MATCH GATE (mandatory): only proceed if search_email returns a SINGLE thread that clearly matches the candidate's subject AND sender. If it returns zero hits, several ambiguous hits, or nothing matching the sender, call finalize_glance with surface:false — proposing a reply to the wrong email is worse than staying silent.
- Read the thread to judge: does it actually ask the lawyer something / await their reply? Or is it a FYI, an acknowledgement, a newsletter, or something the lawyer ALREADY replied to (check whether the latest message is from the lawyer)? Use the conversation digest to see if the lawyer already has a North thread handling this.

DECIDE — call finalize_glance EXACTLY ONCE:
- surface:false when there is no useful action: informational, already answered, ambiguous to identify, or the thread could not be read. This is a correct, common outcome.
- surface:true ONLY when the thread genuinely needs the lawyer's action. message: ONE short English sentence (<= ~120 chars) naming the real situation and the offer ("Marcy's waiting on your read of SNDA §4.2(b) — want me to draft the reply?"). actionPrompt: a first-person instruction the lawyer could have spoken, carrying the specifics you learned ("Draft a reply to Marcy Howard's email about the SNDA §4.2(b) cure-period language, confirming my review covers it").

Cite only what the thread actually says — never invent. Always answer in English. Respond ONLY by calling finalize_glance.`
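
In the plan the confident-match gate is a judgment the model makes from the prompt, not code. Purely to pin down one reading of the rule, here is a sketch that treats "confident" as exactly one hit matching both subject and sender. `Hit`, `norm`, and `confidentMatch` are hypothetical, and the structured hit shape is an assumption: the real search_email tool returns text.

```typescript
// Hypothetical structured hit; the real search_email tool returns text.
interface Hit { conversationId: string; subject: string; sender: string }

// Normalize a subject: trim, lowercase, drop any leading RE:/FW:/FWD: prefixes.
const norm = (s: string): string =>
  s.trim().toLowerCase().replace(/^((re|fw|fwd):\s*)+/, "").trim()

// Returns the single hit matching BOTH subject and sender, else null.
// Zero matches, several matches, or a blank anchor all mean "stay silent".
function confidentMatch(hits: Hit[], subject: string, sender: string): Hit | null {
  const wantSubject = norm(subject)
  const wantSender = sender.trim().toLowerCase()
  if (wantSubject === "" || wantSender === "") return null
  const matches = hits.filter(
    (h) => norm(h.subject) === wantSubject && h.sender.toLowerCase().includes(wantSender),
  )
  const only = matches[0]
  return matches.length === 1 && only !== undefined ? only : null
}
```

A stricter reading would also reject the case where one hit matches and other unrelated hits come back alongside it; the prompt wording leaves that to the analyst.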

[ ] Step 2: Write the analyst graph test (failing) (graph.test.ts) — full body below. A stateful scripted model returns a different tool call per invoke() (search → read → finalize); fake search_email/read_email_thread tools return canned strings. Case (a): a confident hit → surface:true. Case (b): "0 threads" → surface:false. Case (c): org/user reach the tools, no thread_id.

DB-import isolation (dual review): ./graph statically imports the REAL searchEmailTool/readEmailThreadTool, which transitively import @workspace/db (via tools/email-access.ts) — and @workspace/db THROWS at module load without DATABASE_URL. So the test must MOCK those tool modules before importing ./graph, exactly as search-email.test.ts/read-email-thread.test.ts do. Put these at the top of graph.test.ts:

import { vi } from "vitest"
vi.mock("../../tools/search-email", () => ({ searchEmailTool: { name: "search_email" } }))
vi.mock("../../tools/read-email-thread", () => ({ readEmailThreadTool: { name: "read_email_thread" } }))
// (The graph's DEFAULT tools are never used in these tests — every test passes
//  its own fake `tools` array — but the static import must not load @workspace/db.)
import { AIMessage, HumanMessage } from "@langchain/core/messages"
import { tool } from "@langchain/core/tools"
import { describe, expect, it } from "vitest"
import { z } from "zod"
import { buildGlanceAnalystGraph } from "./graph"
import { finalizeGlanceTool } from "../../tools/finalize-glance"

// Fake search_email + read_email_thread (keyed by the real tool names the graph dispatches on).
function fakeEmailTools(opts: { searchResult: string; threadResult: string }) {
  const searchEmail = tool(async () => opts.searchResult, {
    name: "search_email", description: "fake",
    schema: z.object({ query: z.string() }),
  })
  const readEmailThread = tool(async () => opts.threadResult, {
    name: "read_email_thread", description: "fake",
    schema: z.object({ conversation_id: z.string() }),
  })
  return [searchEmail, readEmailThread, finalizeGlanceTool]
}

// Scripted model: turn 1 searches, turn 2 reads, turn 3 finalizes with finalArgs.
function scriptedModel(finalArgs: Record<string, unknown>) {
  let turn = 0
  return {
    bindTools() { return this },
    async invoke() {
      turn += 1
      if (turn === 1) {
        return new AIMessage({ content: "", tool_calls: [
          { type: "tool_call", name: "search_email", id: "s", args: { query: "SNDA Marcy" } },
        ] })
      }
      if (turn === 2) {
        return new AIMessage({ content: "", tool_calls: [
          { type: "tool_call", name: "read_email_thread", id: "r", args: { conversation_id: "c1" } },
        ] })
      }
      return new AIMessage({ content: "", tool_calls: [
        { type: "tool_call", name: "finalize_glance", id: "f", args: finalArgs },
      ] })
    },
  }
}

function finalizeArgs(out: { messages: unknown[] }) {
  const msg = out.messages.find(
    (m) => AIMessage.isInstance(m) && m.tool_calls?.some((c) => c.name === "finalize_glance"),
  ) as AIMessage | undefined
  return msg?.tool_calls?.find((c) => c.name === "finalize_glance")?.args
}

describe("glance-analyst graph", () => {
  it("searches, reads the thread, then finalizes with a surfaced action", async () => {
    const graph = buildGlanceAnalystGraph({
      model: scriptedModel({
        surface: true,
        message: "Marcy needs your SNDA read — draft the reply?",
        actionPrompt: "Draft a reply to Marcy about the SNDA cure-period language",
      }) as never,
      tools: fakeEmailTools({
        searchResult: "1 thread: conversation_id: c1 | RE: SNDA from Marcy",
        threadResult: "Marcy: can you confirm your review covers §4.2(b)?",
      }),
    })
    const out = await graph.invoke({ messages: [new HumanMessage("candidate: SNDA from Marcy. digest: (none)")] })
    const args = finalizeArgs(out)
    expect(args?.surface).toBe(true)
    expect(String(args?.message)).toContain("Marcy")
  })

  it("finalizes surface:false when the search is ambiguous / empty", async () => {
    const graph = buildGlanceAnalystGraph({
      model: scriptedModel({ surface: false }) as never,
      tools: fakeEmailTools({ searchResult: "0 threads matched", threadResult: "(unused)" }),
    })
    const out = await graph.invoke({ messages: [new HumanMessage("candidate: ... digest: (none)")] })
    expect(finalizeArgs(out)?.surface).toBe(false)
  })

  it("threads org/user (no thread_id) into the email tools' config", async () => {
    let captured: unknown
    const capturingSearch = tool(
      async (_args, config) => { captured = config; return "0 threads matched" },
      { name: "search_email", description: "fake", schema: z.object({ query: z.string() }) },
    )
    const readEmailThread = tool(async () => "(unused)", {
      name: "read_email_thread", description: "fake", schema: z.object({ conversation_id: z.string() }),
    })
    const graph = buildGlanceAnalystGraph({
      model: scriptedModel({ surface: false }) as never,
      tools: [capturingSearch, readEmailThread, finalizeGlanceTool],
    })
    await graph.invoke(
      { messages: [new HumanMessage("candidate")] },
      { configurable: { organization_id: "org_1", user_id: "user_1" } },
    )
    const cfg = captured as { configurable?: Record<string, unknown> }
    expect(cfg.configurable?.organization_id).toBe("org_1")
    expect(cfg.configurable?.user_id).toBe("user_1")
    expect(cfg.configurable?.thread_id).toBeUndefined()
  })
})

[ ] Step 3: Run to verify it fails → FAIL (cannot resolve ./graph).

[ ] Step 4: Write the analyst graph (graph.ts) — mirrors screen-companion/graph.ts's llmCall ↔ toolNode → finalize loop, terminates on finalize_glance, IGNORES the checkpointer (thread-less, like proactive-glance), default tools [searchEmailTool, readEmailThreadTool, finalizeGlanceTool], default model Sonnet:

import type { BaseChatModel } from "@langchain/core/language_models/chat_models"
import { AIMessage, SystemMessage, ToolMessage } from "@langchain/core/messages"
import type { StructuredToolInterface } from "@langchain/core/tools"
import type { BaseCheckpointSaver } from "@langchain/langgraph"
import { END, START, StateGraph, type ConditionalEdgeRouter, type GraphNode } from "@langchain/langgraph"

import { createModel } from "../../model"
import { AgentState } from "../../state"
import { searchEmailTool } from "../../tools/search-email"
import { readEmailThreadTool } from "../../tools/read-email-thread"
import { finalizeGlanceTool, FINALIZE_GLANCE_TOOL_NAME } from "../../tools/finalize-glance"
import { GLANCE_ANALYST_PROMPT } from "./prompt"

export interface GlanceAnalystDeps {
  /** Accepted by getAgent but DELIBERATELY IGNORED — un-checkpointed (thread-less,
   * like proactive-glance; a saver write needs a thread_id we don't have). */
  checkpointer?: BaseCheckpointSaver
  model?: BaseChatModel
  /** Read-only email toolset + finalize_glance. NEVER a write tool. */
  tools?: StructuredToolInterface[]
}

export function buildGlanceAnalystGraph(deps: GlanceAnalystDeps = {}) {
  const model = deps.model ?? createModel("claude-sonnet-4-6")
  const tools = deps.tools ?? [searchEmailTool, readEmailThreadTool, finalizeGlanceTool]
  const byName: Record<string, StructuredToolInterface> = Object.fromEntries(
    tools.map((t) => [t.name, t]),
  )

  const llmCall: GraphNode<typeof AgentState> = async (state) => {
    const modelWithTools = model.bindTools?.(tools) ?? model
    const response = await modelWithTools.invoke([
      new SystemMessage(GLANCE_ANALYST_PROMPT),
      ...state.messages,
    ])
    return { messages: [response] }
  }

  // CRITICAL (dual review): pass the node `config` (the RunnableConfig with
  // configurable.organization_id/user_id) into each tool.invoke — otherwise
  // search_email/read_email_thread read undefined org/user and return "no
  // organization in context", and the analyst reads only error strings.
  const runToolCalls = async (
    state: typeof AgentState.State,
    config: unknown,
  ): Promise<ToolMessage[]> => {
    const last = state.messages.at(-1)
    if (last == null || !AIMessage.isInstance(last)) return []
    const results: ToolMessage[] = []
    for (const call of last.tool_calls ?? []) {
      const t = byName[call.name]
      if (t == null) {
        results.push(new ToolMessage({
          content: `Error: unknown tool "${call.name}".`,
          tool_call_id: call.id ?? "", name: call.name, status: "error",
        }))
        continue
      }
      results.push(await t.invoke(call, config as never))
    }
    return results
  }

  const toolNode: GraphNode<typeof AgentState> = async (state, config) => ({ messages: await runToolCalls(state, config) })
  const finalizeNode: GraphNode<typeof AgentState> = async (state, config) => ({ messages: await runToolCalls(state, config) })

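  // Type arguments follow screen-companion/graph.ts; verify against the real type.
  const route: ConditionalEdgeRouter<typeof AgentState, Record<string, unknown>, "toolNode" | "finalizeNode"> = (state) => {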
    const last = state.messages.at(-1)
    if (last == null || !AIMessage.isInstance(last)) return END
    const calls = last.tool_calls ?? []
    if (calls.length === 0) return END
    const isFinalize = calls.length === 1 && calls[0]?.name === FINALIZE_GLANCE_TOOL_NAME
    return isFinalize ? "finalizeNode" : "toolNode"
  }

  return new StateGraph(AgentState)
    .addNode("llmCall", llmCall)
    .addNode("toolNode", toolNode)
    .addNode("finalizeNode", finalizeNode)
    .addEdge(START, "llmCall")
    .addConditionalEdges("llmCall", route, ["toolNode", "finalizeNode", END])
    .addEdge("toolNode", "llmCall")
    .addEdge("finalizeNode", END)
    .compile()
}
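
The routing rule is easiest to check in isolation. The sketch below restates it as a pure function over tool-call names; `routeFor` and the `Route` type are hypothetical, and "end" stands in for LangGraph's END.

```typescript
const FINALIZE = "finalize_glance"
type Route = "toolNode" | "finalizeNode" | "end"

// Same rule as `route` in the graph: only a LONE finalize_glance call
// terminates. Anything else with tool calls goes back through toolNode.
function routeFor(callNames: string[]): Route {
  if (callNames.length === 0) return "end"
  const isFinalize = callNames.length === 1 && callNames[0] === FINALIZE
  return isFinalize ? "finalizeNode" : "toolNode"
}
```

One consequence worth knowing: if the model emits finalize_glance together with another tool call in the same turn, the batch routes to toolNode, every call in it runs, and control returns to the model. The graph does not end on that turn, so a mixed batch costs extra steps against the recursion budget.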

[ ] Step 5: Run to verify it passes → PASS (3 tests).

[ ] Step 6: Typecheck + commit.

git add packages/agent-runtime/src/agents/glance-analyst/prompt.ts packages/agent-runtime/src/agents/glance-analyst/graph.ts packages/agent-runtime/src/agents/glance-analyst/graph.test.ts
git commit -m "feat(glance): glance-analyst graph — read the thread, finalize surface true/false"

Task 5: Register the glance-analyst agent

Files: Create agents/glance-analyst/index.ts + index.test.ts; Modify runtime.ts, src/index.ts.

[ ] Step 1: Write the agent index (model on agents/proactive-glance/index.ts):

import type { AgentDefinition } from "../../registry"
import { registerAgent } from "../../registry"
import { buildGlanceAnalystGraph } from "./graph"

export const glanceAnalyst: AgentDefinition = {
  id: "glance-analyst",
  name: "Glance analyst",
  description:
    "Stage-2 proactive analyst — reads the real email thread and proposes the right action (or stays silent).",
  // Model selected by getAgent via opts.modelId (Sonnet). Toolset is the graph's
  // default read-only email set — NEVER defaultTools (it has write tools).
  // Checkpointer intentionally NOT passed — the graph compiles un-checkpointed.
  buildGraph: (deps) => buildGlanceAnalystGraph({ model: deps.model }),
}
registerAgent(glanceAnalyst)

export { buildGlanceAnalystGraph, type GlanceAnalystDeps } from "./graph"
export { GLANCE_ANALYST_PROMPT } from "./prompt"

[ ] Step 2: Side-effect import. In runtime.ts (or wherever import "./agents/proactive-glance" lives — grep for it), add import "./agents/glance-analyst".

[ ] Step 3: Export from the package index. In src/index.ts, after the buildProactiveGlanceGraph export, add export { buildGlanceAnalystGraph, type GlanceAnalystDeps } from "./agents/glance-analyst".

[ ] Step 4: Write the registration test (index.test.ts). NOTE (dual review): (1) getAgentDefinition throws on an unknown id (it never returns null), so do NOT use optional chaining; listAgents() returns AgentDefinition[] (not id strings), so map to ids. (2) import "./index" registers the agent, which imports ./graph → the real searchEmailTool/readEmailThreadTool → @workspace/db (throws when the env is unset), so MOCK those tool modules first, same as graph.test.ts. Match the existing registry.test.ts idiom:

import { describe, expect, it, vi } from "vitest"
vi.mock("../../tools/search-email", () => ({ searchEmailTool: { name: "search_email" } }))
vi.mock("../../tools/read-email-thread", () => ({ readEmailThreadTool: { name: "read_email_thread" } }))
import { listAgents, getAgentDefinition } from "../../registry"
import "./index"

describe("glance-analyst registration", () => {
  it("registers under id glance-analyst", () => {
    expect(listAgents().map((a) => a.id)).toContain("glance-analyst")
    expect(getAgentDefinition("glance-analyst").id).toBe("glance-analyst")
  })
})

[ ] Step 5: Run the registration test → PASS.

[ ] Step 6: Full agent-runtime suite + typecheck (export surface changed): cd packages/agent-runtime && bun run test && bun run typecheck → PASS.

[ ] Step 7: Commit.

git add packages/agent-runtime/src/agents/glance-analyst/index.ts packages/agent-runtime/src/agents/glance-analyst/index.test.ts packages/agent-runtime/src/runtime.ts packages/agent-runtime/src/index.ts
git commit -m "feat(glance): register glance-analyst agent (Sonnet, read-only email tools)"

Task 6: Orchestrate the two stages in runProactiveGlance

Files: Modify packages/chat-runtime/src/glance.ts + glance.test.ts.

[ ] Step 1: Read the current orchestrator + its test. The rewrite keeps the same RunProactiveGlanceInput/ProactiveGlanceResult types (already has organizationId/userId) and reuses the getAgentImpl seam; it adds a listThreadDigestImpl seam.

[ ] Step 2a: Fix the test mocks FIRST (BLOCKER — dual review). The rewritten glance.ts statically imports listThreadDigest from ./threads, which pulls in @workspace/db — and packages/db/src/index.ts throws at module load when DATABASE_URL is unset. So importing runProactiveGlance in the test crashes at import time (NOT the intended TDD red) unless the test mocks ./threads AND the agent-runtime mock exposes the new exports the rewrite imports. The existing glance.test.ts only does vi.mock("@workspace/agent-runtime", …) with no ./threads mock. Add/extend both (same pattern run.test.ts and interrupt.test.ts already use):

// Extend the existing vi.mock("@workspace/agent-runtime", () => ({ ... })) to ALSO export:
  harvestFinalizeGlanceTriage: vi.fn(() => null), // overridden per-test where needed
  GLANCE_ANALYST_RECURSION_LIMIT: 8,
// and ADD a threads mock so the static import of ./threads never loads @workspace/db:
vi.mock("./threads", () => ({
  listThreadDigest: vi.fn(async () => []),
}))

For the tests that drive the triage decision, override harvestFinalizeGlanceTriage per-test (e.g. vi.mocked(harvestFinalizeGlanceTriage).mockReturnValue({ digIn: true, candidate: { query: "SNDA Marcy" } })) — OR, simpler, have the fake triage graph's messages carry a real finalize_glance_triage tool call and let the REAL harvest run (import the real harvest in the mock via vi.importActual if you prefer). Pick one and be consistent; the four cases below assume the harvest returns the triage you set.

[ ] Step 2b: Write the failing orchestration tests — four cases: (a) triage digIn:false → analyst NOT invoked, surface:false; (b) triage digIn:true+query → analyst invoked with {configurable:{organization_id, user_id}} and no thread_id and a recursionLimit > 0 (assert the captured config), returns its decision; (c) triage digIn:true but empty query → surface:false (no blind search); (d) analyst throws → surface:false. Use fake triage/analyst graphs via getAgentImpl; the analyst fake records the config it was invoked with.
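
For case (b), the analyst fake only needs to remember what it was called with. A minimal sketch of such a recording fake follows, free of Vitest and LangChain so the idea is visible on its own. `InvokeConfig`, `FakeGraph`, and `recordingAnalyst` are hypothetical names, and the config shape is a simplified assumption about RunnableConfig.

```typescript
// Hypothetical minimal shapes; real graphs take LangChain messages and a RunnableConfig.
interface InvokeConfig {
  configurable?: Record<string, unknown>
  recursionLimit?: number
}
interface FakeGraph {
  invoke(input: { messages: unknown[] }, config?: InvokeConfig): Promise<{ messages: unknown[] }>
}

// Builds a fake analyst that records every config it is invoked with
// and always answers with the canned `reply` messages.
function recordingAnalyst(reply: unknown[]): { graph: FakeGraph; configs: InvokeConfig[] } {
  const configs: InvokeConfig[] = []
  const graph: FakeGraph = {
    async invoke(_input, config) {
      configs.push(config ?? {})
      return { messages: reply }
    },
  }
  return { graph, configs }
}
```

In the real test, a getAgentImpl fake would return `graph` for "glance-analyst", and the assertions would read `configs[0]` after runProactiveGlance resolves. For case (a), asserting `configs.length === 0` shows the analyst never ran.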

[ ] Step 3: Run to verify they fail → FAIL for the BEHAVIORAL reason (no 2-stage logic), NOT an import crash. If you see "@workspace/db … is not set" or "harvestFinalizeGlanceTriage is not a function", Step 2a is incomplete — fix the mocks before proceeding.

[ ] Step 4: Rewrite runProactiveGlance as the 2-stage orchestrator:

import {
  buildHumanMessageFromTrigger, getAgent, harvestFinalizeGlance,
  harvestFinalizeGlanceTriage, GLANCE_ANALYST_RECURSION_LIMIT,
} from "@workspace/agent-runtime"
import { listThreadDigest } from "./threads"

export interface RunProactiveGlanceInput {
  organizationId: string
  userId: string
  screenshot: string
  foregroundApp: string
  getAgentImpl?: typeof getAgent
  listThreadDigestImpl?: typeof listThreadDigest
}
export interface ProactiveGlanceResult {
  surface: boolean
  message?: string
  actionPrompt?: string
}

const ANALYST_TIMEOUT_MS = 10_000
function rejectAfter(ms: number): Promise<never> {
  return new Promise((_r, reject) => { setTimeout(() => reject(new Error("glance_analyst_timeout")), ms) })
}

export async function runProactiveGlance(input: RunProactiveGlanceInput): Promise<ProactiveGlanceResult> {
  const getAgentFn = input.getAgentImpl ?? getAgent
  const listDigestFn = input.listThreadDigestImpl ?? listThreadDigest

  // STAGE 1: triage (screenshot only, no tools)
  let triage
  try {
    const triageGraph = await getAgentFn("proactive-glance")
    const triageMessage = buildHumanMessageFromTrigger({
      request: `The lawyer just switched into "${input.foregroundApp}". Here is their screen — is it an inbound email worth reading the thread for? If so, extract its subject, sender, and a search query.`,
      screenshot: input.screenshot,
    })
    const triageOut = await triageGraph.invoke({ messages: [triageMessage] })
    triage = harvestFinalizeGlanceTriage(triageOut.messages)
  } catch { return { surface: false } }

  const query = triage?.candidate?.query?.trim()
  if (!triage?.digIn || !query) return { surface: false }

  // STAGE 2: analyst (search + read the real thread)
  try {
    const digest = await listDigestFn(input.organizationId, input.userId, 10)
    const digestLines = digest.map((d) => `- ${d.title} | ${d.summary ?? "(no summary)"}`).join("\n")
    const analystGraph = await getAgentFn("glance-analyst", { modelId: "claude-sonnet-4-6" })
    const analystMessage = buildHumanMessageFromTrigger({
      request:
        `Email candidate from the screen: subject="${triage.candidate?.subject ?? ""}", ` +
        `sender="${triage.candidate?.sender ?? ""}", search query="${query}".\n\n` +
        `Recent North conversations (most recent first):\n${digestLines || "(none)"}\n\n` +
        `Search for this thread, read it, and decide whether the lawyer needs to act.`,
      screenshot: input.screenshot,
    })
    const analystOut = await Promise.race([
      analystGraph.invoke({ messages: [analystMessage] }, {
        configurable: { organization_id: input.organizationId, user_id: input.userId },
        recursionLimit: GLANCE_ANALYST_RECURSION_LIMIT,
      }),
      rejectAfter(ANALYST_TIMEOUT_MS),
    ])
    const glance = harvestFinalizeGlance(analystOut.messages)
    if (!glance || !glance.surface || !glance.message) return { surface: false }
    return { surface: true, message: glance.message, actionPrompt: glance.actionPrompt ?? glance.message }
  } catch { return { surface: false } }
}

Note: if the real getAgent does not take an opts arg, drop { modelId } — the registered analyst default is already Sonnet (Task 5). Verify against the real type.
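
One detail of the Promise.race in Step 4: when the analyst wins the race, the rejectAfter timer stays pending for up to ANALYST_TIMEOUT_MS. That is harmless for correctness but can keep a test process or worker alive longer than needed. A possible variant that always clears its timer is sketched below; `withTimeout` is a hypothetical helper, not something the plan or the codebase defines.

```typescript
// Hypothetical helper: races `work` against a timer and always clears the timer,
// whichever side settles first.
async function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(label)), ms)
  })
  try {
    return await Promise.race([work, timeout])
  } finally {
    if (timer !== undefined) clearTimeout(timer)
  }
}
```

Usage would look like `await withTimeout(analystGraph.invoke(...), ANALYST_TIMEOUT_MS, "glance_analyst_timeout")`. Neither version cancels the analyst's in-flight work: losing the race only stops the orchestrator from waiting on it.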

[ ] Step 5: Run the orchestration tests + full chat-runtime suite → PASS.

[ ] Step 6: Typecheck + commit.

git add packages/chat-runtime/src/glance.ts packages/chat-runtime/src/glance.test.ts
git commit -m "feat(glance): runProactiveGlance orchestrates triage → analyst (email v1)"

Task 7: Keep the gateway decision log accurate

Files: Modify apps/clicky-gateway/src/routes/router.ts (only if needed).

[ ] Step 1: Find the proactive-glance decision: surface=... log in the /proactive-glance handler. It logs the final runProactiveGlance result, whose shape is unchanged — so it stays correct. Per-stage attribution lives inside runProactiveGlance; do not leak internal shape here. If the log already prints surface + a truncated message, leave it as-is (no change, no commit). Per-stage observability is a follow-up.

Task 8: Full verification + push + deploy

[ ] Step 1: Root typecheck + lint. bun run typecheck (51/51; ignore only the pre-existing @workspace/simulator document-parsing error if it appears) and bun run lint (0 errors).

[ ] Step 2: Touched-package suites. cd packages/agent-runtime && bun run test, cd packages/chat-runtime && bun run test, cd apps/clicky-gateway && bun run test → all PASS.

[ ] Step 3: Push. git pull --rebase --autostash origin main && git push origin main.

[ ] Step 4: Deploy to cosmic (backend) + verify BY COMMIT (not just health), then live-test: an email needing a reply → precise nudge; an FYI/already-answered mail → NO nudge (the reported bug); a non-email screen → no analyst run (digIn:false in /tmp/north-os-dev.log, no glance-analyst activity). Confirm the analyst's search_email/read_email_thread ran with org/user and no thread_id.

Self-Review (resolved)