Tool Chains on Every Claim, Traces on Every Cell

Webhound Team

The biggest change: click any claim — a report pill or a dataset cell — and you can now see the exact tool calls that produced it. The searches the agent ran, the pages it visited, the code it executed, and what came back. Not a summary. The real tool_use and tool_result pairs, rendered the same way the Activity tab renders them.

This is the deepest audit trail Webhound has had. Every fact is now one click away from the raw work that produced it.

Dataset cells also now carry the same claim trace structure reports already had — evidence, method, confidence. More on that below.

Tool chains on every claim

Click "View tool chain →" in the top-right of any trace popover. A modal opens with each cited tool call rendered in full. Searches show the query and the results. Page visits show the URL and the page content. Code executions show the source and the output.

Every tool call in a session gets a short, stable alias (tc_754, tc_755) that is permanent and unique within that session. Multiple enrichment agents running in parallel share a single namespace via atomic allocation, so tc_754 is always tc_754, no matter who called it.
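
To make the shared namespace concrete, here is a minimal sketch of atomic alias allocation. The AliasAllocator class and its lock-based counter are assumptions for illustration, not Webhound's actual implementation; only the tc_N alias format comes from the text above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AliasAllocator:
    """Hands out tc_N aliases from one per-session counter (hypothetical helper)."""

    def __init__(self, start: int = 1) -> None:
        self._next = start
        self._lock = threading.Lock()

    def allocate(self) -> str:
        # The lock makes read-and-increment a single atomic step,
        # so two parallel agents can never receive the same number.
        with self._lock:
            n = self._next
            self._next += 1
        return f"tc_{n}"


allocator = AliasAllocator(start=754)

# Simulate 800 tool calls recorded concurrently by eight worker threads.
with ThreadPoolExecutor(max_workers=8) as pool:
    aliases = list(pool.map(lambda _: allocator.allocate(), range(800)))
```

Because every worker draws from the same counter, an alias identifies exactly one tool call regardless of which agent made it.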

The UI works the same everywhere: a research report's inline pill, a dataset cell's source icon, a published report viewed by an anonymous reader — all of them open the same tool chain modal on the same claims.

Traces on every dataset cell

Research claims have had structured traces for a while. Datasets were the blind spot — a first_name cell saying "Alex" had no way to show you where that came from. Fixed now. Every non-empty cell carries:

  • Evidence. The verbatim passage from the source that proves the value. Not a summary — the literal text.
  • Method. One short line: how the value was found. "Searched 'alex@maccaw.org' → visited maccaw.com → personal site shows 'Alex MacCaw'."
  • Confidence. High, medium, or low. High means directly stated in the primary source. Low means weak support — a cell colored by its own honesty.
  • Sources. Web URLs the value came from.
  • Who found it. The executor (during extraction) or an enrichment agent (during fill).
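
One way to picture the per-cell trace is as a small record. The sketch below is an assumption about shape only: the field names, the CellTrace class, the https scheme on the source URL, the tc_ aliases, and the "high" rating are illustrative, while the example values are taken from the Method bullet above.

```python
from dataclasses import dataclass, field
from typing import List, Literal

Confidence = Literal["high", "medium", "low"]


@dataclass
class CellTrace:
    """Hypothetical shape of the trace attached to one dataset cell."""

    evidence: str                 # verbatim passage from the source
    method: str                   # one short line on how the value was found
    confidence: Confidence
    sources: List[str] = field(default_factory=list)      # web URLs or sentinels
    found_by: str = "executor"    # executor or an enrichment agent (assumed ids)
    tool_calls: List[str] = field(default_factory=list)   # cited tc_N aliases


# Trace for a first_name cell holding "Alex".
trace = CellTrace(
    evidence="Alex MacCaw",
    method="Searched 'alex@maccaw.org' → visited maccaw.com → "
           "personal site shows 'Alex MacCaw'.",
    confidence="high",
    sources=["https://maccaw.com"],
    found_by="executor",
    tool_calls=["tc_754", "tc_755"],
)
```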

A small icon appears next to any cell that has a trace. Click it and a compact popover shows everything in one place — plus the "View tool chain →" link described above. The old affordance — a colored underline with a hover tooltip that overflowed the column — is gone.

Sentinels for non-web origins

When a value didn't come from a web page, the agent used to invent a fake URL like https://internal.webhound.ai/user_query. That was dishonest, and the UI couldn't link to it anyway.

The new source protocol uses sentinels instead:

  • user:query — value came from the user's original prompt.
  • user:file:<name> — from an uploaded file.
  • user:session:<id> — from an attached prior session.
  • agent:code — computed by run_code.
  • agent:inference — reasoned from other cells on the same row.

The UI renders these as labeled badges — "User provided", "Computed" — not broken links.
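
A minimal sketch of how a renderer might map sentinels to badges follows. The function name and the lookup tables are hypothetical. The labels "User provided" and "Computed" come from the text, but which sentinel gets which label is assumed, and the "Inferred" label for agent:inference is invented for illustration.

```python
from typing import Optional

# Exact-match sentinels and their badge labels (mapping is an assumption).
SENTINEL_BADGES = {
    "user:query": "User provided",
    "agent:code": "Computed",
    "agent:inference": "Inferred",  # assumed label, not named in the post
}

# Sentinels that carry a trailing name or id.
PREFIX_BADGES = {
    "user:file:": "User provided",
    "user:session:": "User provided",
}


def badge_for(source: str) -> Optional[str]:
    """Return a badge label for a sentinel, or None for an ordinary web URL."""
    if source in SENTINEL_BADGES:
        return SENTINEL_BADGES[source]
    for prefix, label in PREFIX_BADGES.items():
        # Require a non-empty name or id after the prefix.
        if source.startswith(prefix) and len(source) > len(prefix):
            return label
    return None
```

Anything that returns None is treated as a normal URL and rendered as a link.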

Every step in the chain, not just the last one

If run_code computed a CAGR from revenue numbers pulled out of two different annual reports, the tool chain shows all three calls: the two visit_pdf reads and the code execution.

A calculation is only auditable if you can see the raw inputs that went in, not just the arithmetic that came out. The claim's source_urls mirror the chain too: both original pages show up alongside agent:code.
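
As a worked sketch of that example: the standard CAGR formula is (end / start) ** (1 / years) - 1. The revenue figures, URLs, aliases, and the ToolCall record below are all made up for illustration; only the tool names visit_pdf and run_code and the agent:code sentinel come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToolCall:
    """Hypothetical record of one step in a claim's tool chain."""

    alias: str
    tool: str
    source: str  # a URL for page reads, "agent:code" for run_code


def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


# Two reads supply the inputs; the third call does the arithmetic.
chain: List[ToolCall] = [
    ToolCall("tc_101", "visit_pdf", "https://example.com/annual-report-2021.pdf"),
    ToolCall("tc_102", "visit_pdf", "https://example.com/annual-report-2023.pdf"),
    ToolCall("tc_103", "run_code", "agent:code"),
]

# The claim's source_urls mirror the chain, in order.
source_urls = [call.source for call in chain]

# Illustrative revenue: 100.0 in 2021 growing to 121.0 in 2023.
growth = cagr(start=100.0, end=121.0, years=2)
```

With these inputs the growth rate comes out at roughly 10% per year, and a reader can check both input figures against the two cited PDFs.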

No fabricated citations

An early failure mode: the agent would cite a plausible-looking tc_N it hadn't actually run. The tool chain modal would then open and show the wrong tools.

Every citation now passes a scope check at persist time. The executor can only cite session-message tool calls. An enrichment agent can only cite tool calls it personally made (not the planner, not the extractor, not another enrichment worker running in parallel). Out-of-scope citations get silently dropped — the cell shows no tool chain rather than a wrong one.
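
The scope check can be sketched as a simple ownership filter. This assumes each tool call alias is recorded with the id of whoever made it; the owners mapping, the agent ids, and the function name are hypothetical.

```python
from typing import Dict, List


def filter_citations(cited: List[str], author: str, owners: Dict[str, str]) -> List[str]:
    """Keep only the cited aliases that the citing author actually ran.

    Unknown aliases (never run) and aliases owned by someone else are
    dropped without raising, mirroring the silent-drop behavior.
    """
    return [alias for alias in cited if owners.get(alias) == author]


# Assumed ownership log for one session.
owners = {
    "tc_754": "executor",
    "tc_755": "enrich_1",
    "tc_756": "enrich_2",
}

# enrich_1 cites its own call, a sibling's call, and a call that never ran.
kept = filter_citations(["tc_755", "tc_756", "tc_999"], "enrich_1", owners)
```

Here only tc_755 survives. If nothing survives, the cell simply has no tool chain to show.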

The verifier sees the tool-chain refs too, and is instructed to cross-check the quoted evidence against the actual tool output. If a claim says the page contained X and the cited tool result doesn't mention X, that's a flaggable hallucination. Verification gets a concrete, mechanical check to run.
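
A rough sketch of what such a mechanical check could look like is below. It assumes the check is a case-insensitive, whitespace-normalized substring match, which is a simplification chosen for illustration; the post does not say how the verifier actually compares evidence to tool output.

```python
from typing import List


def _normalize(text: str) -> str:
    # Collapse runs of whitespace and ignore case before comparing.
    return " ".join(text.split()).lower()


def evidence_supported(evidence: str, tool_outputs: List[str]) -> bool:
    """True if the quoted evidence appears in at least one cited tool result."""
    needle = _normalize(evidence)
    if not needle:
        return False  # empty evidence proves nothing
    return any(needle in _normalize(output) for output in tool_outputs)


# Illustrative tool result for a visited page.
outputs = ["<h1>Alex   MacCaw</h1> Founder and engineer."]

supported = evidence_supported("Alex MacCaw", outputs)
unsupported = evidence_supported("Alex Smith", outputs)
```

A claim whose evidence is not found in any cited result is a candidate for flagging.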

Published reports keep the chain

When you publish a report, the full tool call content — tool_use inputs, tool_result outputs, timestamps — is baked into the snapshot. Public viewers open the same tool chain modal on the same claims without needing any access to the underlying session. Forked publications inherit it.
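
One way to think about baking the chain into a snapshot: copy each cited tool call into the published artifact so nothing points back at the live session. The sketch below is an assumption about the approach, with hypothetical field names and data. In particular, copying only the cited calls is a choice made here for illustration.

```python
import copy
from typing import Any, Dict, List


def build_snapshot(claims: List[Dict[str, Any]],
                   tool_calls: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return a self-contained copy of the claims and the tool calls they cite."""
    cited = {alias for claim in claims for alias in claim["tool_calls"]}
    return {
        "claims": copy.deepcopy(claims),
        "tool_calls": {
            alias: copy.deepcopy(tool_calls[alias])
            for alias in sorted(cited)
            if alias in tool_calls
        },
    }


# Illustrative session state.
session_calls = {
    "tc_1": {"tool_use": {"query": "acme revenue"}, "tool_result": "...",
             "timestamp": "2025-01-01T00:00:00Z"},
    "tc_2": {"tool_use": {"url": "https://example.com"}, "tool_result": "...",
             "timestamp": "2025-01-01T00:00:05Z"},
}
claims = [{"text": "Example claim", "tool_calls": ["tc_2"]}]

snapshot = build_snapshot(claims, session_calls)

# Later changes to the session do not affect the published copy.
session_calls["tc_2"]["tool_result"] = "changed"
```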

Other dataset changes

  • Add Column is back in the toolbar. For extraction sessions it now sits next to Share / Export / Prompt where it belongs, instead of being buried.
  • Enrichment got the full web toolset. Enrichment agents can now use the same tools research reports use — the 15+ platform scrapers (Google Maps, Amazon, Zillow, Indeed, Crunchbase, PitchBook, and more), Reddit page visits, image search, cross-session lookup, and grep-over-cache on previously visited pages. Before, enrichment was limited to plain search and visit. That was always arbitrary; it's fixed.
  • Full Refresh is gone. A legacy button that triggered a separate V1 extraction pipeline. The modern flow — extract more, enrich cells, fill columns — covers the same ground better.

Questions? Email team@webhound.ai.