Live demo

Lead enrichment, built two ways

Paste a LinkedIn-style profile and a company description. The app scores fit against a fixed B2B SaaS ICP and returns source-quoted claims, an outreach hook, and a routing decision: auto-add, propose, or discard.

Two builds run the same workflow against the same eval set. The integrated build uses a strict tool schema, extended thinking, and a per-claim grounding rule. The chat build uses a task-describing system prompt and nothing else. The scorecard measures how much of the quality gap comes from that difference in architecture.
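To make the "strict tool schema" concrete, here is a minimal sketch of what an enrich_lead contract could look like, with a small stand-in checker. Only the tool name and the three routing values come from this page; every field name, the 0-100 score range, and the `contract_errors` helper are assumptions for illustration, not the demo's actual schema.

```python
# Hypothetical shape for the enrich_lead tool. Only the tool name and the
# three routing values are taken from the page; all field names and the
# 0-100 score range are assumptions.
ENRICH_LEAD_TOOL = {
    "name": "enrich_lead",
    "description": "Score a lead against the ICP and return grounded claims.",
    "input_schema": {
        "type": "object",
        "additionalProperties": False,
        "required": ["fit_score", "claims", "outreach_hook", "routing"],
        "properties": {
            "fit_score": {"type": "integer", "minimum": 0, "maximum": 100},
            "claims": {
                "type": "array",
                "items": {
                    "type": "object",
                    "additionalProperties": False,
                    "required": ["statement", "source_quote"],
                    "properties": {
                        "statement": {"type": "string"},
                        "source_quote": {"type": "string"},
                    },
                },
            },
            "outreach_hook": {"type": "string"},
            "routing": {"enum": ["auto_add", "propose", "discard"]},
        },
    },
}


def contract_errors(payload: dict) -> list[str]:
    """Return contract violations for a tool-call payload (empty list = valid).

    A deliberately small stand-in for a full JSON Schema validator.
    """
    schema = ENRICH_LEAD_TOOL["input_schema"]
    errors = []
    # Every required top-level field must be present.
    for key in schema["required"]:
        if key not in payload:
            errors.append(f"missing field: {key}")
    # No fields beyond the declared ones are allowed.
    for key in payload:
        if key not in schema["properties"]:
            errors.append(f"unexpected field: {key}")
    # Routing must be one of the three decisions named on the page.
    allowed = schema["properties"]["routing"]["enum"]
    if "routing" in payload and payload["routing"] not in allowed:
        errors.append(f"invalid routing: {payload['routing']}")
    # Each claim must carry a non-empty source quote.
    for i, claim in enumerate(payload.get("claims", [])):
        if not claim.get("source_quote"):
            errors.append(f"claim {i} has no source_quote")
    return errors
```

A payload that omits the outreach hook or invents a fourth routing value fails before it reaches the lead queue. Free-form prose from the chat build has no equivalent check.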

Try the integrated build →
Try the chat build →

What’s different

| Dimension | Integrated | Chat |
| --- | --- | --- |
| Output contract | Strict enrich_lead tool schema | Free-form prose |
| Grounding rule | Per-claim verbatim source quote | No structural requirement |
| Inference shape | Single structured call | Multi-turn chat + Haiku extractor |
| Extended thinking | 4000-token budget | Off |
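The per-claim grounding rule in the comparison above can be enforced mechanically. The sketch below is one possible check, assuming each claim is a dict with a `source_quote` field and that the sources are the pasted profile and company description. The function and field names are hypothetical, and this is not the grounding judges' logic.

```python
import re


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wraps do not break a verbatim match."""
    return re.sub(r"\s+", " ", text).strip()


def ungrounded_claims(claims: list[dict], sources: list[str]) -> list[dict]:
    """Return the claims whose source_quote is empty or absent from every source."""
    haystacks = [_normalize(source) for source in sources]
    failures = []
    for claim in claims:
        quote = _normalize(claim.get("source_quote", ""))
        # A claim is grounded only if its quote appears verbatim in a source.
        if not quote or not any(quote in haystack for haystack in haystacks):
            failures.append(claim)
    return failures
```

Matching stays case-sensitive and normalizes only whitespace, which keeps the check close to "verbatim". A looser match, such as case folding or fuzzy similarity, would be a different rule.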

Explore

01
Integrated build
Lead queue with structured outputs, source-quoted claims, extended thinking deltas, and an under-the-hood telemetry drawer.
Open the queue →
02
Chat build
Productized chat against the same model, without tools or schema. Conversations stay in the browser, and diagnostics live behind a toggle.
Open the chat →
03
Eval scorecard
73-item test set, both modes scored against the same gold labels, two grounding judges, and a dated pre/post-fix snapshot.
See the numbers →
04
Methodology
What's measured, how the labels were built, the criteria revision log, judge models, and the known limits to read before the numbers.
Read the methodology →
