Find out whether your idea is worth building — and if not, where to aim instead. A demand report tells you what people want. The validation loop goes further: it weighs that demand against what people can already get, and gives you a straight verdict — GO, PIVOT, or NO-GO — with every claim backed by a real quote. On a PIVOT it hands you the under-served angle to try next, and loops.
ideate  →  demand  +  landscape  →  assess (GO / PIVOT / NO-GO)
  ▲                                        │
  └──────────────  PIVOT  ─────────────────┘
The verdict is computed deterministically from the evidence (demand strength vs. how crowded the landscape is), so it’s defensible and reproducible — the model only writes the explanation, never the decision.
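The control flow in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not the metalworks API: `run_loop`, `Assessment`, `pivot_angle`, `max_rounds`, and the four stage callables are hypothetical names introduced here, and the round cap is an assumption added so the sketch always terminates.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Assessment:
    verdict: str                       # "GO", "PIVOT", or "NO-GO"
    pivot_angle: Optional[str] = None  # under-served angle, set on PIVOT


def run_loop(
    idea: str,
    ideate: Callable[[str], str],
    demand: Callable[[str], dict],
    landscape: Callable[[str], dict],
    assess: Callable[[dict, dict], Assessment],
    max_rounds: int = 3,  # assumed cap, not a documented setting
) -> Tuple[str, Assessment]:
    """Hypothetical driver: repeat the stages until the verdict is not PIVOT."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    for _ in range(max_rounds):
        hypothesis = ideate(idea)
        result = assess(demand(hypothesis), landscape(hypothesis))
        if result.verdict != "PIVOT" or result.pivot_angle is None:
            return hypothesis, result
        idea = result.pivot_angle  # loop back with the under-served angle
    return hypothesis, result
```

Passing the stages in as callables keeps the sketch independent of how each stage is actually implemented; only the loop shape is the point.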

Two ways in

  • Idea-first — you have an idea (“a jitter-free focus app for devs”). metalworks sharpens it into a testable hypothesis and runs the loop.
  • Evidence-first — you have a space, not an idea (“what should I build for night-shift nurses?”). metalworks surfaces the real pains as candidate ideas, each grounded in a complaint, and you pick one.

Run the whole loop

One call runs it end to end (headless, auto-deciding at each gate using the computed verdict):
Validate this idea end to end: "a jitter-free focus supplement for developers"
In Claude Code the loop is interactive: it pauses at each verdict and you make the GO / PIVOT / NO-GO call (you have context the corpus doesn’t). The Python and CLI forms run with --auto, taking the computed recommendation at each gate.

Or drive the stages yourself

The loop is built from four composable primitives: ideate, demand, landscape, and assess. Run them one at a time when you want control:

1. Ideate — frame the idea

I have an idea — help me sharpen it: a focus app for developers
Evidence-first instead? Surface the forks from a report and pick one:
Show me the real pains in this report and which to build for: <report_id>

2. Landscape — what already exists

The competitor map plus an empirical scan of real shipped products, and the cost of doing nothing:
What already exists for this report? <report_id>

3. Assess — the verdict

Is this worth building? Give me the verdict for <report_id>

How the verdict is decided

The decision is a gap function, not an opinion:
  • Demand strength: how many distinct people voiced the pain (50 voices outweigh one viral post).
  • Landscape saturation: how crowded the supply is (named competitors plus real shipped products), discounted when competitors badly miss something (an opening).
Demand      Landscape                                   Verdict
moderate+   open                                        GO
moderate+   crowded, but an under-served fork exists    PIVOT (aimed at that fork)
thin, or crowded with no opening                        NO-GO
One honest guardrail: if the landscape scan couldn’t fully ground (no product source configured), a hard GO is withheld — absence of evidence is not absence of competition.
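The rules above can be written as a small pure function. This is a sketch under stated assumptions rather than the actual implementation: `distinct_voices`, `decide`, `Verdict`, and the `provisional` flag are hypothetical names, the string labels simply mirror the table, and representing a withheld GO as a provisional one is a guess at how the guardrail surfaces.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


def distinct_voices(quotes: Iterable[Tuple[str, str]]) -> int:
    """Count unique authors across (author, quote_text) pairs.

    One author posting many times, or one viral post, still counts once.
    """
    return len({author for author, _ in quotes})


@dataclass(frozen=True)
class Verdict:
    decision: str              # "GO", "PIVOT", or "NO-GO"
    provisional: bool = False  # assumption: marks a GO that is not "hard"


def decide(
    demand: str,               # "thin" or "moderate+"
    landscape: str,            # "open" or "crowded"
    underserved_fork: bool,    # an under-served fork exists
    landscape_grounded: bool,  # the landscape scan could fully ground
) -> Verdict:
    """Map the evidence to a verdict, following the table row by row."""
    if demand not in ("thin", "moderate+"):
        raise ValueError(f"unknown demand level: {demand!r}")
    if landscape not in ("open", "crowded"):
        raise ValueError(f"unknown landscape level: {landscape!r}")
    if demand == "thin":
        return Verdict("NO-GO")
    if landscape == "open":
        # Guardrail: without a grounded scan, the GO is not a hard GO.
        return Verdict("GO", provisional=not landscape_grounded)
    if underserved_fork:
        return Verdict("PIVOT")
    return Verdict("NO-GO")
```

Because the function has no model call and no hidden state, the same evidence always yields the same verdict, which is what makes the decision reproducible.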

Not to be confused with

metalworks discovery is a different feature — it finds Reddit threads where you can helpfully reply. The validation loop is about deciding what to build.
