Validation loop - metalworks

Find out whether your idea is worth building — and if not, where to aim instead. A demand report tells you what people want. The validation loop goes further: it weighs that demand against what people can already get, and gives you a straight verdict — GO, PIVOT, or NO-GO — with every claim backed by a real quote. On a PIVOT it hands you the under-served angle to try next, and loops.

ideate  →  demand  +  landscape  →  assess (GO / PIVOT / NO-GO)
  ▲                                        │
  └──────────────  PIVOT  ─────────────────┘

The verdict is computed deterministically from the evidence (demand strength vs. how crowded the landscape is), so it’s defensible and reproducible — the model only writes the explanation, never the decision.

Two ways in

Idea-first — you have an idea (“a jitter-free focus app for devs”). metalworks sharpens it into a testable hypothesis and runs the loop.
Evidence-first — you have a space, not an idea (“what should I build for night-shift nurses?”). metalworks surfaces the real pains as candidate ideas, each grounded in a complaint, and you pick one.

Run the whole loop

One call runs it end to end (headless, auto-deciding at each gate using the computed verdict):

Validate this idea end to end: "a jitter-free focus supplement for developers"

In Claude Code the loop is interactive: it pauses at each verdict and you make the GO / PIVOT / NO-GO call, since you have context the corpus doesn’t. The Python and CLI forms run with --auto, taking the computed recommendation at each gate.
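The difference between the interactive and --auto forms can be pictured as the same loop with a different gate function. The sketch below is illustrative only and is not the metalworks API: `Assessment`, `run_loop`, `auto_gate`, `fake_assess`, and the `max_rounds` cap are all hypothetical names and assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Assessment:
    verdict: str                       # "GO", "PIVOT", or "NO-GO"
    pivot_angle: Optional[str] = None  # under-served angle, expected on PIVOT


def auto_gate(result: Assessment) -> str:
    """Headless gate: accept the computed recommendation as-is."""
    return result.verdict


def run_loop(
    idea: str,
    assess: Callable[[str], Assessment],
    gate: Callable[[Assessment], str],
    max_rounds: int = 3,  # assumption: the source does not state a round limit
) -> List[Tuple[str, str]]:
    """Assess the idea, apply the gate, and re-aim at the pivot angle on PIVOT."""
    history: List[Tuple[str, str]] = []
    for _ in range(max_rounds):
        result = assess(idea)
        choice = gate(result)
        history.append((idea, choice))
        # Stop on GO or NO-GO, or when a PIVOT carries no angle to try next.
        if choice != "PIVOT" or result.pivot_angle is None:
            break
        idea = result.pivot_angle
    return history


def fake_assess(idea: str) -> Assessment:
    """Stand-in for the real ideate/demand/landscape/assess stages."""
    if idea == "a focus app for developers":
        return Assessment("PIVOT", "a focus app for night-shift nurses")
    return Assessment("GO")


history = run_loop("a focus app for developers", fake_assess, auto_gate)
```

An interactive gate would keep the same signature but ask a person for the call instead of returning `result.verdict`, which is the only place the two modes differ in this sketch.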

Or drive the stages yourself

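The loop is four composable primitives: ideate, demand, landscape, and assess. Run them one at a time when you want control. The demand stage produces the report that the prompts below refer to as <report_id>: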

1. Ideate — frame the idea

I have an idea — help me sharpen it: a focus app for developers

Evidence-first instead? Surface the forks from a report and pick one:

Show me the real pains in this report and which to build for: <report_id>

2. Landscape — what already exists

This stage returns the competitor map, an empirical scan of real shipped products, and the cost of doing nothing:

What already exists for this report? <report_id>

3. Assess — the verdict

Is this worth building? Give me the verdict for <report_id>

How the verdict is decided

The decision is a gap function, not an opinion:

Demand strength: how many distinct people voiced the pain (50 voices outweigh one viral post).
Landscape saturation: how crowded the supply is (named competitors plus real shipped products). Saturation is lowered when competitors badly miss something, because that gap is an opening.
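The two inputs can be sketched as simple counts. This is a rough illustration under stated assumptions, not the scoring metalworks actually uses: the `Quote` type, the function names, de-duplicating competitors and products by name, and subtracting one point per competitor that badly misses something are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Quote:
    author: str
    text: str
    upvotes: int = 0  # deliberately ignored below: reach is not distinct demand


def demand_strength(quotes: Iterable[Quote]) -> int:
    """Count distinct people who voiced the pain, not posts or upvotes."""
    return len({q.author for q in quotes})


def landscape_saturation(
    named_competitors: Iterable[str],
    shipped_products: Iterable[str],
    badly_missing: Iterable[str],
) -> int:
    """Crowding from competitors plus shipped products, reduced by openings.

    Assumption: each named competitor that badly misses something removes
    one point of crowding. The real weighting is not given in the source.
    """
    competitors = set(named_competitors)
    crowd = len(competitors | set(shipped_products))
    openings = len(competitors & set(badly_missing))
    return max(crowd - openings, 0)


many_voices = [Quote(f"user{i}", "coffee gives me jitters") for i in range(50)]
viral_post = [Quote("one_poster", "coffee gives me jitters", upvotes=10_000)]
```

Counting a set of authors is what makes 50 separate voices score higher than a single post with a large upvote count.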

Demand	Landscape	Verdict
moderate+	open	GO
moderate+	crowded, but an under-served fork exists	PIVOT (aimed at that fork)
moderate+	crowded, with no opening	NO-GO
thin	any	NO-GO

One honest guardrail: if the landscape scan could not be fully grounded (no product source configured), a hard GO is withheld, because absence of evidence is not absence of competition.
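The decision table and the guardrail can be read as a small pure function. The version below is a minimal sketch, assuming string labels for the inputs and a `firm` flag to represent a withheld hard GO. The source does not say how a withheld GO is reported, so that flag, the `"strong"` label for the upper end of "moderate+", and the names `Verdict` and `assess` are assumptions made for illustration.

```python
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    GO = "GO"
    PIVOT = "PIVOT"
    NO_GO = "NO-GO"


def assess(
    demand: str,                  # "thin", "moderate", or "strong" (assumed labels)
    landscape: str,               # "open" or "crowded"
    has_underserved_fork: bool,
    landscape_grounded: bool = True,
) -> Tuple[Verdict, bool]:
    """Return (verdict, firm). Same inputs always give the same output."""
    if demand not in {"thin", "moderate", "strong"}:
        raise ValueError(f"unknown demand level: {demand!r}")
    if landscape not in {"open", "crowded"}:
        raise ValueError(f"unknown landscape state: {landscape!r}")

    if demand == "thin":
        return Verdict.NO_GO, True
    if landscape == "open":
        # Guardrail: without a grounded landscape scan the GO is not firm.
        return Verdict.GO, landscape_grounded
    if has_underserved_fork:
        return Verdict.PIVOT, True
    return Verdict.NO_GO, True
```

Because the function has no model call and no randomness, rerunning it on the same evidence reproduces the same verdict, which is the property the page relies on when it says the model only writes the explanation.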

Not to be confused with

metalworks discovery is a different feature — it finds Reddit threads where you can helpfully reply. The validation loop is about deciding what to build.

Next

Demand research — the pull signal the loop builds on.
Competitors — the lean competitor map on its own.
Positioning — once you have a GO, find the angle.
