ADR-0031: `question_type` content-shape facet

Status: Accepted
Date: 2026-07-03
Deciders: Seth Shoultes, Omar ElHawary

Context

Playbook coverage analysis (innovations#197) needs a way to see what shape of
operator question the brain answers well vs. poorly — procedural how-tos,
should-I-X-or-Y decisions, or explanations of a mechanism. The ADR-0028 taxonomy
covers who content is for (industry, archetype, business_model, scale, platform),
not what shape it takes. Without a shape facet, coverage reporting can’t
distinguish “we have 40 pricing plays, all how-tos, zero decision framings” from
“we have 40 pricing plays across all shapes.”

Two constraints from Seth in #197:

- This is a content-shape facet for coverage, authoring, and generator targeting; it
  is NOT a retrieval-boost axis in V1. Embeddings already capture question shape via
  phrasing; adding a boost would mis-serve when operator phrasing doesn’t match the
  authored shape. Keep it out of the retrieval eval and the counsel gate.
- ADR-0028 is retrieval/applicability-specific and shouldn’t absorb an axis that
  deliberately isn’t part of retrieval. A standalone ADR keeps the boundary clean.

Decision

Add a `question_type` axis with three values: `how-to`, `decision`, and `concept`.
Mechanically it follows `platform` (PR #287); semantically it follows `scale`:
stored only, with no retrieval boost, no `all` sentinel, and no agnostic gate.

- Vocab: `QUESTION_TYPES = frozenset({"how-to", "decision", "concept"})` in
  `src/memberintel/taxonomy.py`.
- Registered in `APPLICABILITY_VOCAB` so the existing ingest validator applies
  (unknown values warn rather than fail). This is safe because the only consumer of
  `APPLICABILITY_VOCAB` is the ingest validator: `profile_boost.py` hardcodes its
  boost axes (`archetype`, `business_model`) and never reads that dict. Stored
  `question_type` is inert at retrieval time ($0/query).
- NOT added to `_ALL_ALLOWED`: there is no `all` sentinel, because a content shape
  isn’t universal.
- NO review-gate, VOICE, prompt, or eval-fixture changes; ADR-0028’s
  agnostic-gate logic is untouched.
- Added to `_METADATA_SYNC_KEYS` in `ingest_core.py` (after `platform`). This relies
  on the existing `_norm_meta` list normalization from #287, which treats a missing
  key and an empty list as in sync, so entries that omit `question_type` don’t churn
  on the first ingest after the key lands. Note: adding a non-empty `question_type`
  to existing playbooks is still an UPDATE, and the current ingest script re-embeds
  on UPDATE; a follow-up to skip re-embedding on metadata-only diffs is tracked
  separately.
- Multi-valued `list[str]`, for mechanical consistency with the other
  `_applicability` axes.
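
The registration and warn-not-fail behaviour described above could look roughly like the following. This is a minimal sketch, not the code in `taxonomy.py` or the real ingest validator: `validate_applicability`, `BOOST_AXES`, the placeholder `platform` vocabulary, and the assumption that `platform` accepts the `all` sentinel are all hypothetical and exist only to make the example self-contained.

```python
import logging

logger = logging.getLogger("ingest_sketch")

# Vocabulary as stated in this ADR.
QUESTION_TYPES = frozenset({"how-to", "decision", "concept"})

# Hypothetical stand-in for the real dict; only question_type is taken from the ADR.
APPLICABILITY_VOCAB = {
    "platform": frozenset({"example-platform"}),  # placeholder values
    "question_type": QUESTION_TYPES,
}

# Axes that accept the "all" sentinel. question_type is deliberately absent.
# (That platform is a member here is an assumption for illustration.)
_ALL_ALLOWED = frozenset({"platform"})

# The ADR says profile_boost.py hardcodes these; the constant name is hypothetical.
BOOST_AXES = ("archetype", "business_model")


def validate_applicability(meta):
    """Return a list of warnings for unknown values. Never raises (warn-not-fail)."""
    warnings = []
    for axis, allowed in APPLICABILITY_VOCAB.items():
        for value in meta.get(axis) or []:
            if value == "all" and axis in _ALL_ALLOWED:
                continue  # sentinel is valid only on axes that opt in
            if value not in allowed:
                warnings.append(f"{axis}: unknown value {value!r}")
    for message in warnings:
        logger.warning(message)
    return warnings


if __name__ == "__main__":
    print(validate_applicability({"question_type": ["how-to", "decision"]}))
    print(validate_applicability({"question_type": ["all"]}))
```

Under these assumptions, `all` on `question_type` is reported as an unknown value because the axis is not in `_ALL_ALLOWED`, and nothing in the boost path ever looks at the new key.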

Consequences

Positive:

- Unlocks the coverage report (innovations#197 part b): the report can pivot
  playbook counts by question shape and expose gaps.
- Zero retrieval cost and zero eval churn: the axis is stored and never boosted.
- Reuses the ingest validator; no new normalization code.
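
To illustrate the kind of pivot the coverage report could perform, here is a small sketch. It is not the report from innovations#197: the `topic` field, the `unclassified` bucket, and both function names are assumptions made for the example. Because `question_type` is multi-valued, a playbook tagged with two shapes counts once under each.

```python
from collections import Counter

QUESTION_TYPES = ("how-to", "decision", "concept")


def coverage_by_shape(playbooks):
    """Count playbooks per (topic, question_type) pair.

    `topic` is a hypothetical grouping field; untagged playbooks are
    counted under a hypothetical 'unclassified' bucket.
    """
    counts = Counter()
    for playbook in playbooks:
        shapes = playbook.get("question_type") or ["unclassified"]
        for shape in shapes:
            counts[(playbook["topic"], shape)] += 1
    return counts


def missing_shapes(counts, topic):
    """List the question shapes with no playbooks for the given topic."""
    return [shape for shape in QUESTION_TYPES if counts[(topic, shape)] == 0]


if __name__ == "__main__":
    sample = [
        {"topic": "pricing", "question_type": ["how-to"]},
        {"topic": "pricing", "question_type": ["how-to", "concept"]},
        {"topic": "pricing"},  # not yet classified
    ]
    counts = coverage_by_shape(sample)
    print(missing_shapes(counts, "pricing"))
```

With the sample data, the only shape with no `pricing` playbooks is `decision`, which is the kind of gap the Context section describes.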

Negative / costs:

- 68 existing playbooks need one-time classification (a content judgment, owned by
  the Brain Content Lead; not auto-tagged).
- One more optional frontmatter field for authors to learn.

Mitigations:

- A regression test in `test_ingest_core.py` covers the `_norm_meta` sync-key path:
  a missing key on stored metadata compares equal to `[]` on incoming. Zero re-embeds
  during backfill depends on the ingest script skipping re-embedding for metadata-only
  updates (tracked separately); currently the script re-embeds on UPDATE. A `--dry-run`
  confirms that the diffs are metadata-only.
- A README schema example and rubric are documented so authors can self-classify
  going forward.
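
The sync-key comparison that the regression test exercises can be sketched as follows. The names `_norm_meta` and `_METADATA_SYNC_KEYS` come from this ADR, but the bodies below are a hypothetical re-implementation written for illustration; the real code in `ingest_core.py` may normalize differently, and `metadata_in_sync` is an invented helper.

```python
# Hypothetical subset of the real tuple; the ADR only states that
# question_type is added after platform.
_METADATA_SYNC_KEYS = ("platform", "question_type")


def _norm_meta(meta, keys=_METADATA_SYNC_KEYS):
    """Normalize list-valued sync keys so a missing key equals an empty list."""
    normalized = {}
    for key in keys:
        value = meta.get(key)
        normalized[key] = list(value) if value else []
    return normalized


def metadata_in_sync(stored, incoming):
    """True when stored and incoming metadata agree on every sync key."""
    return _norm_meta(stored) == _norm_meta(incoming)


def test_missing_key_equals_empty_list():
    stored = {"platform": ["example-platform"]}  # predates question_type
    incoming = {"platform": ["example-platform"], "question_type": []}
    assert metadata_in_sync(stored, incoming)


def test_non_empty_question_type_is_an_update():
    stored = {"platform": ["example-platform"]}
    incoming = {"platform": ["example-platform"], "question_type": ["how-to"]}
    assert not metadata_in_sync(stored, incoming)


if __name__ == "__main__":
    test_missing_key_equals_empty_list()
    test_non_empty_question_type_is_an_update()
```

The second test mirrors the caveat above: classifying an existing playbook is a real metadata diff, so it registers as an UPDATE even though the first ingest after the key lands does not.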

Alternatives considered

- Fold into an ADR-0028 amendment: rejected. ADR-0028 is
  retrieval/applicability-specific; `question_type` deliberately isn’t part of
  retrieval. Mixing them would muddy that boundary.
- Make it a retrieval boost axis: rejected per #197. Question shape is already
  captured by embeddings via operator phrasing; a boost would mis-serve when the
  operator’s phrasing doesn’t match the authored shape, and it would drag the axis
  into the retrieval eval and counsel gate for no clear win.
- Separate `CONTENT_FACET_VOCAB` dict: rejected as premature. `scale` already
  sets the “stored in `APPLICABILITY_VOCAB`, not boosted” precedent; a second dict
  would be structure without a second consumer.
- Single-valued `str` field: rejected for mechanical consistency. Every other
  `_applicability` axis is `list[str]`; diverging here would complicate the ingest
  parser and metadata schema for no gain.

For: Seth Shoultes, AI Engineer, Blair Williams, Santiago Perez Asis, Product Lead