ADR-0031: `question_type` content-shape facet

Status: Accepted
Date: 2026-07-03
Deciders: Seth Shoultes, Omar ElHawary

Context

Playbook coverage analysis (innovations#197) needs a way to see what shape of
operator question the brain answers well vs. poorly: procedural how-tos,
should-I-X-or-Y decisions, or explanations of a mechanism. The ADR-0028 taxonomy
covers who content is for (industry, archetype, business_model, scale, platform),
not what shape it takes. Without a shape facet, coverage reporting can’t
distinguish “we have 40 pricing plays, all how-tos, zero decision framings” from
“we have 40 pricing plays across all shapes.”

Two constraints from Seth in #197:

  1. This is a content-shape facet for coverage/authoring/generator-targeting — it
    is NOT a retrieval-boost axis in V1. Embeddings already capture question shape via
    phrasing; adding a boost would mis-serve when operator phrasing doesn’t match the
    authored shape. Keep it out of the retrieval eval and the counsel gate.
  2. ADR-0028 is retrieval/applicability-specific and shouldn’t absorb an axis that
    deliberately isn’t part of retrieval. A standalone ADR keeps the boundary clean.

Decision

Add a question_type axis with three values (how-to, decision, concept),
modelled mechanically on platform (PR #287) but semantically on scale
(stored-only, no retrieval boost, no all sentinel, no agnostic gate).

  • Vocab: QUESTION_TYPES = frozenset({"how-to", "decision", "concept"}) in
    src/memberintel/taxonomy.py.
  • Registered in APPLICABILITY_VOCAB so the existing ingest validator applies
    (unknown-value warn-not-fail). Safe because the only consumer of
    APPLICABILITY_VOCAB is the ingest validator — profile_boost.py hardcodes its
    boost axes (archetype, business_model) and never reads that dict. Stored
    question_type is inert at retrieval time ($0/query).
  • NOT added to _ALL_ALLOWED — no all sentinel; a content shape isn’t universal.
  • NO review-gate / VOICE / prompt / eval-fixture changes; ADR-0028’s
    agnostic-gate logic is untouched.
  • Added to _METADATA_SYNC_KEYS in ingest_core.py (after platform). Relies on
    the existing _norm_meta list-normalization from #287 so missing-key ↔ empty-list
    is treated as in-sync (so entries that omit question_type don’t churn on the
    first ingest after the key lands). Note: adding a non-empty question_type to
    existing playbooks is still an UPDATE, and the current ingest script re-embeds
    on UPDATE — a follow-up to skip re-embed on metadata-only diffs is tracked
    separately.
  • Multi-valued list[str] for mechanical consistency with the other _applicability
    axes.
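
The registration and warn-not-fail behaviour described above can be sketched as follows. This is a minimal illustration, not the contents of taxonomy.py: only QUESTION_TYPES and the name APPLICABILITY_VOCAB come from this ADR, while BOOST_AXES and validate_applicability are hypothetical stand-ins for the hardcoded axes in profile_boost.py and for the existing ingest validator.

```python
import warnings

# Vocabulary as stated in this ADR.
QUESTION_TYPES = frozenset({"how-to", "decision", "concept"})

# Hypothetical stand-in for the real registry; other axes omitted.
APPLICABILITY_VOCAB = {"question_type": QUESTION_TYPES}

# Hypothetical stand-in for the boost axes hardcoded in profile_boost.py.
# question_type is deliberately absent, so it never affects retrieval.
BOOST_AXES = ("archetype", "business_model")


def validate_applicability(meta: dict) -> list[str]:
    """Warn on unknown values instead of raising; return what was flagged."""
    unknown = []
    for axis, allowed in APPLICABILITY_VOCAB.items():
        for value in meta.get(axis, []):
            if value not in allowed:
                warnings.warn(f"unknown {axis} value: {value!r}")
                unknown.append(f"{axis}={value}")
    return unknown


# A playbook that omits question_type passes silently; "all" is flagged
# because the axis has no all sentinel.
flagged = validate_applicability({"question_type": ["how-to", "all"]})
```

In this sketch an unknown value produces a warning and ingest continues, which matches the unknown-value warn-not-fail behaviour the axis inherits from the validator.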

Consequences

Positive:

  • Unlocks the coverage report (innovations#197 part b) — the report can pivot
    playbook counts by question shape and expose gaps.
  • Zero retrieval cost / zero eval churn — the axis is stored and never boosted.
  • Reuses the ingest validator; no new normalization code.
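
As a rough sketch of the pivot the coverage report could perform, the snippet below counts playbooks per question shape and lists shapes with no coverage. The function names, the "unclassified" bucket, and the sample playbooks are all made up for illustration; the actual report in innovations#197 may be structured differently.

```python
from collections import Counter

QUESTION_TYPES = frozenset({"how-to", "decision", "concept"})


def pivot_by_question_type(playbooks: list[dict]) -> Counter:
    """Count playbooks per shape; a multi-valued entry counts once per shape."""
    counts = Counter()
    for playbook in playbooks:
        # Missing key and empty list both land in a hypothetical bucket.
        shapes = playbook.get("question_type") or ["unclassified"]
        counts.update(shapes)
    return counts


def missing_shapes(counts: Counter) -> list[str]:
    """Shapes from the vocabulary that no playbook covers."""
    return sorted(shape for shape in QUESTION_TYPES if counts[shape] == 0)


# Made-up sample data, not real playbooks.
playbooks = [
    {"slug": "example-a", "question_type": ["how-to"]},
    {"slug": "example-b", "question_type": ["how-to", "concept"]},
    {"slug": "example-c"},
]
counts = pivot_by_question_type(playbooks)
gaps = missing_shapes(counts)
```

With this sample, the gap list contains only "decision", which is the kind of "all how-tos, zero decision framings" finding the Context section describes.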

Negative / costs:

  • 68 existing playbooks need one-time classification (content judgment, owned by
    Brain Content Lead — not auto-tagged).
  • One more optional frontmatter field for authors to learn.

Mitigations:

  • Regression test in test_ingest_core.py proves the _norm_meta sync-key path —
    missing-key on stored metadata compares equal to [] on incoming. Zero re-embeds
    during backfill depends on the ingest script skipping re-embedding for metadata-only
    updates (tracked separately); currently the script re-embeds on UPDATE. A --dry-run
    confirms the diffs are metadata-only.
  • README schema example + rubric documented so authors can self-classify going
    forward.
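
The sync-key behaviour that the regression test guards can be illustrated with a small sketch. The names norm_meta_sketch, metadata_in_sync, and SYNC_KEYS_SKETCH are hypothetical; the real _norm_meta and _METADATA_SYNC_KEYS in ingest_core.py may differ in detail, and the platform value is made-up sample data.

```python
# Hypothetical subset of the sync keys; the real tuple has more entries.
SYNC_KEYS_SKETCH = ("platform", "question_type")


def norm_meta_sketch(value) -> list:
    """Normalize so that a missing key (None) compares equal to []."""
    if value is None:
        return []
    return list(value)


def metadata_in_sync(stored: dict, incoming: dict) -> bool:
    """True when every sync key matches after normalization."""
    return all(
        norm_meta_sketch(stored.get(key)) == norm_meta_sketch(incoming.get(key))
        for key in SYNC_KEYS_SKETCH
    )


stored = {"platform": ["example-platform"]}  # written before the key existed

# Entry still omits a shape: treated as in sync, so no churn.
unchanged = metadata_in_sync(stored, {"platform": ["example-platform"], "question_type": []})

# Backfilled entry: a genuine metadata diff, so it registers as an UPDATE.
backfilled = metadata_in_sync(
    stored, {"platform": ["example-platform"], "question_type": ["how-to"]}
)
```

The first comparison is the no-churn case; the second is the backfill case, which still triggers a re-embed until the metadata-only follow-up lands.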

Alternatives considered

  • Fold into an ADR-0028 amendment — rejected. ADR-0028 is
    retrieval/applicability-specific; question_type deliberately isn’t part of
    retrieval. Mixing them would muddy the boundary that a standalone ADR keeps clean.
  • Make it a retrieval boost axis — rejected per #197. Question shape is already
    captured by embeddings via operator phrasing; a boost would mis-serve when the
    operator’s phrasing doesn’t match the authored shape, and it would drag the axis
    into the retrieval eval + counsel gate for no clear win.
  • Separate CONTENT_FACET_VOCAB dict — rejected as premature. scale already
    sets the “stored in APPLICABILITY_VOCAB, not boosted” precedent; a second dict
    would be structure without a second consumer.
  • Single-valued str field — rejected for mechanical consistency. Every other
    _applicability axis is list[str]; diverging here would complicate the ingest
    parser and metadata schema for no gain.

For: Seth Shoultes, AI Engineer, Blair Williams, Santiago Perez Asis, Product Lead