ADR-0006: Hive Mind as Brain Seed Source

Status: Accepted
Date: 2026-05-12
Deciders: Seth (Lead Architect)

Context

Slice 1 needs a global brain — a vector store of MemberPress product knowledge that the chat endpoint can cite. Building this from scratch would require curating, scraping, and embedding 45K+ docs, 70K+ code chunks, and 13K+ graph entities across 11 Caseproof products.

The Hive Mind (~/Local Sites/caseproof-agent/deploy/hive-mind-mcp/) already indexes all of this content and exposes it through an MCP endpoint (hive-mind.caseproofagent.com/mcp) that offers 48 MCP tools for querying Caseproof product knowledge.

The decision is whether to:

  1. Build MemberIntel’s brain from scratch (curate + scrape + embed)
  2. Seed from Hive Mind’s existing content (copy + re-embed with our own model)
  3. Query Hive Mind at chat-time (no local brain, real-time API calls)

Decision

Seed from Hive Mind with re-embedding. MemberIntel will:

  1. Call Hive Mind’s hive_format_context MCP tool to extract content
  2. Re-embed all extracted content with Voyage voyage-3-lite (512d) into MemberIntel’s own pgvector brain_entries table
  3. Run a nightly sync job to catch Hive Mind updates
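
The three steps above can be sketched as a small batch loop. This is only an illustration under stated assumptions, not the actual scripts/seed_brain.py: SourceItem, seed_brain, the embed callable, and the in-memory store are hypothetical stand-ins for the Hive Mind extraction result, the Voyage call, and the brain_entries table.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Sequence

EMBED_DIM = 512  # voyage-3-lite output size, per this ADR


@dataclass(frozen=True)
class SourceItem:
    source_id: str  # hypothetical stable identifier for a Hive Mind item
    text: str       # content returned by the extraction step


def seed_brain(
    items: Iterable[SourceItem],
    embed: Callable[[Sequence[str]], list[list[float]]],
    store: dict[str, tuple[str, list[float]]],
    batch_size: int = 64,
) -> int:
    """Embed items in batches and upsert them into store; returns rows written."""
    written = 0
    batch: list[SourceItem] = []

    def flush() -> None:
        nonlocal written
        if not batch:
            return
        vectors = embed([item.text for item in batch])
        for item, vector in zip(batch, vectors):
            if len(vector) != EMBED_DIM:
                raise ValueError(f"expected {EMBED_DIM} dims, got {len(vector)}")
            # Upsert keyed by source_id, so re-running overwrites instead of duplicating.
            store[item.source_id] = (item.text, vector)
            written += 1
        batch.clear()

    for item in items:
        batch.append(item)
        if len(batch) >= batch_size:
            flush()
    flush()  # write the final partial batch
    return written
```

In a real run, embed would wrap the Voyage API and store would be replaced by an upsert into brain_entries. Keying writes on a stable identifier is one way to get the safe-to-re-run behavior this ADR asks of the seed script.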

This gives us:

  • Immediate content coverage (45K+ docs, 70K+ code chunks, 13K+ graph entities)
  • Ownership of our own embedding model (not locked to Hive Mind’s choice)
  • Low-latency search (pgvector cosine similarity, no network hop at query time)
  • Independence from Hive Mind’s uptime
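
To make the low-latency search point concrete, here is a minimal in-memory sketch of cosine-similarity ranking, the same measure pgvector would compute inside the database. The function names and the column names in the SQL string (id, content, embedding) are assumptions for illustration; only brain_entries comes from this ADR. In pgvector, `<=>` is the cosine distance operator, so similarity is one minus that distance.

```python
import math

# Hypothetical query shape; column names are assumed, not taken from the real schema.
EXAMPLE_SQL = """
SELECT id, content, 1 - (embedding <=> %(query)s) AS similarity
FROM brain_entries
ORDER BY embedding <=> %(query)s
LIMIT %(k)s
"""


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, entries, k=3):
    """Return up to k (entry_id, similarity) pairs, most similar first."""
    scored = [(cosine_similarity(query, vec), entry_id) for entry_id, vec in entries.items()]
    scored.sort(reverse=True)
    return [(entry_id, score) for score, entry_id in scored[:k]]
```

The pure-Python loop is only for reading; in the design described here the ranking runs inside Postgres, which is why no network hop to Hive Mind is needed at query time.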

Consequences

Positive:

  • Zero cold-start — the brain is populated on day one with production-quality content
  • MemberIntel controls its own embedding model; can swap Voyage for a better model later
  • Search latency is database-level (pgvector cosine similarity), not API-level
  • Nightly sync catches new Hive Mind content

Negative / costs:

  • Initial seed is a batch job that takes minutes to hours depending on content volume
  • Nightly sync adds operational complexity (needs a cron/scheduler)
  • Embedding API costs (Voyage) for the initial seed and nightly re-embeds
  • Content can lag Hive Mind by up to 24 hours between syncs
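
The embedding cost can be estimated with simple arithmetic. The sketch below uses the $0.02 per 1M tokens price and the item counts quoted in this ADR; the average of 1,000 tokens per item is a hypothetical figure, since the ADR does not state token counts.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.02  # voyage-3-lite price quoted in this ADR
ITEM_COUNT = 45_000 + 70_000 + 13_000  # docs + code chunks + graph entities (lower bounds)
AVG_TOKENS_PER_ITEM = 1_000  # hypothetical average; not stated in the ADR


def embedding_cost_usd(total_tokens, price_per_million=PRICE_PER_MILLION_TOKENS_USD):
    """Cost in USD of embedding total_tokens at a per-million-token price."""
    return total_tokens / 1_000_000 * price_per_million


def token_budget(budget_usd, price_per_million=PRICE_PER_MILLION_TOKENS_USD):
    """Number of tokens that budget_usd buys at a per-million-token price."""
    return budget_usd / price_per_million * 1_000_000


estimated = embedding_cost_usd(ITEM_COUNT * AVG_TOKENS_PER_ITEM)
break_even_avg = token_budget(5.0) / ITEM_COUNT

print(f"Estimated full re-embed: ${estimated:.2f}")
print(f"Average tokens per item that keeps cost under $5: {break_even_avg:.0f}")
```

Under that assumed average, 128K items come to 128M tokens, or about $2.56. A $5 ceiling corresponds to 250M tokens, so it holds as long as items average below roughly 1,950 tokens each at the quoted lower-bound counts.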

Mitigations:

  • The seed script (scripts/seed_brain.py) is idempotent — safe to re-run
  • Nightly sync uses incremental diffing (only embed new/changed content)
  • Voyage voyage-3-lite is cheap ($0.02/1M tokens); full re-embed is under $5
  • The tenant_id field on brain_entries leaves room for per-customer isolation in V2
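
One common way to implement the incremental diffing mentioned above is to store a content hash per entry and re-embed only when the hash changes. The sketch below assumes that approach; content_hash, plan_sync, and the handling of deleted items are hypothetical, and the ADR does not say how the real sync job detects changes.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of an entry's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(current: dict[str, str], stored_hashes: dict[str, str]):
    """Compare tonight's Hive Mind content against hashes saved by the last sync.

    current:       source_id -> text as extracted now
    stored_hashes: source_id -> content hash recorded at the last sync
    Returns (to_embed, to_delete) as sorted lists of source_ids.
    """
    # New items have no stored hash; changed items have a different one.
    to_embed = sorted(
        sid for sid, text in current.items()
        if stored_hashes.get(sid) != content_hash(text)
    )
    # Assumption: entries that vanished upstream should be removed locally.
    to_delete = sorted(sid for sid in stored_hashes if sid not in current)
    return to_embed, to_delete
```

Unchanged entries never reach the embedding API, so the nightly cost scales with the size of the change set instead of the full corpus. Running the plan twice against the same state yields an empty second plan, which matches the idempotence goal stated for the seed script.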

Alternatives considered

  • Build from scratch — rejected: months of curation, no day-one content, duplicates Hive Mind’s work
  • Query Hive Mind at chat-time — rejected: adds 200-500ms latency per query, Hive Mind becomes a single point of failure, no offline capability
  • Copy Hive Mind vectors directly — rejected: different embedding model dimensions, different index structure; re-embedding is cleaner than vector translation

For: Seth Shoultes, AI Engineer, Blair Williams, Santiago Perez Asis, Product Lead