Privacy Strategy for Counsel — Memo to Allen

Privacy strategy memo for Allen (Blair's privacy counsel) ahead of the late-May architecture review. Identifies the structural problem (the architecture treats member data as operator data throughout) and proposes seven preconditions before ToS drafting — four code-layer architectural preconditions and three policy items. Drafted 2026-05-12 through a Brandeis-counsel reasoning pass, McPhee-prose rewrite, panel review (Lessig + Sunstein + Rawls), and a Lessig/Rawls debate on permanent exclusion.

Craft register only — not legal advice. This memo is a reasoning exercise in the voice of Louis D. Brandeis, rewritten for clarity in the prose register of a working explanation rather than a legal-philosophical argument. It is not a representation by counsel, does not constitute a legal opinion, and does not create an attorney-client relationship. For matters with real legal stakes, retain an attorney admitted in the relevant jurisdiction. Allen (Blair’s privacy counsel — surname TBC) is the counsel of record for this project.

A note on the prose: the substance below is Brandeis’s — the structural problem, the seven gaps, the distinction between what the architecture does for the operator and what it does not yet do for the member. The sentences are a writer’s pass in the interest of working clarity. This is v2, revised after a panel review (Lessig, Sunstein, Rawls) and a subsequent debate between Lessig and Rawls on the permanent-exclusion question. Source drafts at counsel/memos/, counsel/reviews/, and counsel/debates/.

Question

Consider the members behind an operator’s site. An oncologist builds a CME practice around MemberPress — physicians subscribe, pay, log in, access content. They have no contract with Caseproof. They do not know MemberIntel exists. When that operator connects, each member’s subscription activity, transaction history, content-access patterns, and behavioral signals flow into MemberIntel’s infrastructure under the operator’s consent, not theirs.

The question this memo addresses is whether the architecture, as currently designed, is durably defensible. Not merely compliant with the current text of GDPR or CCPA — those are the floor. Durably defensible against the foundational claim that a person’s information about themselves belongs to that person, and that every institution holding it stands in some obligation to the person, not only to whoever paid the institution.

This memo examines where the architecture holds, where it does not, and what must change.

Relevant facts

Two tiers, two models. Free tier: Anthropic Claude Haiku, with monthly snapshot data ingestion. Pro tier: Anthropic Claude Sonnet, with daily live sync. (Note: an earlier draft of this memo described Free as a locally-hosted Ollama-class model; that was revised 2026-05-20 — both tiers now run on Anthropic-hosted models. Local Ollama-class hosting is preserved as a future cost-management option, not the V1 launch choice.) Both tiers ingest per-member data; they differ in cadence and model quality, not in the category of data they touch.
Two brains, structurally isolated from each other. The per-customer brain is scoped to one tenant via Row Level Security. The global brain is shared across all customers. The one intentional crossing of this boundary is the cross-pollination pipeline.
Data sources include per-member records. The MemberPress MCP surfaces members, subscriptions, transactions, and content-access rules. Stripe adds payment history. V2 adds BuddyBoss community-engagement signals — post cadence, 7-day engagement rates, group health scores. These are behavioral signals with no precedent in V1 and no documented sensitivity classification.
Cross-pollination: the central pipeline under review. A scheduled job reads per-customer brain entries with positive feedback signals, drafts anonymized candidate global-brain entries via Claude, and queues them for a content-lead reviewer (Omar from support — superseding the 2026-05-11 plan to staff Sam from support; same internal-promotion path). A k-anonymity floor of N=5 customers is proposed. The three-roles model and staging table are specified in the architecture docs but not confirmed as implemented.
Consent at signup. A frictionless toggle in MemberPress admin is the primary onboarding path. The spec resolution is explicit: frictionless data ingestion is the top onboarding priority. Consent language is still owed to Allen. The members behind the operator’s site are not party to the agreement.
Deletion. Account deletion removes per-customer secrets, rows, and audit-log values. Cross-pollinated entries already in the global brain are not retroactively removed. This asymmetry is planned for ToS disclosure but not yet drafted.
Retention defaults. Brain entries carry version and superseded_at columns. Retention for superseded versions is undocumented; the default is indefinite.
No training on customer data. Both specs state this explicitly. Whether Anthropic’s API has been configured to opt out of training-data use, and whether the free-tier model has any call-home or telemetry behavior, is not yet documented.
Hosting. GCP: Cloud Run, Cloud SQL, Secret Manager. CMEK encryption is recommended in the architecture docs but described as a Terraform configuration to-do, not a confirmed deployed fact.

Applicable framework

The foundation is not the GDPR article or the CCPA section. Those set the floor. The foundation is the proposition Warren and Brandeis advanced in 1890: the right to be let alone is the right to control the conditions under which others learn things about you. The obligation runs to the person, not to whoever paid the institution.

Alongside that, the relevant concept is contextual integrity: information flows are appropriate when they match the norms of the context in which the information was originally shared. A physician who joins a urology CME platform shares subscription status in a medical-professional context. She does not share it in a context that includes a third-party SaaS advisor analyzing her patterns alongside a fitness community and a genealogy club. The contextual mismatch is the injury, whether or not any individual field is technically identifiable in isolation.

The third element is the non-party problem. The members are not parties to the contract between the operator and Caseproof. The platform knows about them; they do not know about the platform. The law of 2026 does not fully resolve this asymmetry. The architecture must anticipate it — because when the law catches up, the question will be whether the design anticipated the obligation or merely the current regulation.

Analysis

The structural problem

The architecture treats member data as operator data throughout.

The per-customer brain accumulates observations derived from member behavior — churn patterns, pricing sensitivity, engagement signals. The cross-pollination pipeline extracts patterns from that brain and promotes them into a global brain available to every customer. At no stage does a member have visibility into this chain, an opportunity to object, or a deletion right that runs to MemberIntel directly. They can ask the operator to disconnect. That is a chain of intermediaries, not a right.

Everything that follows — every gap identified in this memo, every precondition required before ToS drafting — is a corollary of this structural fact. The architecture is not missing consent language. It is missing a layer of protection that runs to the member, not to the operator. The consent language, when it arrives, will document an arrangement. The architecture must be what makes that arrangement defensible.

This is the standard GDPR Article 28 processor/controller arrangement. It is legally common. It is not the same thing as structurally defensible.

The member as invisible third party

MemberIntel’s direct customer is the operator. The data source is the operator’s members. The ToS binds the operator. The member data flows in under the operator’s consent, not the members’ own.

The system does not screen operators for the sensitivity of their membership context. An operator running a transgender-care provider’s member community, a domestic-violence shelter network, a 12-step recovery group, or an oncology CME practice is treated identically to an operator running a fitness class. The cross-pollination pipeline does not distinguish. The content-lead reviewer — Omar from support, with no documented privacy training — must distinguish, alone, for a customer base expected to reach 50,000.

That is not a privacy control. It is a liability shield that looks like one. A single-reviewer model with no documented rejection-rate SLA concentrates the judgment about the most vulnerable membership contexts in the person with the least institutional support for making that judgment correctly. This places the burden of protecting the least-advantaged members on the least-equipped actor in the chain.

The right to be let alone does not weaken because the violation was accidental and the reviewer was conscientious.

Re-identification and the k=5 floor

Sweeney’s empirical result, that three quasi-identifiers (ZIP code, birth date, and sex) are enough to identify most individuals uniquely, has not been overturned. For specialty niches, there may not be five comparable operators in the entire MemberIntel customer base. A pool of five urology CME platforms does not anonymize the fact that one of them raised prices and lost 15% of members in a month, if the other four are large programs and the affected one is the only solo-practice site of its size. The k=5 floor is a threshold beneath which a pattern is ineligible. It is not a guarantee that a pattern at k=5 is non-identifying.

The architecture document proposes the correct intervention: pre-process entries to extract only safe-to-generalize facts before Claude drafts candidates; never expose the raw customer brain text to the drafting prompt. Three things are missing: a documented taxonomy of what “safe to generalize” means; an automated screening step that flags sensitive-niche operators before they reach the candidate pool; and a context-sensitive k-floor that adjusts upward for sensitive categories. The N=5 proposal is also a proposal, not an enforced database constraint. These gaps are not editorial. They are the structural elements on which every privacy claim about cross-pollination rests.
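A minimal sketch of why a count of five does not by itself make a pattern non-identifying. It assumes two hypothetical quasi-identifiers (niche and size band) that the architecture docs do not name, groups contributing operators by those values, and reports the smallest group. The urology case fails because the solo-practice site sits in a group of one.

```python
from collections import Counter
from typing import NamedTuple


class Contributor(NamedTuple):
    """One operator contributing to a candidate pattern (hypothetical shape)."""
    customer_id: str
    niche: str       # assumed quasi-identifier
    size_band: str   # assumed quasi-identifier


def smallest_class(contributors: list[Contributor]) -> int:
    """Size of the smallest group sharing the same quasi-identifier values."""
    classes = Counter((c.niche, c.size_band) for c in contributors)
    return min(classes.values()) if classes else 0


def meets_floor(contributors: list[Contributor], k: int = 5) -> bool:
    """True only if every group holds at least k distinct operators."""
    distinct = {c.customer_id: c for c in contributors}  # de-duplicate by customer
    return smallest_class(list(distinct.values())) >= k


# Four large programs and one solo-practice site, as in the urology example.
urology = [Contributor(f"op{i}", "urology-cme", "large") for i in range(4)]
urology.append(Contributor("op4", "urology-cme", "solo"))

print(len(urology))             # 5: a naive count passes the floor
print(smallest_class(urology))  # 1: the solo site is alone in its group
print(meets_floor(urology))     # False
```

Which attributes count as quasi-identifiers is exactly what the missing “safe to generalize” taxonomy would have to settle. The two used here are placeholders.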

Sensitive-context inheritance and the opt-out problem

Cross-pollination contribution is opt-out rather than opt-in. The flywheel economics make this understandable. It is not an acceptable default for the operator running a domestic-violence shelter’s membership records.

That operator is a small-business owner who clicked a toggle in their WordPress admin to get a free advisor. They are not reading the cross-pollination clause in the ToS with a law firm’s attention. The frictionless-first design priority, as a product mandate, reflects a choice about whose interests the onboarding is designed to serve: operator conversion. That is a legitimate goal. It is not a neutral one. And once the toggle ships with placeholder language, the organizational default will be inertia — nobody pulls the consent review upstream after launch. The sequencing is not a timing error. It is a structural lock-in.

For the highest-sensitivity membership contexts — domestic-violence shelters, addiction recovery programs, transgender-care providers, oncology and other sensitive medical specialty practices — the appropriate design is not opt-in defaults. It is permanent exclusion from cross-pollination, implemented as a non-overridable database flag at site-profile classification. The classifier runs automatically at profile creation using the public-site-analysis pipeline already in the V1 spec. This is technically feasible now. Not doing it before launch is not a timing choice. It is a structural defect.

The per-customer brain, under this design, remains fully available for excluded contexts. The operator gets the full product value — chat, insight cards, advisor recommendations — because the per-customer brain is bounded, accountable, and operator-consented. What is excluded is only the promotion of patterns from that context into the global pool.
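The shape of that rule can be sketched in a few lines. This is an in-process illustration only: the memo calls for a database flag, and the type, function names, and category identifiers below are hypothetical. What it shows is the ordering of the checks: the exclusion is set once at classification, operator preference cannot override it, and access to the per-customer brain is unaffected.

```python
from dataclasses import dataclass, FrozenInstanceError

# Categories from the memo's minimum list; the identifier strings are illustrative.
EXCLUDED_CATEGORIES = frozenset({
    "domestic-violence-shelter",
    "addiction-recovery",
    "transgender-care",
    "sensitive-medical-specialty",
})


@dataclass(frozen=True)
class SiteProfile:
    """Hypothetical site profile; the exclusion flag is fixed at classification."""
    customer_id: str
    category: str
    cross_pollination_excluded: bool


def classify(customer_id: str, category: str) -> SiteProfile:
    """Set the flag once, at profile creation, from the category alone."""
    return SiteProfile(customer_id, category, category in EXCLUDED_CATEGORIES)


def may_promote(profile: SiteProfile, operator_opted_in: bool) -> bool:
    """Exclusion wins over any operator preference."""
    if profile.cross_pollination_excluded:
        return False
    return operator_opted_in


def may_use_customer_brain(profile: SiteProfile) -> bool:
    """The per-customer brain stays available regardless of exclusion."""
    return True


shelter = classify("op-1", "domestic-violence-shelter")
print(may_promote(shelter, operator_opted_in=True))  # False
print(may_use_customer_brain(shelter))               # True

try:
    shelter.cross_pollination_excluded = False  # attempt to override the flag
except FrozenInstanceError:
    print("flag cannot be changed after classification")
```

A frozen dataclass only prevents reassignment inside one process. The durable version of this guarantee would have to live in the database schema and its permissions.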

The consent language is owed to Allen and has not been drafted. The toggle must not go live before Allen has reviewed and approved both the language and the structure of the consent flow. Allen-approved language buried below a fold is theater with a sign-off. The architecture must make the frictionless path also the informed path — those are different requirements.

At minimum, the consent flow must require the operator to represent that their own privacy policy discloses the use of AI advisors who process member data. MemberIntel cannot make that representation true on the operator’s behalf. It can require the representation.

On deletion: the retroactive-non-pull asymmetry for cross-pollinated entries is defensible only if the k-anonymity floor was actually enforced at creation time. The ToS language must be exact. “Cannot be attributed” must be backed by documented technical constraints, not aspirational design. On retention: superseded versioned brain rows have no operational function. “Forever” is not a defensible default.

The no-training commitment and agent-action data

The no-training commitment for the Pro tier requires confirmation that Anthropic’s API is configured to opt out of training-data use, documented in writing. This is an architecture gap, not a documentation gap. The free-tier model selection remains open; whichever model is selected must run in a network-isolated Cloud Run environment with no egress to non-approved endpoints. If it runs on Google Cloud infrastructure, it introduces a subprocessor whose data-use terms must be reviewed. Documentation of a commitment is not a substitute for a network-level constraint.

Once V1.5 ships, per-customer brains will accumulate records of what the agent did and how it turned out — “we raised price on Tier 2 with 24 active members, here is the outcome.” This is the most specific and most identifying data the system will ever hold. Its cross-pollination eligibility must be decided in writing before V1.5 ships. The default should be exclusion.

Conclusion

The architecture is not yet durably defensible. It is technically sophisticated — the three-roles cross-pollination boundary, the CMEK encryption plan, the per-license signing keys, the RLS isolation, the audit trail, the k-anonymity floor as a floor — and these are genuine protections that should be presented to Allen as such.

But they protect the operator from the platform. The missing layer is the protection of the member from everyone.

Seven things must change before the architecture is ready for ToS drafting. Four are architectural preconditions — they require pull requests, not legal drafting, and the policy work cannot begin until they exist. Three are policy items that require Allen.

Architectural preconditions (code-layer — must land before ToS drafting)

1. Permanent exclusion of high-sensitivity contexts from cross-pollination. [code-layer]

A non-overridable database flag at site-profile classification. Categories enumerated explicitly in the policy doc — at minimum: domestic-violence shelter networks, addiction recovery programs, transgender-care providers, oncology and other sensitive medical specialty practices. The classifier runs automatically at profile creation using the public-site-analysis pipeline already in the V1 spec. The per-customer brain remains fully available for excluded operators — what is excluded is only the promotion of patterns into the global brain, regardless of anonymization quality. Both Lessig and Rawls, working from different frameworks, converge on this artifact. The framing difference is addressed in the section that follows.

2. Sensitive-context classifier with calibrated defaults for remaining categories. [code-layer]

Medical-adjacent categories default to opt-in-required cross-pollination with k≥10; ordinary categories (fitness, hobbyist, business education) get the standard k=5 floor with opt-out. The classifier sets this at profile creation, automatically, before any human review.
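As a sketch, the calibrated defaults amount to a small lookup applied at profile creation. The category identifiers in `MEDICAL_ADJACENT` are hypothetical examples, not a list from the policy doc; the thresholds (10 with opt-in, 5 with opt-out) are the ones stated above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CrossPollinationDefault:
    opt_in_required: bool  # True: contributes only after an explicit yes
    k_floor: int           # minimum number of distinct contributing customers


# Hypothetical examples; the real list belongs in the named policy document.
MEDICAL_ADJACENT = frozenset({"physical-therapy", "nutrition-coaching"})


def default_for(category: str) -> CrossPollinationDefault:
    """Defaults assigned automatically, before any human review."""
    if category in MEDICAL_ADJACENT:
        return CrossPollinationDefault(opt_in_required=True, k_floor=10)
    return CrossPollinationDefault(opt_in_required=False, k_floor=5)


def contributes(default: CrossPollinationDefault, operator_choice: Optional[bool]) -> bool:
    """operator_choice: True = opted in, False = opted out, None = never chose."""
    if default.opt_in_required:
        return operator_choice is True
    return operator_choice is not False


print(contributes(default_for("physical-therapy"), None))  # False: silence is not consent
print(contributes(default_for("fitness"), None))           # True: opt-out default
```

Anything not listed is treated as ordinary here. Whether an unclassifiable site should instead fall back to the stricter tier is a decision the memo does not make.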

3. k-anonymity enforced as a database constraint. [code-layer]

The cross-pollination pipeline must refuse to promote any pattern whose count of distinct source customers falls below the context-sensitive threshold; a raw row count is not enough, because several rows can come from the same customer. Enforcement at the query layer, not at the reviewer’s discretion. A documented taxonomy of what “safe to generalize” means, maintained as a versioned policy doc, must accompany the constraint.
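One way to make the refusal a property of the database rather than of the reviewer is a trigger that rejects the promotion itself. The sketch below uses Python's built-in `sqlite3` as a stand-in; the production database engine, the table names, and the column names are all assumptions. It counts distinct contributing customers, on the reading that the proposed N=5 floor is stated in customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE candidate (
    id       INTEGER PRIMARY KEY,
    k_floor  INTEGER NOT NULL,           -- context-sensitive threshold
    promoted INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE candidate_source (
    candidate_id INTEGER NOT NULL REFERENCES candidate(id),
    customer_id  TEXT NOT NULL
);
-- Abort any promotion backed by too few distinct customers.
CREATE TRIGGER enforce_k_floor
BEFORE UPDATE OF promoted ON candidate
WHEN NEW.promoted = 1
BEGIN
    SELECT RAISE(ABORT, 'k-anonymity floor not met')
    WHERE (SELECT COUNT(DISTINCT customer_id)
           FROM candidate_source
           WHERE candidate_id = NEW.id) < NEW.k_floor;
END;
""")


def promote(db: sqlite3.Connection, candidate_id: int) -> bool:
    """Try to promote a candidate; the trigger decides, not the caller."""
    try:
        with db:  # commits on success, rolls back on error
            db.execute("UPDATE candidate SET promoted = 1 WHERE id = ?", (candidate_id,))
        return True
    except sqlite3.DatabaseError:
        return False


conn.execute("INSERT INTO candidate (id, k_floor) VALUES (1, 5), (2, 5)")
conn.executemany("INSERT INTO candidate_source VALUES (1, ?)",
                 [(f"op{i}",) for i in range(5)])
# Five rows but only two distinct customers: a raw row count would pass.
conn.executemany("INSERT INTO candidate_source VALUES (2, ?)",
                 [("opA",)] * 4 + [("opB",)])
conn.commit()

print(promote(conn, 1))  # True
print(promote(conn, 2))  # False
```

This guards only the update path. A complete version would also have to prevent a row from being inserted with `promoted = 1` and prevent source rows from being deleted after promotion.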

4. No-training chain of custody, documented end-to-end. [code-layer]

For Pro tier: Anthropic API confirmed to opt out of training-data use, with the configuration captured in writing. For free tier: model selection resolved, deployment topology confirmed to be network-isolated Cloud Run with no egress to non-approved endpoints. Documentation of a commitment is not a substitute for a network-level constraint.

Policy items (policy-layer — require legal drafting)

5. Consent language drafted by Allen before the toggle goes live. [policy-layer]

The toggle must not ship before Allen has approved both the language and the structure of the consent flow. The frictionless-first mandate reflects a choice about whose interests onboarding is designed to serve — operator conversion — and this memo names it as a choice, not a sequencing error. Once the toggle ships with placeholder language, inertia does the rest. The consent review must happen first, or it will not happen.

6. Finite retention schedule for superseded brain-entry versions. [policy-layer]

90 days as the defensible default; active versions persist through the life of the account. “Forever” is not a defensible default for rows with no operational function after replacement.
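A minimal sketch of the retention rule, assuming a scheduled purge job selects rows with a predicate like this one. The `version` and `superseded_at` fields mirror the columns named in the facts above; the `BrainEntry` type and `is_purgeable` function are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional

RETENTION = timedelta(days=90)  # proposed default for superseded versions


class BrainEntry(NamedTuple):
    entry_id: int
    version: int
    superseded_at: Optional[datetime]  # None while the version is active


def is_purgeable(entry: BrainEntry, now: datetime) -> bool:
    """Active versions are never purged; superseded ones expire after RETENTION."""
    if entry.superseded_at is None:
        return False
    return now - entry.superseded_at >= RETENTION


now = datetime(2026, 9, 1, tzinfo=timezone.utc)
entries = [
    BrainEntry(1, 3, None),                                        # active
    BrainEntry(2, 1, datetime(2026, 5, 1, tzinfo=timezone.utc)),   # 123 days old
    BrainEntry(3, 2, datetime(2026, 8, 1, tzinfo=timezone.utc)),   # 31 days old
]
print([e.entry_id for e in entries if is_purgeable(e, now)])  # [2]
```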

7. Written cross-pollination policy for V1.5 agent-action data. [policy-layer]

Decided before V1.5 build kickoff, in writing. Default should be exclusion: agent-action data is the most identifying data the system will ever hold.

The architecture can be made defensible. It is not defensible today.

For Allen’s consideration when new sensitive categories arise

The panel and the debate converged on the same architectural outcome — permanent exclusion of high-sensitivity contexts, implemented as a non-overridable database flag — but the underlying frameworks differ, and the difference matters for how the exclusion list gets updated over time.

Lessig frames the exclusion as solving the re-identification and contextual-integrity problem through code. A context earns exclusion when the contextual norms under which the original disclosure occurred cannot be preserved through any code constraint, regardless of anonymization quality. The question he would ask about a new category: can the flow be made to match the original norms? If no, exclude.

Rawls frames the exclusion as solving the consent problem at the source. The wrong in cross-pollination, on his account, is inference-for-export — the migration of signal from a context in which it was generated to a context in which it was never offered — and no anonymization quality resolves a consent problem that exists prior to extraction. The question he would ask about a new category: did the source community consent to inference-for-export? If the answer is structurally no — because members of the community don’t know they’re in it, and the operator’s toggle click cannot speak for them — exclude.

For the categories now on the table, these framings produce the same answer. They may diverge for future categories: neurodivergent-support communities, sex-worker advocacy networks, or others where the contextual-norm question is more settled than the consent question, or vice versa. The exclusion list should be maintained as a named policy document, reviewable by Allen, auditable post-launch, and updated through a documented process — not implicit in the classifier’s training data where no one can inspect what it contains. When a new category is proposed for addition, note which framework supports the exclusion and whether they agree. If they disagree, bring it to Allen before the classifier is updated.

Caveats and limits

This is craft-register reasoning in the voice of Louis D. Brandeis, not legal advice. It does not constitute a legal opinion, does not create an attorney-client relationship, and does not substitute for review by Allen or another licensed attorney in the relevant jurisdiction.
This memo assumes the architecture documents are complete and accurate. Gaps between documented design and implemented code — particularly the k-anonymity enforcement, the brain-entry pre-processing step, and CMEK configuration — would change the analysis. The architecture review should include a demonstration of running code, not only design documents.
GDPR Article 28 controller/processor classification is not resolved here. The cross-pollination pipeline — where MemberIntel extracts patterns from member data for its own global brain, not for the operator’s benefit — has a strong argument for independent-controller classification. Allen must resolve this before ToS drafting can be accurate. This question deserves a separate instrument to Allen, not only a caveat.
The subprocessor chain is not fully documented. The Anthropic API, GCP services, and BuddyBoss REST API are subprocessors under GDPR Article 28. Their data-use terms and DPA availability must be reviewed before the Privacy Policy can be finalized.
The DPIA question is not addressed here. The cross-pollination pipeline and BuddyBoss community-data ingestion may trigger a Data Protection Impact Assessment requirement under GDPR Article 35. The architecture review agenda appropriately flags this; Allen should assess whether a DPIA is required.
State-law variation is not addressed. Washington’s My Health My Data Act, Illinois BIPA, and other state-level frameworks may be relevant depending on the operator customer base and BuddyBoss data categories. Allen’s scope should include state-level exposure beyond California.
The k=5 floor is a minimum, not a sufficient protection for sensitive-context patterns. Allen and Seth should jointly determine context-sensitive thresholds and document them in ADR 0011 before the architecture review.
If the free-tier model runs on cloud infrastructure rather than locally, the training-data leakage analysis changes and a new subprocessor must be documented and reviewed.
Member notification is not addressed. Whether members are owed direct notification of MemberIntel’s existence — independent of what the operator discloses — is a question about publicity and political legitimacy that this memo defers. It should be on the V2 product agenda, in writing, before V1 ships.
BuddyBoss community-engagement data in V2 requires the same sensitive-context classification and data-flow mapping as the member-level data from V1. V2 cannot treat these signals as an incremental data source without documented sensitivity review.

Source drafts in the repo

The KB page above is the published artifact. Working drafts that produced it (audit trail, not on the KB site):

counsel/memos/privacy-strategy-for-counsel.md — Brandeis-counsel reasoning pass, McPhee-prose rewrite
counsel/reviews/privacy-strategy-for-counsel.md — Lessig + Sunstein + Rawls panel review
counsel/debates/permanent-exclusion-vs-code-constraints.md — Lessig vs Rawls two-round debate

For: Seth Shoultes, Blair Williams, Santiago Perez Asis, AI Engineer, Omar ElHawary