AI Personalization · Privacy Architecture · GDPR · LoRA · Computational Identity · AI Ownership · Data Sovereignty

Who Owns Your Personalized Superintelligence? Privacy Architecture for the Next Frontier of AI

When AI learns your reasoning patterns, communication style, and expertise — who owns that personalization? It's an architectural question, not a philosophical one.

March 20, 2026 · 16 min · Maryna Vyshnyvetska


In February 2026, two of the most influential voices in AI independently identified the same architectural gap. At the India AI Impact Summit, Demis Hassabis described current AI systems as "jagged" and "frozen" — capable of winning gold at the International Mathematical Olympiad while failing elementary arithmetic when a question is rephrased. Three weeks earlier at Davos, Dario Amodei confirmed Anthropic is actively working on continuous learning capabilities.

Both are converging on the same destination: AI that learns continuously from interaction, adapts to individual users, and retains context across sessions.

The technical trajectory is clear. The question that remains unasked is deceptively simple: when AI becomes truly personalized — when it learns your reasoning patterns, communication style, decision-making history, and professional expertise — who owns that personalization?

This is not a philosophical question. It's an architectural one. The answer depends on where personalization lives in the system — in the model's weights, in a separate adapter, in a context injection layer, or in an encrypted enclave. Each choice carries different implications for data ownership, portability, and compliance with GDPR and HIPAA.


The Personalization Spectrum

Not all personalization is created equal. To reason clearly about privacy implications, we need to distinguish between different depths — each operating at a different layer of the AI system architecture.

Context-level personalization operates entirely within the input. The model itself is unchanged; personalization is achieved by constructing a richer prompt that includes user history, preferences, and relevant memories retrieved via RAG. This is the dominant approach today — used by ChatGPT's memory feature, Claude's memory, and custom implementations.

Attention-level personalization modifies how the model processes information. Techniques like LoRA inject small trainable matrices into the model's attention layers, altering its behavior without changing the base weights. The adapter is small (megabytes vs. gigabytes for the full model) and can theoretically be stored separately.
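The "megabytes vs. gigabytes" gap is easy to verify with back-of-envelope arithmetic. The sketch below assumes a hypothetical 8B-parameter model with illustrative dimensions (4,096 hidden size, 32 layers, rank-16 adapters on four attention projections per layer) — not any specific provider's architecture:

```python
# Back-of-envelope comparison of full-model vs. LoRA adapter size.
# All dimensions are illustrative, not a real provider's architecture.

def lora_params(d_model: int, rank: int, n_layers: int,
                matrices_per_layer: int = 4) -> int:
    """Trainable parameters when rank-r LoRA (A: d x r, B: r x d) is
    applied to `matrices_per_layer` projections in every layer."""
    per_matrix = 2 * d_model * rank
    return per_matrix * matrices_per_layer * n_layers

d_model, n_layers, rank = 4096, 32, 16
full_model_bytes = 8_000_000_000 * 2                       # 8B params at FP16
adapter_bytes = lora_params(d_model, rank, n_layers) * 2   # adapter at FP16

print(f"full model:   {full_model_bytes / 1e9:.1f} GB")    # 16.0 GB
print(f"LoRA adapter: {adapter_bytes / 1e6:.1f} MB")       # 33.6 MB
```

A roughly 500× size difference — which is what makes storing the adapter separately from the base model architecturally plausible.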

Weight-level personalization involves full fine-tuning on user-specific data. This produces the deepest personalization but creates the most complex ownership questions: the resulting weights are an inseparable blend of the provider's base model and the user's data.

The critical insight: personalization depth correlates directly with privacy risk. The deeper the personalization, the more tightly user data becomes entangled with the model — and the harder it becomes to extract, port, or delete.


Four Architectural Approaches

1. Encrypted LoRA in Trusted Execution Environment

A LoRA adapter is trained on user interaction data and stored on the provider's servers in encrypted form. The user holds the encryption key. At inference time, the adapter is decrypted inside a Trusted Execution Environment (TEE) — Intel SGX, AMD SEV, or AWS Nitro Enclaves — applied to the base model, and used for generation. The decrypted adapter never leaves the enclave.
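The trust boundary can be sketched in a few lines. The toy XOR "cipher" below is a stand-in for a real AEAD scheme (e.g. AES-GCM), and a real TEE would add remote attestation and sealed key provisioning; this illustrates only the data flow — user holds the key, provider stores ciphertext, plaintext exists only inside the enclave scope:

```python
# Sketch of the Architecture 1 trust boundary. The XOR "cipher" is a
# placeholder for authenticated encryption, not production cryptography.
import secrets

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # Placeholder for a real AEAD scheme such as AES-GCM.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# --- user side ---
user_key = secrets.token_bytes(32)            # never leaves user custody
adapter_weights = b"serialized LoRA tensors"  # trained on interaction data
ciphertext = xor_cipher(adapter_weights, user_key)

# --- provider side: stores ciphertext, cannot read it ---
stored = ciphertext

# --- enclave: key provisioned after attestation; plaintext
#     never leaves this scope ---
def enclave_inference(blob: bytes, key: bytes) -> str:
    adapter = xor_cipher(blob, key)           # decrypt inside the TEE only
    assert adapter == adapter_weights         # adapter ready for generation
    return "personalized output"

print(enclave_inference(stored, user_key))
```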

Privacy: Medium. The architecture depends entirely on trust in the hardware enclave. Intel SGX has been breached through side-channel attacks (Plundervolt, SGAxe, AEPIC Leak). The provider cannot access the LoRA in theory; in practice, a motivated actor with physical hardware access might.

Performance: Excellent. Everything runs on the provider's infrastructure. LoRA overhead is typically less than 5% of inference time.

GDPR: Complex. Even data momentarily decrypted inside a TEE constitutes "processing" under Article 4(2). A DPA and lawful basis are still required, even though the provider cannot read the data.

Feasibility: Near-term (1–2 years). All components exist today. The bottleneck is standardizing TEE-based inference pipelines for consumer-scale deployment.


2. Split Inference

The model is partitioned: lower layers run on the provider's servers, upper layers run on the client's device. The provider processes input through the base layers and returns intermediate hidden states. The client applies LoRA-modified upper layers to produce the final output.

Privacy: High. The LoRA adapter never leaves the client's machine. The provider has no knowledge of how hidden states are transformed by the personalization layer.

Performance: Challenging. Each token generation requires a network round-trip. For a model with hidden dimension 8,192, each hidden state is ~16KB at FP16. At 50 tokens/second: ~800KB/s downstream traffic. Manageable on broadband, problematic on mobile. The client also needs GPU resources to run the upper layers.
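The bandwidth figures above follow directly from the model dimensions — one FP16 hidden state must cross the network per generated token:

```python
# Reproducing the split-inference bandwidth estimate.
hidden_dim = 8192
bytes_per_value = 2                 # FP16
tokens_per_second = 50

state_bytes = hidden_dim * bytes_per_value     # bytes per token
throughput = state_bytes * tokens_per_second   # bytes per second

print(f"per-token hidden state: {state_bytes / 1024:.0f} KB")   # 16 KB
print(f"sustained downstream:   {throughput / 1024:.0f} KB/s")  # 800 KB/s
```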

GDPR: Cleanest separation of roles. The provider is a processor for base computation only. Data portability under Article 20 is inherent — the user already possesses their adapter.

Feasibility: Medium-term (2–4 years). Requires standardized split inference interfaces. Apple M-series and Qualcomm Snapdragon processors are moving in this direction.


3. Homomorphic Encryption

Fully homomorphic encryption (FHE) allows computation on encrypted data without decryption. The user encrypts their LoRA adapter, the server performs inference on the encrypted data, and returns an encrypted result only the user can decrypt.

Privacy: Maximum. Mathematically guaranteed. The provider cannot learn anything about the user's personalization or input, regardless of computational resources. This is the gold standard.

Performance: Currently impractical. FHE introduces 3–6 orders of magnitude overhead. A 50ms inference call might take minutes under FHE. Recent advances from Zama and Cornami are narrowing the gap, but real-time inference under FHE remains years away.

GDPR: Significantly simplified. The provider never processes personal data in cleartext. A DPA is technically still required — computing on encrypted personal data is still "processing" under Article 4(2) — but compliance becomes much easier given the minimized risk profile.

Feasibility: Long-term (5+ years).


4. Client-Side Context Architecture

Rather than modifying the model, this approach modifies the input. A client-side orchestration layer retrieves relevant memories, personality traits, domain context, and conversation history, and constructs a rich prompt that steers the unmodified base model toward personalized behavior.

In previous work (Beyond RAG), we described SYNAPSE — a production middleware implementing this architecture through proactive memory injection. The key distinction: SYNAPSE is not a RAG system. Where RAG gives the model a search tool and lets it decide when to retrieve, SYNAPSE intercepts every incoming message, extracts associative anchors, traverses a memory graph, and injects relevant memories directly into the context window. The model starts thinking with memories already present — it never searches for them.

Three design decisions make this privacy-relevant:

  • Middleware, not a tool. Memory injection is automatic and invisible to the model. The retrieval pipeline runs entirely on the client side.
  • Association, not similarity. Three strategies run in parallel: graph traversal (recursive CTE following semantic edges), vector search, and keyword matching. Graph traversal retrieves contextually connected memories that share no keywords or embedding similarity with the query.
  • Hybrid extraction with local inference. Anchor extraction uses regex patterns (~2ms) and a local 1.7B quantized model (~250ms). Message content stays on-device — no additional API call for extraction.
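The graph-traversal strategy can be illustrated with a minimal recursive CTE. The schema and data below are illustrative, not the SYNAPSE production schema: starting from an anchor memory, the query walks semantic edges outward and retrieves connected memories that share no keywords with the original message:

```python
# Minimal sketch of "association, not similarity": a bounded recursive CTE
# walks semantic edges from an anchor memory. Illustrative schema only.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE memories (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE edges (src INTEGER, dst INTEGER);
    INSERT INTO memories VALUES
        (1, 'user prefers concise answers'),
        (2, 'works in medical-device compliance'),
        (3, 'HIPAA audit due in Q3'),
        (4, 'enjoys hiking');                -- unconnected memory
    INSERT INTO edges VALUES (1, 2), (2, 3);
""")

rows = db.execute("""
    WITH RECURSIVE reachable(id, depth) AS (
        SELECT ?, 0                          -- anchor from the message
        UNION
        SELECT e.dst, r.depth + 1
        FROM edges e JOIN reachable r ON e.src = r.id
        WHERE r.depth < 3                    -- bounded traversal
    )
    SELECT m.id, m.text FROM memories m
    JOIN reachable r ON m.id = r.id ORDER BY m.id
""", (1,)).fetchall()

for row in rows:
    print(row)   # memories 1-3 retrieved; memory 4 never surfaces
```

Note that memory 3 ("HIPAA audit") is retrieved even though it shares no vocabulary with memory 1 — exactly the kind of result vector search alone would miss.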

Privacy: User-controlled. The memory store, retrieval logic, and injection pipeline run on client infrastructure. The provider sees only the enriched prompt. The user controls what context to inject — effectively a privacy dial from zero (no personalization) to rich context (deep personalization).

Performance: Good. Limited by context window size, but frontier models now support 128K–2M tokens. Latency overhead is minimal.

GDPR: The user acts as both data controller and processor for personalization data. Article 20 portability is inherent. Article 17 erasure is simple — delete the local memory store.

Limitation: Context injection changes what the model processes, not how it processes it. A LoRA-adapted model has internalized user patterns at the weight level; a context-injected model re-learns them from the prompt at every inference. The difference is studying a subject deeply versus reading notes before an exam. Both produce results; the depth differs.

Feasibility: Now. All components are production-ready: vector databases (PgVector, Pinecone), embedding models, orchestration frameworks.

This architecture addresses approximately 80% of the personalization problem with full user control. The remaining 20% is the space where Architectures 1–3 operate, each trading different degrees of privacy for deeper integration.


The Legal Question Nobody Is Asking

GDPR's Article 20 establishes the right to data portability: users can request personal data in structured, machine-readable format and transmit it elsewhere. This right was designed with databases and files in mind. It does not address what happens when personal data has been transformed into model weights through training.

Consider a concrete scenario. A user has interacted with a personalized AI for two years. Their communication patterns, domain expertise, and decision-making heuristics have been encoded into a LoRA adapter. They want to switch providers.

Under Architectures 1 or 2, the LoRA adapter is a discrete artifact that can be exported. But can it be ported? A LoRA trained on one base model is not compatible with another. The personalization is locked to the provider's architecture — technical lock-in that GDPR was not designed to address.

Under full fine-tuning, the situation is worse. The user's data is inseparable from the model weights. You cannot extract someone's personality from a neural network any more than you can extract an ingredient from a baked cake.

This creates a regulatory gap. We propose the concept of computational identity data — a new category encompassing the learned representations derived from a user's AI interactions: LoRA adapters, fine-tuned weight deltas, embedding-space representations of user behavior, and any computational artifact that encodes personal patterns.

Computational identity data has properties that distinguish it from traditional personal data:

Architectural dependency. Unlike a CSV export, computational identity data may be bound to a specific model architecture. Portability requires either standardized adapter formats or translation layers between architectures.

Emergent information. The LoRA may encode patterns the user never explicitly provided — implicit reasoning strategies, unconscious communication preferences, latent domain connections. The user may not know what their computational identity contains.

Dual provenance. Computational identity data is derived from both user data and the provider's base model. Neither party created it alone. Existing IP and data protection frameworks struggle with jointly-produced artifacts.

The EU AI Act adds another dimension. Article 15(4) requires that high-risk AI systems continuing to learn after deployment must mitigate risks of biased outputs creating feedback loops. A personalized model that has internalized a user's reasoning patterns — including their blind spots — raises an unresolved question: does the provider bear responsibility for accuracy degradation caused by user-specific adaptation?

Regulators should consider extending Article 20 to explicitly cover computational identity data, establishing standardized adapter portability formats (analogous to Open Banking for financial data), and clarifying the ownership status of jointly-produced computational artifacts.


The Race We Should Be Having

The current AI race is defined by a single metric: capability. The entire industry is sprinting toward more powerful, more general models.

There should be a parallel race: who builds the first truly private personalized AI?

Because a superintelligence that knows you intimately but belongs to a corporation is not your superintelligence. It is a corporate asset with your face.

The architectural options exist. TEE-based encrypted adapters provide near-term solutions. Split inference offers a clean separation of concerns. Homomorphic encryption promises a mathematically perfect answer. Client-side context architectures work today.

What is missing is not technology but intention. No major AI provider currently offers users genuine ownership of their personalization data. Memory features are proprietary. Fine-tuning produces locked-in adapters. The default architecture of personalized AI accumulates an increasingly detailed computational identity for each user — with no mechanism to export, audit, or truly delete it.

The organizations that solve this first — that build personalized AI with user sovereignty as an architectural principle, not a compliance afterthought — will define the next era of the industry. Not because privacy is fashionable, but because the alternative — billions of people's cognitive patterns locked inside corporate models — is a concentration of power that makes current data monopolies look quaint.

The tools exist. The frameworks exist. The regulatory foundations, imperfect as they are, exist. What remains is the will to build correctly.


This article was originally prepared for Towards AI. [Kenaz GmbH](/contact) specializes in [privacy-first AI architecture](/services/gdpr-hipaa-compliance), including client-side context systems and GDPR/HIPAA-compliant deployments. If the ownership of your AI personalization data matters to you, [let's talk](/contact).
