Zero-Knowledge Boundary

What the server can and cannot observe — the structural guarantees that prevent server access to sensitive data.

The platform is built so the server has no access to sensitive document content, encryption keys, or the tokenized relationships between users, documents, and grants. This is enforced by the architecture, not by policy.

The Zero-Knowledge Promise

When we say the platform is zero-knowledge, we mean that the server cannot access the plaintext content of your documents, your search queries, your sensitive metadata, or the tokenized associations between you and the people you share with. The server stores ciphertext, wrapped keys, blind tokens, and operational metadata, but it cannot interpret the sensitive parts of what it handles.

This is a structural guarantee, not a configuration option:

Your encryption keys are created on your device and never leave it in unencrypted form
Sensitive plaintext processing — decryption, OCR, verification, search index generation — happens inside a hardware-isolated secure enclave that server operators cannot access
Sensitive stored relationships use blind tokens — one-way HMAC tokens replace plaintext user, document, and entity identifiers in the tables where those associations would be most sensitive

What The Server Can See

The server is not completely blind. It needs certain information to function:

A login bucket index — a 13-bit integer (0–8191) derived from your email via an Oblivious Pseudorandom Function (OPRF). The server uses this index to retrieve login candidates but cannot reverse it to learn your email. No plaintext email is stored — your email address is stored only as client-side encrypted ciphertext that the server cannot decrypt
When actions happen — timestamps for uploads, shares, and searches are recorded for operational purposes
Coarse document size — a rough size category (not the exact byte count) is stored for storage management
Processing status — whether a document is uploading, processing, ready, unsupported, or failed
Resource identifiers — some route paths and tables retain plaintext identifiers where tokenization has not yet been applied (see below)
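
To illustrate what a coarse size category can look like, here is a minimal sketch that assumes power-of-two size classes. The function name and the class boundaries are hypothetical; the platform's actual categories are not specified here.

```python
def size_category(num_bytes: int) -> int:
    """Return the smallest exponent e such that num_bytes <= 2**e.

    Hypothetical scheme: only this exponent would be stored, so documents of
    900 and 1000 bytes become indistinguishable by size.
    """
    if num_bytes < 0:
        raise ValueError("size cannot be negative")
    if num_bytes <= 1:
        return 0
    return (num_bytes - 1).bit_length()


small = size_category(900)
also_small = size_category(1000)
larger = size_category(1025)
```

Under this assumption the stored value reveals the order of magnitude of a document, which is enough for storage management, but not its exact byte count.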

What The Server Cannot See

The following are never available to the server in readable form:

The content of any document
Document titles, filenames, or descriptive metadata
What you are searching for
The encryption keys used to protect your documents
The contents of audit log entries — audit records are encrypted inside the enclave and stored as opaque blobs

Sensitive stored relationships use one-way HMAC blind tokens instead of plaintext identifiers in the following areas:

Document ownership — owner_token replaces owner_id in reservations and document rows
Grant-document association — doc_token replaces document_id in grant reservations
Entity member document access — entity_token, doc_token, and user_member_token replace UUID columns in entity document key rows
Search and membership lookups — blind index tokens replace plaintext user and entity identifiers in blind index tables

Some tables retain plaintext identifiers where the sensitivity is lower or where admin operations require direct lookup (access_grants.document_id, entity_memberships.user_id/entity_id). The tokenization work is ongoing; the strongest guarantees apply to the paths enumerated above.

The database is not uniformly free of plaintext identifiers. A database dump would expose operational metadata, statuses, timestamps, and the retained plaintext columns noted above. The zero-knowledge guarantees are strongest for sensitive content, key material, and the tokenized relationship tables.

How The Boundary Is Enforced

Client-Side Encryption

Before a document ever leaves your device, it is encrypted with a key that only you possess. Each document receives its own unique encryption key, generated fresh at upload time. The server receives and stores only the encrypted result.

Even if someone gained complete access to the server's storage, they would see only ciphertext — without the corresponding keys, the data is unreadable.

Hardware-Isolated Secure Enclave

Some operations — like generating search indexes or processing documents for verification — require temporary access to decrypted data. These operations happen inside a secure enclave: a hardware-isolated environment with no direct network access and no way for server operators to inspect what happens inside it.

Before the enclave can access any encryption keys, it must pass attestation — a cryptographic proof that it is running the exact, unmodified software that was audited and approved, inside a genuine hardware-isolated environment.

A tampered, replaced, or impersonated enclave cannot pass attestation and therefore cannot access any keys. This is a cryptographic guarantee enforced by the hardware itself, not a software policy check.

Blind Tokens

Throughout the system, sensitive identifiers are replaced with one-way cryptographic tokens. The server can compare tokens for equality — "does this token match that token?" — but it cannot recover the original identifier from the token.

This applies to document references, ownership records, search indexes, sharing relationships, and membership lookups in the areas where tokenization has been applied.
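
The equality-only property can be sketched with a keyed hash. This is a minimal illustration assuming HMAC-SHA256 and a single token key; the key name, the domain labels, and the identifier format are hypothetical, and the platform's actual key management is not described here.

```python
import hashlib
import hmac
import secrets

# Hypothetical key: in this sketch it is random bytes that never reach the database.
TOKEN_KEY = secrets.token_bytes(32)


def blind_token(key: bytes, domain: str, identifier: str) -> str:
    """Derive a one-way token. `domain` keeps owner and document tokens distinct."""
    message = domain.encode() + b"\x00" + identifier.encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def tokens_match(a: str, b: str) -> bool:
    """Constant-time equality check, the only operation a token holder can perform."""
    return hmac.compare_digest(a, b)


owner_token = blind_token(TOKEN_KEY, "owner", "user-1234")
same_owner = blind_token(TOKEN_KEY, "owner", "user-1234")
other_owner = blind_token(TOKEN_KEY, "owner", "user-5678")
```

A party that holds only the tokens can tell that `owner_token` and `same_owner` refer to the same owner, but without the key it can neither invert a token nor compute the token for a guessed identifier.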

Oblivious Authentication

Even the login process is zero-knowledge. The platform uses an Oblivious Pseudorandom Function (OPRF) so the server can evaluate cryptographic operations on your credentials without ever seeing them in plaintext.

During login, your device maps your email to a point in a cryptographic group, blinds that point with a fresh random value, and sends the blinded point to the server. The server multiplies it by its secret key and returns the result without learning the input. Your device then removes the blinding factor to derive a login bucket index. The server uses this index to retrieve candidate records, but it never sees your actual email during the authentication flow.
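
The blind, evaluate, unblind round trip can be sketched as follows. This is a toy model under loud assumptions: it uses a tiny multiplicative group modulo a safe prime (so the server's "multiplication" appears as exponentiation), it is not secure, and the hash-to-group step and the way 13 bits are taken from the output are illustrative choices rather than the platform's actual construction.

```python
import hashlib
import secrets

# Toy parameters, NOT secure: a tiny safe prime P = 2*Q + 1 keeps the arithmetic readable.
P = 2039
Q = 1019


def hash_to_group(email: str) -> int:
    """Map an email to an element of the order-Q subgroup (a nonzero square mod P)."""
    digest = hashlib.sha256(email.encode()).digest()
    base = int.from_bytes(digest, "big") % (P - 1) + 1  # value in 1..P-1
    return pow(base, 2, P)


def client_blind(email: str) -> tuple[int, int]:
    """Return (blinding factor r, blinded element). Only the blinded element is sent."""
    r = secrets.randbelow(Q - 1) + 1  # value in 1..Q-1
    return r, pow(hash_to_group(email), r, P)


def server_evaluate(blinded: int, server_key: int) -> int:
    """The server applies its secret key without seeing the underlying email."""
    return pow(blinded, server_key, P)


def client_unblind(evaluated: int, r: int) -> int:
    """Remove the blinding factor, leaving hash_to_group(email) ** server_key."""
    return pow(evaluated, pow(r, -1, Q), P)


def bucket_index(oprf_output: int) -> int:
    """Hash the OPRF output and keep the top 13 bits (0..8191)."""
    digest = hashlib.sha256(oprf_output.to_bytes(2, "big")).digest()
    return int.from_bytes(digest[:2], "big") >> 3


server_key = secrets.randbelow(Q - 1) + 1
r, blinded = client_blind("alice@example.com")
evaluated = server_evaluate(blinded, server_key)
bucket = bucket_index(client_unblind(evaluated, r))
```

Because a new blinding factor is drawn for every login, the value the server receives changes from attempt to attempt, while the unblinded output and therefore the bucket index stay the same for a given email and server key.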

Tamper-Evident Audit Trail

Authentication and administrative events are recorded in an encrypted audit log. Each entry is encrypted inside the secure enclave before it leaves, and entries are chained together cryptographically so that tampering with or deleting a single record would break the chain.

The server stores these encrypted entries but cannot read them. Only the enclave, after passing attestation, can decrypt an entry for authorized review.
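
The chaining idea can be sketched with a plain hash chain over opaque ciphertext blobs. This is a minimal illustration assuming SHA-256 links; the record layout and function names are hypothetical, and the encryption performed inside the enclave is out of scope for the sketch.

```python
import hashlib
from dataclasses import dataclass

GENESIS = b"\x00" * 32  # hypothetical starting value for an empty chain


@dataclass(frozen=True)
class AuditRecord:
    ciphertext: bytes  # opaque blob; the server cannot read it
    prev_hash: bytes   # entry_hash of the previous record
    entry_hash: bytes  # SHA-256 over prev_hash and ciphertext


def _link(prev_hash: bytes, ciphertext: bytes) -> bytes:
    return hashlib.sha256(prev_hash + ciphertext).digest()


def append(chain: list[AuditRecord], ciphertext: bytes) -> None:
    """Add a record whose hash commits to everything before it."""
    prev = chain[-1].entry_hash if chain else GENESIS
    chain.append(AuditRecord(ciphertext, prev, _link(prev, ciphertext)))


def verify(chain: list[AuditRecord]) -> bool:
    """Recompute every link; an edited or missing record breaks the chain."""
    prev = GENESIS
    for record in chain:
        if record.prev_hash != prev or record.entry_hash != _link(prev, record.ciphertext):
            return False
        prev = record.entry_hash
    return True


chain: list[AuditRecord] = []
for blob in (b"entry-1", b"entry-2", b"entry-3"):
    append(chain, blob)
```

In this sketch, removing records from the tail is only detectable if the latest `entry_hash` is also kept somewhere the storage layer cannot rewrite, and an unkeyed hash does not stop someone who can rewrite every later record. A production design would likely need a keyed or signed link for that reason.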

The Boundary: At Rest vs. In Transit

The platform enforces zero-knowledge at rest for sensitive content and key material — document content, metadata, wrapped keys, and the tokenized relationship tables contain no reversible plaintext.

API request routing does use some plaintext identifiers in URL paths for technical routing purposes. This is an accepted, well-scoped trade-off:

At rest — sensitive content, key material, and tokenized relationships are zero-knowledge. The database also stores operational metadata and some retained plaintext identifiers as documented above
In transit — request and response bodies use blind tokens for authorization and document lookups. Some route paths include resource identifiers for request routing, but sensitive authorization decisions are token-based

The key insight: the threat model centers on persistent sensitive associations. Tokenized ownership, grant, membership, and document-key relationships cannot be reconstructed from a database dump alone, because without the HMAC keys the tokens can be neither reversed nor recomputed.

Post-Quantum Key Transport

Key transport for entity membership, document delivery, and grant access uses a hybrid classical + post-quantum scheme (ML-KEM-1024 + X25519), protecting long-lived key material against future quantum attacks.
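
One common way to build a hybrid scheme is to feed both shared secrets into a key derivation function, so the derived key stays safe as long as either component holds. The sketch below assumes an HKDF-SHA256 combiner over the concatenated secrets; the combiner, the context label, and the function names are assumptions, and random bytes stand in for the real ML-KEM-1024 and X25519 outputs (both are 32 bytes).

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def combine_shared_secrets(ss_pq: bytes, ss_classical: bytes, context: bytes) -> bytes:
    """Derive one wrapping key that depends on both shared secrets."""
    return hkdf_sha256(ss_pq + ss_classical, salt=b"\x00" * 32, info=context)


# Stand-ins for the real KEM and ECDH outputs, which are not computed here.
ss_pq = secrets.token_bytes(32)
ss_classical = secrets.token_bytes(32)
wrapping_key = combine_shared_secrets(ss_pq, ss_classical, b"example-key-transport")
```

An attacker would need to recover both inputs to reproduce `wrapping_key`, which is the property that makes the hybrid approach attractive for long-lived key material.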

The platform DEK wrap path (wrapped_dek_pmk) currently uses P-256 ECDH. Migration to hybrid KEM for that field is tracked as engineering work.

A hybrid classical + post-quantum approach is likewise used for the digital signatures that verify document seals and authorization tokens.