Encrypted Search


Search works across your encrypted documents without the server ever learning your search terms, your results, or the content of your files.

In most platforms, search requires the server to read your data. When you type a query, the server scans your documents in plaintext, finds matches, and returns results. This means the server sees both what you're looking for and what you have.

This platform takes a fundamentally different approach.

How Encrypted Search Works

Search uses blind indexing: searchable values and queries are transformed into one-way blind index tokens before reaching the server. The server compares tokens for matches but cannot reverse a token to see what it represents.

Think of it this way: Imagine you and a friend each have a secret stamp. You stamp your search query, and the server checks if any stored stamps match. The server can compare stamps but has no idea what picture the stamp makes — it just knows "these two impressions are identical."

How tokens are created

Blind index tokens are generated through two separate paths depending on scope:

Consumer indexes — generated on your device and submitted to the server via POST /v1/documents/consumer-indexes. Your device normalizes the text, derives tokens using your personal search key, and submits them. The server never sees the plaintext values.

Entity-scoped indexes — generated inside the secure enclave during document processing, when entity context is present. The enclave derives tokens using the organization's search key and returns only the opaque token values. Plaintext never leaves the enclave.

Both paths produce the same token format. The server stores them and can compare them for equality, but cannot reverse them to recover the original text.
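As a concrete sketch of what either path might compute, here is a keyed one-way derivation, assuming HMAC-SHA256 as the keyed function and NFKC case-folding as the normalization step (both are illustrative choices, not confirmed platform details):

```python
import hashlib
import hmac
import unicodedata

def normalize(text: str) -> str:
    # Collapse case and Unicode representation so equivalent
    # inputs map to the same token (assumed normalization).
    return unicodedata.normalize("NFKC", text).casefold().strip()

def blind_index_token(search_key: bytes, value: str) -> str:
    # Keyed one-way transform: without search_key, a token can
    # neither be generated nor reversed to recover the value.
    mac = hmac.new(search_key, normalize(value).encode("utf-8"), hashlib.sha256)
    return mac.hexdigest()
```

Under this sketch, the same function runs on your device (with your personal key) or inside the enclave (with the organization's key); the server only ever stores the opaque hex output.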

At search time

When you search for something:

  1. Your device transforms your search query into a token using the same search key and normalization process.
  2. The token is sent to the server.
  3. The server checks which stored tokens match your query token.
  4. Matching documents are returned — still encrypted.

The server sees only opaque tokens moving in both directions. It cannot determine what word or phrase generated a given token.
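The server's role in steps 3 and 4 reduces to equality comparison over opaque strings. A minimal sketch (the index shape and sample values are illustrative assumptions):

```python
def match_documents(index: list[tuple[str, str]], query_token: str) -> list[str]:
    # index holds (blind_index_token, blind_document_token) pairs.
    # The server can test equality between tokens, but learns
    # nothing about the plaintext behind either side of a match.
    return [doc_ref for token, doc_ref in index if token == query_token]
```

For example, querying `match_documents([("aa11", "ref-1"), ("bb22", "ref-2")], "aa11")` returns only the encrypted-document references whose stored tokens equal the query token.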

Search Scopes

The platform supports two types of search:

Personal search — searches across your own documents using your personal search key (BIK). Only you can generate tokens that match your personal indexes, so no one else can search your documents.

Organization search — if you're a member of an organization, you can search across its shared document pool. Organization search tokens are derived from the organization's master secret inside the secure enclave, so any member holding the appropriate key can search the organization's documents.

A document is searchable within a scope once blind index tokens exist for that scope — processing status alone does not guarantee it. A document that has completed processing may not appear in personal search until consumer indexes have been submitted, or in organization search until entity-scoped tokens have been generated.
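The rule above can be phrased as a check on token existence rather than processing status. A hypothetical sketch (the field names are assumptions, not the platform's actual schema):

```python
def searchable_scopes(doc: dict) -> dict[str, bool]:
    # A document is searchable in a scope only once blind index
    # tokens exist for that scope; a "processed" status alone
    # does not make it searchable anywhere.
    return {
        "personal": bool(doc.get("consumer_tokens")),
        "organization": bool(doc.get("entity_tokens")),
    }
```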

Tokens are created for several categories of document information:

  • Document type — the kind of document (for example, a passport or a driver's license).
  • Document fields — specific pieces of information extracted from the document, such as a name or date of birth.
  • Dates — date-based values attached to the document.
  • Tags — labels you apply to organize your documents.
  • Text content — words and phrases extracted from the document itself.

Each category is tokenized independently. When you search, you can target a specific category or search across all of them.
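One common way to keep categories independent is domain separation: mixing the category label into the derivation so the same value tokenized under two categories yields unrelated tokens. A sketch assuming HMAC-SHA256 and simple case-fold normalization (illustrative, not the confirmed construction):

```python
import hashlib
import hmac

def category_token(search_key: bytes, category: str, value: str) -> str:
    # Prefixing the category acts as a domain-separation label:
    # "tag:urgent" and "text:urgent" produce unrelated tokens,
    # which is what allows a search to target one category.
    message = f"{category}:{value.casefold().strip()}".encode("utf-8")
    return hmac.new(search_key, message, hashlib.sha256).hexdigest()
```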

What The Server Learns (And Doesn't)

The server learns:

  • That a search was performed (it sees the request).
  • How many results matched (it performs the token comparison).
  • Which encrypted documents correspond to the matching tokens.
  • That two queries produced the same token — it can observe repeated searches for the same term within a scope, though it cannot identify what the term is.

The server does NOT learn:

  • What you searched for.
  • What the matching documents contain.
  • Why those particular documents matched.
  • The relationship between different search queries.
  • Who performed the search — your identity is represented by a blind member token derived from your personal key, not your user ID.

Token Properties

Blind index tokens have several important properties:

  • Deterministic — the same input always produces the same token (which is what makes matching work).
  • One-way — you cannot reverse a token to recover the input.
  • Key-bound — tokens generated with different search keys do not match, even for the same input. Your personal tokens and an organization's tokens are completely independent.
  • Normalized — text is processed through a consistent normalization step before token generation, so searches are case-insensitive and handle common text variations.

Anonymous Document References

The link between a search token and its corresponding document is protected. Rather than storing a plaintext document identifier in the search index, the platform stores a blind document token — a one-way transformation of the document's identity.

This means:

  • The search index contains no plaintext document identifiers.
  • The server cannot correlate search activity with specific documents except through the cryptographic matching process.
  • Even if the search index were fully exposed, an attacker could not determine which documents correspond to which search tokens.
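A blind document token can be sketched the same way as a blind index token: a keyed one-way transform, here over the document's internal identifier (HMAC-SHA256 and the key name are assumptions for illustration):

```python
import hashlib
import hmac

def blind_document_token(routing_key: bytes, document_id: str) -> str:
    # The search index stores this value in place of document_id,
    # so the index alone reveals nothing about which document a
    # given row refers to.
    return hmac.new(routing_key, document_id.encode("utf-8"), hashlib.sha256).hexdigest()
```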

For more on this principle, see Blind Routing.

Related Concepts

  • Zero-Knowledge Model — the foundation: what the server can and can't see
  • Key Hierarchy — the different types of keys that protect your data, including the search keys used for token generation
  • Document Lifecycle — how documents move from upload through processing to becoming searchable
  • Blind Routing — how sensitive relationships are tokenized to prevent relationship mapping
