Anonymizer Guide -- Protecting Personal Information

Last Updated: 2026-02-09


The Anonymizer transforms detected PII into safe, non-identifiable values. This guide explains each anonymization method, when to use it, and how to configure it.


Table of Contents

  1. Overview
  2. Replace
  3. Redact
  4. Hash
  5. Encrypt
  6. Mask
  7. Choosing the Right Method
  8. Per-Entity Configuration
  9. Download Options

Overview

After the Analyzer identifies PII in your text, the Anonymizer applies your chosen transformation to each detected entity. You can apply a single method globally or configure different methods for different entity types.


Replace

What it does: Substitutes PII with realistic fake data of the same type.

Examples

  Original            Replaced
  John Smith          Jane Doe
  john@example.com    sarah.jones@mail.net
  555-0123            555-9876

Characteristics

  • Produces natural-looking text that reads like the original.
  • Fake values are generated randomly and are not derived from the original.
  • Different values are generated each time, even for the same input.
  • Not reversible -- the original values cannot be recovered.
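The behavior described above can be sketched as follows. This is only an illustration: the pools of fake values and the function name replace_value are hypothetical, and the Anonymizer's actual generator is not documented here. The point is that the original value is never an input, so the replacement cannot be derived from it.

```python
import random

# Hypothetical pools of fake values, keyed by entity type. The real
# generator and its data are not documented; these are assumptions.
FAKE_VALUES = {
    "PERSON": ["Jane Doe", "Alex Rivera", "Sam Carter", "Priya Nair"],
    "EMAIL_ADDRESS": ["sarah.jones@mail.net", "m.lee@post.org", "kim@inbox.io"],
    "PHONE_NUMBER": ["555-9876", "555-0199", "555-0142"],
}


def replace_value(entity_type: str, rng: random.Random) -> str:
    """Return a random fake value of the given entity type.

    The original value is not a parameter, so nothing about it can
    leak into the replacement.
    """
    return rng.choice(FAKE_VALUES[entity_type])


rng = random.Random()  # unseeded, so output varies from run to run
print(replace_value("PERSON", rng))
print(replace_value("PERSON", rng))  # may differ from the previous call
```

Because no mapping from original to fake value is kept, there is nothing to reverse.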

Best For

  • Creating realistic sample data for testing or demonstrations.
  • Sharing documents where readability matters.
  • Training data preparation where the text structure must be preserved.

Redact

What it does: Removes PII entirely and replaces it with a placeholder.

Examples

  Original            Redacted
  John Smith          [PERSON]
  john@example.com    [EMAIL_ADDRESS]
  555-0123            [PHONE_NUMBER]

Characteristics

  • The placeholder indicates the entity type that was removed.
  • The original value is completely deleted from the output.
  • Not reversible -- the original values cannot be recovered.

Best For

  • Strict compliance scenarios where no trace of the original data should remain.
  • Legal documents where PII must be fully removed.
  • Situations where readability of the actual values is not important.

Hash

What it does: Converts PII to a fixed-length SHA-256 hash value.

Examples

  Original            Hashed (illustrative, truncated)
  John Smith          a1b2c3d4e5f6...
  john@example.com    f7a8b9c0d1e2...

Characteristics

  • Consistent: the same input always produces the same hash, so "John Smith" hashes to the same value every time.
  • Deterministic linking: because hashes are consistent, you can link records across documents without knowing the original value. If "John Smith" appears in two documents, the hash is identical in both.
  • Not reversible -- hashing is one-way, so the original values cannot be recovered from the hash.
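The consistency property can be checked with Python's standard hashlib module. This sketch assumes the value is UTF-8 encoded and hashed as-is, with no salt or normalization; the Anonymizer's exact input handling is not documented here and may differ.

```python
import hashlib


def hash_value(value: str) -> str:
    """Return the SHA-256 digest of value as 64 hexadecimal characters."""
    # Assumption: UTF-8 encoding, no salt, no case or whitespace normalization.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


doc_a = hash_value("John Smith")        # the name as found in one document
doc_b = hash_value("John Smith")        # the same name in another document
other = hash_value("john@example.com")

print(len(doc_a))      # 64: the digest length does not depend on the input
print(doc_a == doc_b)  # True: same input, same hash, so records can be linked
print(doc_a == other)  # False: different input, different hash
```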

Best For

  • Data analysis where you need to track unique entities without knowing their identity.
  • Record linkage across multiple anonymized documents.
  • Statistical analysis that requires entity consistency.

Encrypt

What it does: Encrypts PII using AES-256-GCM encryption with your personal key.

Examples

  Original            Encrypted
  John Smith          <encrypted:aGVsbG8gd29ybGQ=>
  john@example.com    <encrypted:Zm9vYmFyYmF6>

Characteristics

  • Reversible: you can decrypt the text later using the same encryption key.
  • Uses AES-256-GCM, a strong authenticated encryption standard.
  • Each encryption uses a unique initialization vector (IV), so encrypting the same value twice produces different ciphertext.
  • The encryption key is your personal key -- only someone who holds that key can decrypt the result.
  • This is the only reversible method. If you need to recover original values later, use Encrypt.
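The round trip can be sketched with the third-party cryptography package (pip install cryptography). The <encrypted:...> wrapper follows the examples above, but the 12-byte IV, the IV-then-ciphertext layout, and the Base64 encoding are assumptions made for this sketch; the Anonymizer's real token format and key handling are not documented here.

```python
import base64
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

IV_BYTES = 12  # 96-bit IV, the size commonly recommended for GCM


def encrypt_value(key: bytes, value: str) -> str:
    """Encrypt value with AES-256-GCM and wrap it as <encrypted:...>."""
    iv = os.urandom(IV_BYTES)  # fresh random IV for every call
    ciphertext = AESGCM(key).encrypt(iv, value.encode("utf-8"), None)
    # Assumed layout: the IV is stored in front of the ciphertext,
    # because it is needed again for decryption.
    token = base64.b64encode(iv + ciphertext).decode("ascii")
    return f"<encrypted:{token}>"


def decrypt_value(key: bytes, wrapped: str) -> str:
    """Reverse encrypt_value; raises InvalidTag if the key is wrong."""
    token = wrapped[len("<encrypted:"):-1]
    raw = base64.b64decode(token)
    iv, ciphertext = raw[:IV_BYTES], raw[IV_BYTES:]
    return AESGCM(key).decrypt(iv, ciphertext, None).decode("utf-8")


key = AESGCM.generate_key(bit_length=256)  # stands in for your personal key
first = encrypt_value(key, "John Smith")
second = encrypt_value(key, "John Smith")

print(first != second)            # True: each call used a different IV
print(decrypt_value(key, first))  # John Smith
```

Because the IV changes on every call, encrypted values cannot be used to link records across documents; use Hash when linking matters.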

Best For

  • Temporary anonymization where you may need the original data later.
  • Sharing documents with authorized parties who have the decryption key.
  • Workflows that require both anonymized and original versions.

See the Deanonymizer Guide for instructions on decrypting.


Mask

What it does: Partially obscures PII while preserving some characters for recognition.

Examples

  Original               Masked
  john@example.com       j***@e******.com
  4111-1111-1111-1111    4111-****-****-1111
  John Smith             J*** S****

Characteristics

  • Preserves enough of the original to recognize the general format.
  • The number and position of visible characters depend on the entity type.
  • Not reversible -- the masked characters cannot be recovered.
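One way to picture type-specific masking is a separate rule per entity type, with one asterisk per hidden character. The rules below (keep the first letter of each word, keep the first and last four card digits) are assumptions inferred from the examples above, not the Anonymizer's documented rules, and the function names are hypothetical.

```python
def keep_first(word: str, visible: int = 1) -> str:
    """Keep the first `visible` characters and star out the rest."""
    return word[:visible] + "*" * max(len(word) - visible, 0)


def mask_person(name: str) -> str:
    """Mask each word of a name separately."""
    return " ".join(keep_first(part) for part in name.split())


def mask_email(email: str) -> str:
    """Mask the local part and the host name; keep the top-level domain."""
    local, _, domain = email.partition("@")
    host, dot, tld = domain.rpartition(".")
    return f"{keep_first(local)}@{keep_first(host)}{dot}{tld}"


def mask_card(number: str) -> str:
    """Keep the first four and last four digits; star out the others."""
    total = sum(ch.isdigit() for ch in number)
    out, seen = [], 0
    for ch in number:
        if ch.isdigit():
            seen += 1
            hidden = 4 < seen <= total - 4
            out.append("*" if hidden else ch)
        else:
            out.append(ch)  # separators such as "-" pass through unchanged
    return "".join(out)


print(mask_person("John Smith"))         # J*** S****
print(mask_email("john@example.com"))    # j***@e******.com
print(mask_card("4111-1111-1111-1111"))  # 4111-****-****-1111
```

The starred characters are simply discarded, which is why masking cannot be undone.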

Best For

  • Customer-facing documents where partial visibility aids recognition.
  • Receipts, confirmations, or statements where users need to identify their own data.
  • Audit logs where some traceability is needed.

Choosing the Right Method

  Method    Reversible   Readable   Consistent   Use Case
  Replace   No           High       No           Realistic sample data, testing
  Redact    No           Low        N/A          Strict compliance, legal
  Hash      No           Low        Yes          Analytics, record linkage
  Encrypt   Yes          Low        No           Temporary anonymization, sharing with authorized parties
  Mask      No           Medium     Yes          Customer-facing documents, audit logs

Decision Guide

  • Need to recover original data later? Use Encrypt.
  • Need to link records across documents? Use Hash.
  • Need natural-looking text? Use Replace.
  • Need complete removal? Use Redact.
  • Need partial visibility? Use Mask.
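The decision guide can be written as a small function. The function name and parameters are hypothetical, and the order in which the questions are checked when several apply is an assumption; the guide itself does not rank them.

```python
def choose_method(
    *,
    recover_later: bool = False,
    link_records: bool = False,
    natural_text: bool = False,
    complete_removal: bool = False,
    partial_visibility: bool = False,
) -> str:
    """Return the method for the first requirement that is set.

    Requirements are checked in the order the guide lists them; that
    priority is an assumption of this sketch.
    """
    if recover_later:
        return "encrypt"
    if link_records:
        return "hash"
    if natural_text:
        return "replace"
    if complete_removal:
        return "redact"
    if partial_visibility:
        return "mask"
    raise ValueError("state at least one requirement")


print(choose_method(link_records=True))                      # hash
print(choose_method(recover_later=True, link_records=True))  # encrypt
```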

Per-Entity Configuration

You can set different anonymization methods for different entity types within the same operation.

How to Configure

  1. After running an analysis, go to the anonymization settings panel.
  2. Choose the global method, which applies to all entity types by default.
  3. Expand individual entity types to override the global method.

For example, you might Encrypt names (so you can recover them later) while Redacting credit card numbers (permanent removal).

Example Configuration

  Entity Type     Method    Reason
  PERSON          Encrypt   May need to restore names later
  EMAIL_ADDRESS   Replace   Keep realistic email format for testing
  CREDIT_CARD     Redact    No need to retain card numbers
  PHONE_NUMBER    Mask      Partial visibility for verification
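Conceptually, the settings amount to a lookup with a fallback: use the override for an entity type if one exists, otherwise use the global method. The sketch below mirrors the example configuration as a plain dictionary. The names are hypothetical, the choice of "redact" as the global method is an assumption, and the product itself is configured through the settings panel rather than through code like this.

```python
GLOBAL_METHOD = "redact"  # assumed global default for this sketch

# Per-entity overrides, taken from the example configuration.
PER_ENTITY_METHODS = {
    "PERSON": "encrypt",
    "EMAIL_ADDRESS": "replace",
    "CREDIT_CARD": "redact",
    "PHONE_NUMBER": "mask",
}


def method_for(entity_type: str) -> str:
    """Return the override for entity_type, else the global method."""
    return PER_ENTITY_METHODS.get(entity_type, GLOBAL_METHOD)


# IP_ADDRESS is a hypothetical entity type with no override configured.
for entity_type in ("PERSON", "PHONE_NUMBER", "IP_ADDRESS"):
    print(f"{entity_type}: {method_for(entity_type)}")
# PERSON: encrypt
# PHONE_NUMBER: mask
# IP_ADDRESS: redact
```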

Download Options

After anonymization, retrieve your results:

  • Copy to clipboard -- click the copy icon to copy the full anonymized text.
  • Download as text file -- save the anonymized output as a .txt file.
  • Compare view -- view the original and anonymized text side by side to verify the result before downloading.