What Presidio, Private AI, and Protecto Don't Offer

The Assumption That Kills Flexibility

Most PII tools assume anonymization is a one-way operation. Once data is redacted, it is gone forever.

This assumption works for some use cases. But it fails catastrophically for:

Legal discovery - Courts may order original documents
Clinical trials - Adverse event reporting requires patient identification
Audit requirements - Regulators need to verify what was protected
Research - Linking anonymized records back to sources validates findings

What Competitors Offer

Microsoft Presidio

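Presidio's most commonly used anonymization operators are Replace, Redact, Hash, and Mask. None of these are reversible.

Presidio's anonymizer also includes an encrypt operator with a matching decrypt step, but it works with a key you supply and with entity positions you track yourself. To get production-grade reversibility with Presidio, you still have to manage an external key store, keep the positions of encrypted entities for decryption, and maintain a separate audit system.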

Private AI

Private AI offers de-identification for AI workflows but focuses on irreversible anonymization for privacy preservation. Reversibility is not a core feature.

Protecto

Protecto itself points out the limits of deterministic masking: if an AI modifies the text slightly, the mapping breaks. Token-based approaches also depend on external mapping tables, which can be lost or corrupted.

Why Reversibility Matters

GDPR Compliance

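GDPR explicitly recognizes pseudonymization as a data protection measure (Article 4(5) and Article 32). By definition, pseudonymized data can be re-attributed to a person using additional information that is kept separately, so it is reversible for whoever holds that information.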

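HIPAA De-identification

HIPAA's de-identification rule allows a covered entity to assign a code that lets it re-identify de-identified records, provided the code is not derived from the individual's information and the re-identification mechanism is not disclosed. Clinical trials routinely require this capability.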

Legal Discovery

Courts may order production of original documents. Irreversible anonymization makes discovery obligations impossible to fulfill.

Audit Compliance

Regulators may ask what PII was in a document. With reversible encryption, you can decrypt the values in question, show what was protected, re-encrypt them, and document the access.

Our Approach: AES-256-GCM Encryption

cloak.business offers five anonymization methods. Only Encrypt is reversible.

Method	Reversible?	Use Case
Replace	No	Substitute with fake data
Redact	No	Remove entirely
Mask	No	Partial obscuring
Hash	No	One-way transformation
Encrypt	Yes	AES-256-GCM reversible
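
To make the "not reversible" rows concrete, here is a minimal Python sketch of the four one-way methods applied to email addresses. The function names, the simplified email pattern, and the truncated SHA-256 digest are illustrative assumptions, not the product's actual implementation.

```python
import hashlib
import re

# Simplified email pattern, for illustration only (not production-grade detection).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def replace(text: str, fake: str = "user@example.com") -> str:
    """Substitute every match with the same fake value."""
    return EMAIL.sub(fake, text)


def redact(text: str) -> str:
    """Remove every match entirely."""
    return EMAIL.sub("", text)


def mask(text: str, keep: int = 2) -> str:
    """Keep the first `keep` characters of each match and star out the rest."""
    def _mask(match: re.Match) -> str:
        value = match.group(0)
        return value[:keep] + "*" * (len(value) - keep)
    return EMAIL.sub(_mask, text)


def hash_value(text: str) -> str:
    """Swap each match for a truncated SHA-256 digest (one-way)."""
    def _hash(match: re.Match) -> str:
        return hashlib.sha256(match.group(0).encode()).hexdigest()[:16]
    return EMAIL.sub(_hash, text)
```

Two different inputs, such as alice@corp.com and alina@corp.org, produce the same masked output, so nothing in the result allows the original to be recovered. The same holds for Replace and Redact, and a hash can only be confirmed by guessing the input, never decrypted.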

Technical Specifications

Algorithm: AES-256-GCM
Key derivation: Argon2id
Nonce: Random 12 bytes, generated fresh for each encryption
Authentication: GCM tag integrity check
Key storage: Client-side only (zero-knowledge)
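
The specifications above can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's code: it relies on the third-party `cryptography` package for AES-GCM, and because Argon2id is not in Python's standard library, it uses `hashlib.scrypt` as a stand-in key derivation function. The function names, scrypt parameters, and the nonce-then-ciphertext layout are also assumptions.

```python
import hashlib
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_LEN = 12  # 96-bit nonce, as listed in the specifications
SALT_LEN = 16   # assumed salt size for the key derivation step


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte key from a passphrase.

    Stand-in only: the specifications name Argon2id; scrypt is used here
    because it ships with Python's standard library.
    """
    return hashlib.scrypt(passphrase.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)


def encrypt(plaintext: str, key: bytes) -> bytes:
    """Encrypt with AES-256-GCM using a fresh random nonce.

    Returns nonce || ciphertext || 16-byte GCM tag.
    """
    nonce = os.urandom(NONCE_LEN)
    ciphertext_and_tag = AESGCM(key).encrypt(nonce, plaintext.encode(), None)
    return nonce + ciphertext_and_tag


def decrypt(blob: bytes, key: bytes) -> str:
    """Verify the GCM tag and decrypt; raises InvalidTag if the blob was altered."""
    nonce, ciphertext_and_tag = blob[:NONCE_LEN], blob[NONCE_LEN:]
    return AESGCM(key).decrypt(nonce, ciphertext_and_tag, None).decode()
```

Because the nonce is random, encrypting the same value twice yields different output, and any change to the stored bytes makes decryption fail with `InvalidTag` instead of returning corrupted data.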

The Self-Contained Advantage

Simple tokenization depends on an external token-to-value mapping. The security of that mapping becomes critical, the mapping can be lost or corrupted, and there is no embedded audit trail.
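
A minimal sketch of mapping-based tokenization shows where the fragility comes from. The `<EMAIL_n>` token format, the in-memory `vault` dictionary, and the function names are hypothetical and only serve the illustration.

```python
import re

# Simplified email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
TOKEN = re.compile(r"<EMAIL_\d+>")


def tokenize(text: str, vault: dict) -> str:
    """Replace each email with a numbered token and record the pair in `vault`."""
    def _swap(match: re.Match) -> str:
        token = f"<EMAIL_{len(vault) + 1}>"
        vault[token] = match.group(0)
        return token
    return EMAIL.sub(_swap, text)


def detokenize(text: str, vault: dict) -> str:
    """Restore originals for tokens found in `vault`; unknown tokens stay as-is."""
    return TOKEN.sub(lambda m: vault.get(m.group(0), m.group(0)), text)
```

The tokens carry no information on their own. If `vault` is lost, corrupted, or out of sync with the document, `detokenize` has nothing to work with and the original values are unrecoverable.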

Our approach embeds the encrypted value in the token itself. No external mapping is required, so each token is self-contained and auditable. If the AI returns modified text, the encrypted tokens still decrypt, provided the token strings themselves come back unaltered.
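
The self-contained idea can be sketched in Python as follows. This is not the product's token format: the `[ENC:...]` wrapper, the base64url encoding, and the function names are assumptions, and the example relies on the third-party `cryptography` package.

```python
import base64
import os
import re

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

# Hypothetical token format: [ENC:<base64url of nonce || ciphertext || tag>]
TOKEN = re.compile(r"\[ENC:([A-Za-z0-9_-]+)\]")


def seal(value: str, key: bytes) -> str:
    """Encrypt `value` and wrap the result in a self-contained token."""
    nonce = os.urandom(12)
    blob = nonce + AESGCM(key).encrypt(nonce, value.encode(), None)
    encoded = base64.urlsafe_b64encode(blob).decode().rstrip("=")
    return f"[ENC:{encoded}]"


def unseal_all(text: str, key: bytes) -> str:
    """Find every token in `text` and replace it with its decrypted value."""
    def _open(match: re.Match) -> str:
        encoded = match.group(1)
        # Restore the base64 padding that seal() stripped.
        blob = base64.urlsafe_b64decode(encoded + "=" * (-len(encoded) % 4))
        return AESGCM(key).decrypt(blob[:12], blob[12:], None).decode()
    return TOKEN.sub(_open, text)
```

Everything needed for decryption except the key travels inside the token, so the surrounding text can be summarized, reordered, or rewritten without breaking reversibility. If a token itself is altered, the GCM tag check fails and decryption raises an error instead of returning a wrong value.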

Key Takeaways

Irreversible anonymization blocks legitimate use cases - Legal, audit, and research workflows all need reversibility
GDPR recognizes pseudonymization and HIPAA permits re-identification codes - Regulated workflows often depend on this capability
Token-based approaches have fragility - External mappings can break or be lost
Self-contained encryption is robust - No external dependencies
Reversible encryption is a differentiator - Most tools simply do not offer it
