Regex-First: Why It Matters
Our Approach: Regex + NLP
- 317 regex recognizers: deterministic, so the same input always yields the same detections for structured data
- NLP for names & locations with confidence scores
- Fully auditable: every detection is traceable to a pattern or model
- Transparent: you always know what matched and why
- Fast, predictable performance
- 48 languages across 3 NLP engines
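To illustrate why regex-based detection is reproducible and traceable, here is a minimal Python sketch. The two patterns, the `Detection` record, and the `detect` function are hypothetical and chosen only for illustration; they are not the product's actual recognizers or API.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    entity_type: str
    text: str
    start: int
    end: int
    source: str  # which pattern produced the match, kept for auditing


# Hypothetical recognizers for illustration only.
RECOGNIZERS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def detect(text):
    """Run every recognizer and record which pattern matched and where."""
    found = []
    for name, pattern in RECOGNIZERS.items():
        for m in pattern.finditer(text):
            found.append(
                Detection(name, m.group(), m.start(), m.end(), f"regex:{name}")
            )
    return sorted(found, key=lambda d: d.start)


sample = "Contact jane.doe@example.com from 192.168.0.1"
first = detect(sample)
second = detect(sample)  # same input, same patterns, same result
```

Because each `Detection` carries its `source` and character offsets, an auditor can point to the exact pattern and span behind every match, and running the same input twice gives identical results.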
AI-Only Approaches
- All detections are probabilistic
- Can't explain why something was flagged
- Requires large training datasets
- Difficult to audit for compliance
- Higher compute costs (GPU needed)
- Model drift degrades accuracy over time
The 10-Step Process
From input to output, here's exactly what happens to your document
Input Text
Submit your document via web interface, API, or Office Add-in
Language Detection
The system identifies the document language so the appropriate recognizers and NLP models are applied
Tokenization
Text is broken into tokens for pattern matching
Pattern Matching
317 regex recognizers and NLP models scan for 320+ entity types across 70+ countries
Context Analysis
Surrounding text is examined to improve detection accuracy
Confidence Scoring
Each detection receives a confidence score
Entity Classification
Detected items are categorized by type
Review Results
See all detections with positions and scores
Apply Anonymization
Choose your method: Replace, Redact, Hash, Encrypt, or Mask
Output Document
Download your anonymized document
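The middle steps of the process above (pattern matching, context analysis, confidence scoring, and applying an anonymization method) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the phone pattern, the base score of 0.6, the context words, the boost of 0.3, and the function names are not taken from the product. Encrypt is left out because it would need key management that a short sketch cannot show properly.

```python
import hashlib
import re

# Hypothetical pattern and scoring values, for illustration only.
PHONE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")
BASE_SCORE = 0.6
CONTEXT_WORDS = {"phone", "call", "tel"}
CONTEXT_BOOST = 0.3
WINDOW = 30  # characters of preceding text inspected for context


def detect(text):
    """Find phone-like strings and raise the score when context supports them."""
    results = []
    for m in PHONE.finditer(text):
        before = text[max(0, m.start() - WINDOW):m.start()].lower()
        words = set(re.findall(r"[a-z]+", before))
        score = BASE_SCORE
        if words & CONTEXT_WORDS:
            score = round(score + CONTEXT_BOOST, 2)
        results.append(
            {"type": "PHONE", "start": m.start(), "end": m.end(), "score": score}
        )
    return results


def anonymize(text, detections, method="replace"):
    """Apply one anonymization method to every detected span."""
    operators = {
        "replace": lambda value, kind: f"<{kind}>",
        "redact": lambda value, kind: "",
        "hash": lambda value, kind: hashlib.sha256(value.encode("utf-8")).hexdigest(),
        "mask": lambda value, kind: "*" * (len(value) - 4) + value[-4:],
    }
    op = operators[method]
    # Work from the end of the text so earlier offsets stay valid.
    for d in sorted(detections, key=lambda d: d["start"], reverse=True):
        value = text[d["start"]:d["end"]]
        text = text[:d["start"]] + op(value, d["type"]) + text[d["end"]:]
    return text


text = "Call me at 555-123-4567. Order ref 800-555-0199."
hits = detect(text)
print(anonymize(text, hits, "replace"))
print(anonymize(text, hits, "mask"))
```

In this sketch the first number scores higher than the second because the word "call" appears just before it, which is the kind of signal the Context Analysis step refers to. Replacing spans from right to left avoids recomputing offsets after each substitution.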
MCP Server: Privacy-First AI Integration
How your data flows through the MCP Server to keep AI tools safe
The MCP Server acts as a privacy shield between you and your AI tools. It intercepts each request, anonymizes any PII, passes only the anonymized data to the AI, and can optionally restore the original values in the response.
AI Tool Request
Your AI tool (for example, Cursor or Claude) sends a request containing PII
MCP Server Intercepts
The server analyzes the request and detects PII entities
Anonymization
PII is replaced with tokens or redacted
AI Processing
AI receives and processes only anonymized data
Response Return
The AI response returns through the MCP Server
De-tokenization
Optional: original values are restored for the user
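The round trip described above (tokenize, let the AI work on safe text, then de-tokenize) might look roughly like the following Python sketch. The token format `<EMAIL_n>`, the function names, and the `fake_ai` stand-in are all hypothetical; the actual MCP Server interface is not shown in this document.

```python
import re

# Hypothetical pattern; only email addresses are handled in this sketch.
EMAIL = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b")


def tokenize_pii(text):
    """Swap each distinct email for a stable token and keep the mapping."""
    token_for = {}  # original value -> token
    mapping = {}    # token -> original value, used later for restoration

    def swap(match):
        value = match.group()
        if value not in token_for:
            token = f"<EMAIL_{len(token_for) + 1}>"
            token_for[value] = token
            mapping[token] = value
        return token_for[value]

    return EMAIL.sub(swap, text), mapping


def detokenize(text, mapping):
    """Restore original values in the AI response."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


def fake_ai(prompt):
    # Stand-in for the AI tool; it only ever sees tokenized text.
    return "Draft reply: " + prompt


request = "Email alice@example.com and cc bob@example.org, then alice@example.com again."
safe, mapping = tokenize_pii(request)
response = fake_ai(safe)
restored = detokenize(response, mapping)
```

Reusing the same token for repeated values lets the AI keep track of who is who without seeing the real data, and the mapping never leaves the server side in this sketch.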