PII Detection: 320+ Entity Types

Our detection engine combines 317 custom regex recognizers with NLP models to identify 320+ types of personal information across 70+ countries. Detection is deterministic: the same input produces the same output every time.

How Detection Works

Regex Pattern Matching (Structured PII)

317 custom PatternRecognizers use regex patterns to detect structured data such as national IDs, tax numbers, passports, and driver's licenses. Each pattern uses boundary assertions so that it does not match inside longer tokens, such as identifiers in code or fields in structured data.
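
To illustrate what a boundary assertion does, here is a minimal sketch using Python's standard re module. The 9-digit pattern and the find_ids helper are hypothetical and chosen only for illustration; they are not one of the engine's actual recognizers.

```python
import re

# Hypothetical pattern for a 9-digit identifier (illustration only).
# The lookbehind and lookahead are boundary assertions: the digits must not
# be directly attached to a letter, digit, or underscore on either side.
NINE_DIGIT_ID = re.compile(r"(?<![A-Za-z0-9_])\d{9}(?![A-Za-z0-9_])")


def find_ids(text):
    """Return every standalone 9-digit run in the text."""
    return NINE_DIGIT_ID.findall(text)


# A standalone number in prose is matched.
prose_hits = find_ids("Customer ID 123456789 was updated.")

# Digits embedded in a variable name or a hex literal are not matched.
code_hits = find_ids("order_123456789_v2 = 0x1234567890")

print(prose_hits, code_hits)
```

Without the two assertions, the same digit pattern would also fire inside order_123456789_v2, which is the kind of false match the boundary checks are meant to prevent.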

NLP Named Entity Recognition (Names & Locations)

spaCy (25 languages), Stanza NER (7 languages), and XLM-RoBERTa transformers (16 languages) detect unstructured PII such as person names, locations, and organizations, which regex alone cannot capture. All models run on our own servers in Germany. No data is ever sent to Meta, Google, Stanford, or any other third party.

Confidence Scoring

Each detection includes a confidence score between 0 and 1. Highly specific formats (e.g., the German IBAN DE89 3704 0044 0532 0130 00) score 0.85 or higher, while generic digit patterns score 0.3-0.5 and rely on context words for confirmation.
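
The relationship between pattern specificity and score can be sketched as follows. This is an illustration under assumptions, not the engine's implementation: the Detection class, the recognizer names, the two patterns, and the exact base scores (0.85 and 0.4, picked from the ranges above) are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    entity_type: str
    text: str
    score: float  # confidence between 0 and 1


# Hypothetical recognizers as (entity type, pattern, base score).
# The IBAN pattern is tightly constrained, so it starts with a high score;
# the generic digit run could be almost anything, so it starts low.
RECOGNIZERS = [
    ("DE_IBAN", re.compile(r"\bDE\d{2}(?: ?\d{4}){4} ?\d{2}\b"), 0.85),
    ("GENERIC_NUMBER", re.compile(r"\b\d{8,12}\b"), 0.4),
]


def detect(text):
    """Run every recognizer and attach its base score to each match."""
    results = []
    for entity_type, pattern, base_score in RECOGNIZERS:
        for match in pattern.finditer(text):
            results.append(Detection(entity_type, match.group(), base_score))
    return results


detections = detect("IBAN DE89 3704 0044 0532 0130 00, ref 1234567890")
for d in detections:
    print(d.entity_type, d.score, d.text)
```

A caller could then keep only detections above a chosen threshold, so that low-scoring generic matches are dropped unless something else raises their score.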

Context Word Analysis

Each recognizer has context words in the relevant language (e.g., 'Personalausweis' for German IDs, 'kitambulisho' for Kenyan IDs). When context words appear near a match, the confidence score is boosted.
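
A context boost of this kind can be sketched in a few lines. The window size (40 characters), the boost amount (0.35), the base score (0.4), and the boost_with_context function are assumptions made for this example; the engine's actual values and logic are not documented here.

```python
import re


def boost_with_context(text, start, end, base_score, context_words,
                       window=40, boost=0.35):
    """Raise the score when a context word appears near the match.

    The window and boost values are illustrative assumptions.
    """
    nearby = text[max(0, start - window):end + window].lower()
    if any(word.lower() in nearby for word in context_words):
        return min(1.0, round(base_score + boost, 2))
    return base_score


# Hypothetical low-confidence recognizer: any standalone 9-digit run.
GENERIC_ID = re.compile(r"\b\d{9}\b")
CONTEXT_WORDS = ["Personalausweis", "Ausweisnummer"]


def score(text):
    """Score the first generic match in the text, or return None."""
    match = GENERIC_ID.search(text)
    if match is None:
        return None
    return boost_with_context(text, match.start(), match.end(), 0.4, CONTEXT_WORDS)


with_context = score("Personalausweis Nr. 123456789 liegt vor.")
without_context = score("Build 123456789 finished.")
print(with_context, without_context)
```

The same 9-digit number scores higher next to 'Personalausweis' than in an unrelated sentence, which is how a generic pattern can be confirmed by its surroundings.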

Supported Entity Types

The personal information types we detect, grouped by category:

Personal Identifiers

  • Person Names
  • Email Addresses
  • Phone Numbers
  • Date of Birth
  • Age
  • Gender
  • Nationality

Financial Information

  • Credit Card Numbers
  • IBAN
  • BIC/SWIFT
  • Bank Account Numbers
  • Tax IDs
  • VAT Numbers

Government IDs

  • Social Security Numbers (SSN)
  • National ID Numbers
  • Passport Numbers
  • Driver's License Numbers
  • Health Insurance IDs

Location Data

  • Street Addresses
  • Cities
  • ZIP/Postal Codes
  • Countries
  • GPS Coordinates

Digital Identifiers

  • IP Addresses (v4/v6)
  • MAC Addresses
  • URLs
  • Domain Names
  • User IDs

Organization Data

  • Company Names
  • Organization IDs
  • Registration Numbers
  • Department Names

Temporal Data

  • Dates
  • Times
  • Date Ranges
  • Timestamps

International Formats

  • German ID (Personalausweis)
  • UK National Insurance
  • Spanish DNI/NIE
  • Italian Codice Fiscale
  • And 70+ more country-specific formats

Custom Entity Support

Need to detect custom patterns? Create your own entity types with regex patterns or use our AI-assisted pattern generator.

Manual Pattern Creation

Define regex patterns for proprietary identifiers like internal employee IDs, project codes, or custom reference numbers.
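
As a rough sketch of what a custom entity definition amounts to, the example below pairs a name with a regex and a score. The CustomEntity class and the EMP-###### employee ID format are hypothetical; they are not the product's API or configuration format.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomEntity:
    """Hypothetical container for a user-defined entity type."""
    name: str
    pattern: re.Pattern
    score: float

    def find(self, text):
        """Return all matches of this entity's pattern in the text."""
        return [m.group() for m in self.pattern.finditer(text)]


# Assumed internal format: "EMP-" followed by exactly six digits.
EMPLOYEE_ID = CustomEntity(
    name="EMPLOYEE_ID",
    pattern=re.compile(r"\bEMP-\d{6}\b"),
    score=0.9,
)

hits = EMPLOYEE_ID.find("Ticket assigned to EMP-004217; see TEMP-0042171.")
print(hits)
```

The word boundaries keep the pattern from matching inside TEMP-0042171, so only the real employee ID is reported.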

AI Pattern Generator

Describe what you want to detect in plain language, and our AI generates optimized regex patterns for you.

Start Detecting PII Today

Try our detection engine free with 200 tokens per cycle. No credit card required.