Multi-Language Support — 48 Languages

Detect and anonymize PII in 48 languages with native pattern support. Full RTL support for Arabic, Hebrew, Persian, and Urdu.

48 Languages Supported

Full PII detection and anonymization across the entire platform

spaCy NLP - Runs Locally (25 languages)

English, German, Spanish, French, Italian, Portuguese, Dutch, Polish, Russian, Japanese, Chinese, Korean, Romanian, Greek, Croatian, Slovenian, Macedonian, Swedish, Danish, Norwegian, Finnish, Ukrainian, Lithuanian, Catalan, Turkish

Stanza NER - Runs Locally (7 languages)

Bulgarian, Hungarian, Hebrew (RTL), Vietnamese, Afrikaans, Armenian, Basque

XLM-RoBERTa Transformer - Runs Locally (16 languages)

Arabic (RTL), Hindi, Czech, Slovak, Indonesian, Thai, Persian (RTL), Serbian, Latvian, Estonian, Malay, Bengali, Urdu (RTL), Swahili, Tagalog, Icelandic

RTL Support

Arabic, Hebrew, Persian, Urdu

Powered by Advanced NLP

Three NLP engines working together for maximum language coverage

Lazy-loaded models (max 5 cached) for memory efficiency
Automatic language detection
Mixed-language document processing
Language-specific entity patterns
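
As an illustration of the lazy-loading point above, here is a minimal sketch of a least-recently-used model cache capped at five entries. The `ModelCache` class and `fake_loader` function are hypothetical names invented for this example; they are not part of the cloak.business codebase, and the actual eviction policy may differ.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class ModelCache:
    """Hypothetical LRU cache: loads a model on first use, keeps at most max_size."""

    def __init__(self, loader: Callable[[str], Any], max_size: int = 5) -> None:
        self._loader = loader
        self._max_size = max_size
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, lang: str) -> Any:
        if lang in self._models:
            # Cache hit: mark this language as most recently used.
            self._models.move_to_end(lang)
            return self._models[lang]
        # Cache miss: load lazily, then evict the least recently used model if full.
        model = self._loader(lang)
        self._models[lang] = model
        if len(self._models) > self._max_size:
            self._models.popitem(last=False)
        return model

    def cached_languages(self) -> List[str]:
        """Languages currently held in memory, least recently used first."""
        return list(self._models)


def fake_loader(lang: str) -> str:
    # Stand-in for an expensive model load (assumption for the example).
    return f"model-{lang}"


cache = ModelCache(fake_loader, max_size=5)
for code in ["en", "de", "es", "fr", "it"]:
    cache.get(code)
cache.get("en")   # refreshes English
cache.get("pl")   # sixth language: the least recently used one (German) is evicted
print(cache.cached_languages())
```

With this policy, a document stream that keeps returning to the same handful of languages never reloads a model, while rarely used languages are dropped first.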

Country-Specific Formats

We detect PII in formats specific to each country and region.

European Formats

German: Personalausweis, Steuer-ID, Reisepass
French: NIR, Carte Nationale, Permis
Italian: Codice Fiscale, Carta d'Identità
Spanish: DNI, NIE, NIF
Dutch: BSN, Rijbewijs
Polish: PESEL, NIP, REGON

Asia-Pacific Formats

Japan: My Number, Passport
India: Aadhaar, PAN, GSTIN, Vehicle Registration
Thailand: National ID, Tax ID, Passport
Indonesia: NIK, NPWP, Passport
Vietnam: CCCD, Tax Code, Passport
Malaysia: MyKad, Tax ID, Passport

Americas, UK, Australia & Africa

US: SSN, Driver's License, Passport
UK: National Insurance, NHS Number
Canada: SIN, Driver's License
Australia: TFN, Medicare, ABN
Kenya: National ID, KRA PIN, Passport
South Africa: ID Number, Tax Number, Passport

Frequently Asked Questions

Which 48 languages does cloak.business support?

cloak.business supports Afrikaans, Arabic, Armenian, Basque, Bengali, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Macedonian, Malay, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Thai, Turkish, Ukrainian, Urdu, and Vietnamese. Arabic, Hebrew, Persian, and Urdu have full RTL support.

Does PII detection work the same in all languages?

Detection uses two approaches: regex-based pattern matching for structured data (IDs, phone numbers, tax numbers) and NLP models for unstructured entities (names, locations). Pattern-based detection covers all 48 languages. NLP-based detection is available in languages with trained models.

How are country-specific ID formats handled?

cloak.business includes 317 pattern recognizers covering 70+ countries. Each recognizer validates the specific format, checksum, and structure of national IDs, tax numbers, health identifiers, and financial data for that country.
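
To make the "format plus checksum" idea concrete, here is a minimal sketch for one identifier, the Spanish DNI, whose final letter is determined by the eight-digit number modulo 23. The function names are hypothetical and this is not the cloak.business recognizer itself; it only shows how a checksum can filter out strings that merely look like an ID.

```python
import re
from typing import List

# Official DNI control-letter table: the letter at index (number % 23).
DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"

# Format check: eight digits followed by one letter, on word boundaries.
DNI_CANDIDATE = re.compile(r"\b(\d{8})([A-Za-z])\b")


def dni_check_letter(number: int) -> str:
    """Return the control letter expected for an eight-digit DNI number."""
    return DNI_LETTERS[number % 23]


def find_valid_dnis(text: str) -> List[str]:
    """Return candidates that match the format and pass the checksum."""
    hits = []
    for match in DNI_CANDIDATE.finditer(text):
        digits, letter = match.groups()
        if dni_check_letter(int(digits)) == letter.upper():
            hits.append(match.group(0))
    return hits


sample = "Cliente con DNI 12345678Z; referencia interna 12345678A."
print(find_valid_dnis(sample))
```

Both strings in the sample match the format, but only the first passes the checksum, which is how validation reduces false positives on reference numbers and similar look-alikes.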

Can I detect PII in multiple languages within the same document?

Yes. cloak.business can process multilingual documents and detect PII across different languages in a single request. The system automatically identifies which language patterns to apply.
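
The page does not describe how language identification works internally. As a rough illustration only, the sketch below splits a mixed document into runs by Unicode script, which is one simple signal a pipeline could use to decide which pattern set applies to each run. The helper names are hypothetical, and script is only a proxy for language (Latin script alone covers dozens of languages), so a real system would need a proper language identifier on top.

```python
import unicodedata
from typing import List, Optional, Tuple


def script_of(ch: str) -> Optional[str]:
    """Approximate a character's script from the first word of its Unicode name."""
    if not ch.isalpha():
        return None
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:  # character has no Unicode name
        return None


def split_by_script(text: str) -> List[Tuple[str, str]]:
    """Group whitespace-separated words into consecutive runs of the same script."""
    runs: List[Tuple[str, str]] = []
    for word in text.split():
        # Script of the first alphabetic character; None for digits or punctuation.
        script = next((s for s in map(script_of, word) if s), None)
        if runs and (script is None or script == runs[-1][0]):
            # Same script, or a script-neutral token: extend the current run.
            runs[-1] = (runs[-1][0], runs[-1][1] + " " + word)
        else:
            runs.append((script or "UNKNOWN", word))
    return runs


for script, run in split_by_script("Call Anna 555 Позвоните Анне"):
    print(script, "|", run)
```

Script-neutral tokens such as numbers are attached to the preceding run, so an ID number stays with the language context it appeared in.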

How do I add support for a new language or entity type?

You can create custom entity recognizers using regex patterns or deny lists. This allows you to add domain-specific identifiers or extend coverage to additional formats not yet included in the built-in recognizer library.
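
The two options mentioned above, a regex pattern or a deny list, can be sketched in plain Python as follows. `CustomRecognizer`, `Match`, and `anonymize` are hypothetical names for this example and do not reflect the actual cloak.business API or SDK; the sketch also does not resolve overlapping matches.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Match:
    entity_type: str
    start: int
    end: int


class CustomRecognizer:
    """Hypothetical recognizer built from either a regex or a deny list."""

    def __init__(self, entity_type: str, pattern: Optional[str] = None,
                 deny_list: Optional[Iterable[str]] = None) -> None:
        flags = 0
        if pattern is None:
            terms = sorted(deny_list or [], key=len, reverse=True)
            if not terms:
                raise ValueError("provide a pattern or a non-empty deny list")
            # A deny list is compiled into one case-insensitive alternation.
            pattern = r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
            flags = re.IGNORECASE
        self.entity_type = entity_type
        self._regex = re.compile(pattern, flags)

    def find(self, text: str) -> List[Match]:
        return [Match(self.entity_type, m.start(), m.end())
                for m in self._regex.finditer(text)]


def anonymize(text: str, recognizers: List[CustomRecognizer]) -> str:
    """Replace every match with a <TYPE> placeholder, working right to left."""
    matches = [m for r in recognizers for m in r.find(text)]
    for m in sorted(matches, key=lambda m: m.start, reverse=True):
        text = text[:m.start] + f"<{m.entity_type}>" + text[m.end:]
    return text


employee_id = CustomRecognizer("EMPLOYEE_ID", pattern=r"\bEMP-\d{6}\b")
project = CustomRecognizer("PROJECT", deny_list=["Project Falcon", "Bluebird"])
print(anonymize("EMP-004211 joined Project Falcon.", [employee_id, project]))
```

A regex suits identifiers with a fixed structure, while a deny list suits a closed set of known terms such as internal code names.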

Explore Related Features

Multi-language detection works seamlessly with all cloak.business products.

Chrome Extension

Anonymize AI prompts in ChatGPT, Claude, Gemini, and three more AI platforms, in any of the 48 supported languages.

PII Anonymization API

REST API with JavaScript and Python SDKs. Full multi-language support built in.

Reversible Encryption

Encrypt PII with AES-256-GCM and restore original data anytime with your key.

Is This Right for You?

Best For

✦ Global enterprises with multilingual document workflows that require consistent GDPR and privacy compliance
✦ Translation and localization agencies that process PII-containing content in multiple languages
✦ Government agencies and NGOs processing citizen data across EU, APAC, and LATAM jurisdictions
✦ Legal discovery and compliance teams working across jurisdictions covered by the 48 supported languages

Not Ideal For

✦ English-only workflows: the standard plan is sufficient and avoids the overhead of language detection
✦ Languages outside the supported 48: check the entity catalog for specific language and entity coverage
✦ Real-time workloads that need sub-10 ms latency: language detection adds processing overhead compared with English-only processing

Anonymize in Any Language

Start with 200 free tokens. Works with all 48 languages.