48 Languages Supported
Complete PII detection and anonymization across the entire platform
spaCy NLP - Runs Locally (25 languages)
English, German, Spanish, French, Italian, Portuguese, Dutch, Polish, Russian, Japanese, Chinese, Korean, Romanian, Greek, Croatian, Slovenian, Macedonian, Swedish, Danish, Norwegian, Finnish, Ukrainian, Lithuanian, Catalan, Turkish
Stanza NER - Runs Locally (7 languages)
Bulgarian, Hungarian, Hebrew (RTL), Vietnamese, Afrikaans, Armenian, Basque
XLM-RoBERTa Transformer - Runs Locally (16 languages)
Arabic (RTL), Hindi, Czech, Slovak, Indonesian, Thai, Persian (RTL), Serbian, Latvian, Estonian, Malay, Bengali, Urdu (RTL), Swahili, Tagalog, Icelandic
RTL Support
Arabic, Hebrew, Persian, Urdu
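As a rough illustration of what right-to-left handling involves, the sketch below detects RTL text with Python's standard `unicodedata` module and masks a span by character offset. The helper names `is_rtl` and `mask_span` are hypothetical and not part of the platform's API. The point is that offsets refer to the logical (stored) character order, so masking works the same way regardless of how the text is displayed.

```python
import unicodedata


def is_rtl(text: str) -> bool:
    """Return True if the first strongly directional character is right-to-left."""
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi in ("R", "AL"):   # Hebrew is "R"; Arabic-script letters are "AL"
            return True
        if bidi == "L":           # Latin, Cyrillic, CJK and other LTR scripts
            return False
    return False                  # no strong character found (digits, spaces only)


def mask_span(text: str, start: int, end: int, placeholder: str) -> str:
    """Replace text[start:end] with a placeholder, using logical-order offsets."""
    return text[:start] + placeholder + text[end:]


sentence = "اسمي أحمد"            # Arabic: "My name is Ahmed"
name = "أحمد"
start = sentence.index(name)
masked = mask_span(sentence, start, start + len(name), "<PERSON>")
```

Persian and Urdu are written in Arabic script, so their letters fall under the same `AL` class as Arabic.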
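Powered by Advanced NLP
Three NLP engines work together for maximum language coverage
- Lazy-loaded models (up to 5 cached) for memory efficiency
- Automatic language detection
- Mixed-language document handling
- Language-specific entity patterns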
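One way to picture engine routing combined with lazy loading and a five-model cache is a small least-recently-used (LRU) cache keyed by engine and language. This is a minimal sketch under stated assumptions, not the platform's actual implementation: the `ENGINE_BY_LANG` table is abbreviated, the ISO 639-1 codes are assumed, and `ModelCache` and `fake_loader` are hypothetical names standing in for real model loading.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Tuple

# Abbreviated routing table (assumed ISO 639-1 codes); the engine for each
# language follows the lists above.
ENGINE_BY_LANG: Dict[str, str] = {
    "en": "spacy", "de": "spacy", "ja": "spacy",
    "bg": "stanza", "he": "stanza", "vi": "stanza",
    "ar": "xlm-roberta", "hi": "xlm-roberta", "th": "xlm-roberta",
}


class ModelCache:
    """Load models on first use and keep only the most recently used ones."""

    def __init__(self, loader: Callable[[str, str], object], max_size: int = 5):
        self._loader = loader
        self._max_size = max_size
        self._models: "OrderedDict[Tuple[str, str], object]" = OrderedDict()

    def get(self, lang: str) -> object:
        engine = ENGINE_BY_LANG.get(lang)
        if engine is None:
            raise ValueError(f"unsupported language: {lang}")
        key = (engine, lang)
        if key in self._models:
            self._models.move_to_end(key)      # mark as most recently used
            return self._models[key]
        model = self._loader(engine, lang)     # lazy: loaded on first request only
        self._models[key] = model
        if len(self._models) > self._max_size:
            self._models.popitem(last=False)   # evict the least recently used model
        return model

    def cached_languages(self) -> List[str]:
        """Languages currently held in memory, least recently used first."""
        return [lang for _, lang in self._models]


def fake_loader(engine: str, lang: str) -> str:
    # Hypothetical stand-in for an expensive model load.
    return f"{engine}:{lang}"


cache = ModelCache(fake_loader, max_size=5)
for code in ["en", "de", "ja", "bg", "he"]:
    cache.get(code)
cache.get("en")   # refreshes "en", so "de" is now the oldest entry
cache.get("ar")   # sixth model: "de" is evicted
```

Capping the cache bounds memory use when a batch of documents spans many languages, at the cost of reloading a model whose language returns after it was evicted.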
Country-Specific Formats
We detect PII in formats specific to each country and region.
European Formats
- German: Personalausweis, Steuer-ID, Reisepass
- French: NIR, Carte Nationale, Permis
- Italian: Codice Fiscale, Carta d'Identità
- Spanish: DNI, NIE, NIF
- Dutch: BSN, Rijbewijs
- Polish: PESEL, NIP, REGON
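Several of these identifiers carry a built-in check digit or letter, which a detector can use to tell real IDs from random digit strings. The sketch below shows the publicly documented checks for the Polish PESEL, the Dutch BSN, and the Spanish DNI. The function names are hypothetical, and this is an illustration of the idea rather than a description of how the platform validates matches.

```python
import re


def is_valid_pesel(value: str) -> bool:
    """Polish PESEL: 11 digits; the last digit is a weighted checksum."""
    if not re.fullmatch(r"[0-9]{11}", value):
        return False
    weights = (1, 3, 7, 9, 1, 3, 7, 9, 1, 3)
    total = sum(w * int(d) for w, d in zip(weights, value))
    return (10 - total % 10) % 10 == int(value[10])


def is_valid_bsn(value: str) -> bool:
    """Dutch BSN: 9 digits that pass the "11-test" (elfproef)."""
    if not re.fullmatch(r"[0-9]{9}", value):
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(w * int(d) for w, d in zip(weights, value))
    return total % 11 == 0


DNI_LETTERS = "TRWAGMYFPDXBNJZSQVHLCKE"


def is_valid_dni(value: str) -> bool:
    """Spanish DNI: 8 digits plus a control letter chosen by number mod 23."""
    match = re.fullmatch(r"([0-9]{8})([A-Z])", value.upper())
    if match is None:
        return False
    return DNI_LETTERS[int(match.group(1)) % 23] == match.group(2)
```

For example, `is_valid_pesel("44051401359")`, `is_valid_bsn("111222333")`, and `is_valid_dni("12345678Z")` all return `True`, while changing the final character of any of them makes the check fail.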
Asia-Pacific Formats
- Japan: My Number, Passport
- India: Aadhaar, PAN, GSTIN, Vehicle Registration
- Thailand: National ID, Tax ID, Passport
- Indonesia: NIK, NPWP, Passport
- Vietnam: CCCD, Tax Code, Passport
- Malaysia: MyKad, Tax ID, Passport
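Pattern-based detection followed by replacement can be sketched with regular expressions for three of the Indian formats. The patterns reflect the commonly documented layouts (PAN: five letters, four digits, one letter; GSTIN: two-digit state code, the ten-character PAN, an entity character, `Z`, and a check character; Aadhaar: twelve digits not starting with 0 or 1). The labels and the `anonymize` function are hypothetical, and a real detector would likely add checksum validation, for instance Aadhaar's Verhoeff check digit.

```python
import re
from typing import List, Tuple

# Order matters only for readability here: word boundaries already stop the
# PAN pattern from matching the PAN embedded inside a GSTIN.
PATTERNS = [
    ("IN_GSTIN", re.compile(r"\b[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]\b")),
    ("IN_PAN", re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")),
    ("IN_AADHAAR", re.compile(r"\b[2-9][0-9]{3}[ -]?[0-9]{4}[ -]?[0-9]{4}\b")),
]


def anonymize(text: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace each match with a <LABEL> placeholder and report what was found."""
    found: List[Tuple[str, str]] = []
    for label, pattern in PATTERNS:
        def replace(match: "re.Match[str]", label: str = label) -> str:
            found.append((label, match.group(0)))
            return f"<{label}>"
        text = pattern.sub(replace, text)
    return text, found


sample = "PAN ABCDE1234F, GSTIN 27ABCDE1234F1Z5, Aadhaar 2345 6789 0123."
redacted, entities = anonymize(sample)
```

With the made-up sample values above, `redacted` becomes `PAN <IN_PAN>, GSTIN <IN_GSTIN>, Aadhaar <IN_AADHAAR>.`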
Americas, Africa & Middle East
- US: SSN, Driver's License, Passport
- UK: National Insurance, NHS Number
- Canada: SIN, Driver's License
- Australia: TFN, Medicare, ABN
- Kenya: National ID, KRA PIN, Passport
- South Africa: ID Number, Tax Number, Passport
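Two of the formats above have well-known check-digit schemes: the UK NHS Number uses a modulus 11 check, and the Canadian SIN uses the Luhn algorithm. The sketch below illustrates both. The function names are hypothetical, and the code is an example of this kind of validation, not the platform's own logic.

```python
import re


def _strip_separators(value: str) -> str:
    """Drop the spaces and hyphens commonly used to group digits."""
    return re.sub(r"[ -]", "", value)


def is_valid_nhs_number(value: str) -> bool:
    """UK NHS Number: 10 digits; the last is a modulus 11 check digit."""
    digits = _strip_separators(value)
    if not re.fullmatch(r"[0-9]{10}", digits):
        return False
    # Weights 10 down to 2 apply to the first nine digits.
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - total % 11
    if check == 11:
        check = 0
    # A computed check value of 10 means the number is never issued.
    return check != 10 and check == int(digits[9])


def is_valid_sin(value: str) -> bool:
    """Canadian SIN: 9 digits that pass the Luhn check."""
    digits = _strip_separators(value)
    if not re.fullmatch(r"[0-9]{9}", digits):
        return False
    total = 0
    for position, char in enumerate(reversed(digits)):
        number = int(char)
        if position % 2 == 1:      # double every second digit from the right
            number *= 2
            if number > 9:
                number -= 9        # same as summing the two digits
        total += number
    return total % 10 == 0
```

For example, `is_valid_nhs_number("943 476 5919")` and `is_valid_sin("046 454 286")` both return `True`, using values widely published as test numbers.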