AI-Assisted Entity Creation
Last Updated: 2026-02-09
Create custom detection patterns using plain language. Describe the type of data you want to detect, and the AI generates an optimized regex pattern for you -- no regex knowledge required.
What It Is
cloak.business includes over 290 built-in entity types covering common PII categories like names, emails, phone numbers, and national IDs across 75+ countries. But every organization has unique identifiers -- internal codes, proprietary reference numbers, custom formats -- that are not covered by default.
AI-Assisted Entity Creation lets you describe these custom patterns in natural language and automatically generates a regex pattern to detect them.
How It Works
- You describe the pattern you want to detect in plain language.
- The AI analyzes your description and generates an optimized regex pattern.
- You test the pattern against sample text.
- You refine the description or edit the regex directly if needed.
- You save the custom entity for use in all future analyses.
Example
Your description:
"I want to detect German license plate numbers like B-AB 1234 or M-XY 567"
AI generates:
- Entity type: DE_LICENSE_PLATE
- Regex pattern: A pattern matching 1-3 letter city codes, a hyphen, 1-2 letters, a space, and 1-4 digits
- Confidence score: 0.85
You can then test it against sample text to verify detections before saving.
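As a rough illustration, a pattern matching that description could look like the sketch below. The exact regex the AI produces is not shown in this document, so the expression and the helper name `find_plates` are assumptions written from the plain-language description above.

```python
import re

# Assumed pattern, written from the description above (not the product's actual output):
# 1-3 letter city code, a hyphen, 1-2 letters, a space, and 1-4 digits.
DE_LICENSE_PLATE = re.compile(r"\b[A-ZÄÖÜ]{1,3}-[A-Z]{1,2} \d{1,4}\b")


def find_plates(text: str) -> list[str]:
    """Return every substring that looks like a German license plate."""
    return DE_LICENSE_PLATE.findall(text)


sample = "Vehicles B-AB 1234 and M-XY 567 were parked next to room A-12."
print(find_plates(sample))  # ['B-AB 1234', 'M-XY 567']
```

The word boundaries (`\b`) keep the pattern from matching inside longer tokens, and "A-12" is skipped because the second group requires letters.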
Use Cases
| Use Case | Example Format |
|---|---|
| Employee IDs | EMP-12345, STAFF/2026/001 |
| Internal project codes | PRJ-ALPHA-0042 |
| Custom reference numbers | REF:2026-02-00123 |
| Proprietary account IDs | ACCT-US-987654 |
| Industry-specific identifiers | Insurance policy numbers, medical record codes |
| Tracking numbers | TRK-EU-20260209-5A3B |
| Hardware serial numbers | SN:XK-449912-B |
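To make the table more concrete, here is a minimal sketch of how a few of these formats might translate into regex patterns. Each pattern is inferred from the single example in the table, so the entity names, character classes, and digit counts are all assumptions; real formats may be looser or stricter.

```python
import re

# Hypothetical patterns inferred from the example formats in the table above.
CUSTOM_PATTERNS = {
    "EMPLOYEE_ID": re.compile(r"\bEMP-\d{5}\b"),                             # EMP-12345
    "PROJECT_CODE": re.compile(r"\bPRJ-[A-Z]+-\d{4}\b"),                     # PRJ-ALPHA-0042
    "TRACKING_NUMBER": re.compile(r"\bTRK-[A-Z]{2}-\d{8}-[0-9A-F]{4}\b"),    # TRK-EU-20260209-5A3B
    "SERIAL_NUMBER": re.compile(r"\bSN:[A-Z]{2}-\d{6}-[A-Z]\b"),             # SN:XK-449912-B
}


def detect(text: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs in order of appearance."""
    hits = []
    for entity_type, pattern in CUSTOM_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), entity_type, match.group(0)))
    hits.sort()  # sort by position in the text
    return [(entity_type, value) for _, entity_type, value in hits]


print(detect("EMP-12345 shipped TRK-EU-20260209-5A3B for PRJ-ALPHA-0042"))
```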
Step-by-Step Guide
1. Open Custom Entities
Navigate to Settings > Custom Entities.
2. Start AI Creation
Click Create with AI.
3. Describe Your Pattern
In the description field, write a plain-language explanation of what you want to detect. Be specific and include examples:
"Our invoice numbers follow the format INV-YYYY-NNNNN, where YYYY is the year and NNNNN is a 5-digit sequence. For example: INV-2026-00142, INV-2025-98001."
The more examples you provide, the more accurate the generated pattern tends to be.
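For the invoice description above, a generated pattern might resemble the following sketch. The regex and the function name `find_invoice_numbers` are illustrative assumptions; the pattern the AI actually returns may differ.

```python
import re

# Assumed pattern for INV-YYYY-NNNNN: a 4-digit year and a 5-digit sequence.
INVOICE_NUMBER = re.compile(r"\bINV-(\d{4})-(\d{5})\b")


def find_invoice_numbers(text: str) -> list[str]:
    """Return all invoice numbers found in the text."""
    return [match.group(0) for match in INVOICE_NUMBER.finditer(text)]


print(find_invoice_numbers("Paid INV-2026-00142 and INV-2025-98001 today."))
# ['INV-2026-00142', 'INV-2025-98001']
```

Fixing the digit counts (`{4}` and `{5}`) is what lets the pattern reject near misses such as `INV-26-142`, which is why stating the lengths in your description helps.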
4. Review the Generated Regex
The AI presents the generated regex pattern along with:
- Entity type name -- automatically suggested based on your description.
- Pattern explanation -- a plain-language breakdown of what the regex matches.
- Confidence score -- the default detection confidence (adjustable).
Review the pattern and make edits if needed.
5. Test with Sample Text
Paste sample text containing your identifier into the test area. The system highlights all matches so you can verify:
- True positives -- correctly detected instances.
- False positives -- incorrectly matched text.
- False negatives -- missed instances.
Refine your description or edit the regex directly until results are satisfactory.
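If you want to run the same check outside the UI, the three categories can be computed with a few set operations. This is a minimal sketch and not part of the product; `evaluate_pattern` and the sample employee-ID pattern are hypothetical, and matches are compared by text rather than by position.

```python
import re


def evaluate_pattern(pattern: str, text: str, expected: set[str]) -> dict[str, set[str]]:
    """Compare regex matches in `text` against the strings you expect to be detected."""
    found = {match.group(0) for match in re.finditer(pattern, text)}
    return {
        "true_positives": found & expected,    # detected and expected
        "false_positives": found - expected,   # detected but not expected
        "false_negatives": expected - found,   # expected but missed
    }


text = "EMP-12345 joined; ticket EMP-7 is unrelated; EMP-00042 left."
expected = {"EMP-12345", "EMP-00042"}

# A loose first attempt also matches the unrelated "EMP-7".
print(evaluate_pattern(r"\bEMP-\d+\b", text, expected))

# Requiring exactly five digits removes the false positive.
print(evaluate_pattern(r"\bEMP-\d{5}\b", text, expected))
```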
6. Save
Click Save. Your custom entity is now available in all analysis and anonymization operations, alongside the built-in entity types.
Limitations
AI-Assisted Entity Creation works best for structured, predictable formats:
| Works Well | Less Effective |
|---|---|
| Fixed-format codes and IDs | Freeform natural-language text |
| Numeric sequences with known delimiters | Highly variable formats with no clear structure |
| Alphanumeric patterns with consistent structure | Context-dependent detection (e.g., "the project name") |
| Patterns with known prefixes or suffixes | Ambiguous short strings that match common words |
For freeform or context-dependent detection, consider using the built-in NLP-based entity types (such as PERSON or ORGANIZATION), which rely on contextual analysis rather than pattern matching.
Manual Alternative
If you prefer writing regex patterns directly, you can skip the AI step:
- Go to Settings > Custom Entities.
- Click Create Manually.
- Enter the entity type name, regex pattern, and confidence score.
- Optionally add context words -- terms that, when found near the pattern, increase detection confidence (e.g., "invoice", "ref", "order" for an invoice number pattern).
- Test and save.
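The idea behind context words can be sketched as follows. This is only an illustration of the concept: the function name, the 40-character window, and the 0.1 boost are assumptions made for the example and do not describe how cloak.business actually scores matches.

```python
import re


def score_matches(text, pattern, base_score, context_words, window=40, boost=0.1):
    """Return (match, score) pairs, raising the score when a context word is nearby.

    `window` and `boost` are illustrative values, not the product's real settings.
    """
    lowered = text.lower()
    results = []
    for match in re.finditer(pattern, text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Look only at the text around the match, not at the match itself.
        nearby = lowered[start:match.start()] + " " + lowered[match.end():end]
        has_context = any(
            re.search(rf"\b{re.escape(word.lower())}\b", nearby)
            for word in context_words
        )
        score = min(1.0, base_score + boost) if has_context else base_score
        results.append((match.group(0), round(score, 2)))
    return results


words = ["invoice", "ref", "order"]
pattern = r"\bINV-\d{4}-\d{5}\b"
print(score_matches("Invoice INV-2026-00142 is overdue.", pattern, 0.6, words))
print(score_matches("Label INV-2026-00142 on the box.", pattern, 0.6, words))
```

In the first sentence the word "Invoice" sits next to the match, so the score rises; in the second there is no supporting term and the base score is kept.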
Managing Custom Entities
- Edit -- update the regex, confidence score, or context words at any time from Settings > Custom Entities.
- Disable -- temporarily turn off a custom entity without deleting it.
- Delete -- permanently remove a custom entity.
- Export/Import -- share custom entity configurations across accounts.
Custom entities are processed alongside the 317 built-in pattern recognizers during every analysis. The backend enforces limits on ad-hoc recognizers: a maximum of 50 recognizers per request, each with up to 10 patterns.
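If you build requests programmatically, a client-side check against these limits can catch problems before the backend rejects them. The sketch below assumes a hypothetical payload shape (a list of dictionaries with `entity_type` and `patterns` keys); only the two numeric limits come from this document.

```python
MAX_RECOGNIZERS_PER_REQUEST = 50   # limit stated in this document
MAX_PATTERNS_PER_RECOGNIZER = 10   # limit stated in this document


def check_ad_hoc_limits(recognizers: list[dict]) -> list[str]:
    """Return a list of human-readable limit violations (empty if none).

    The dictionary layout used here is a hypothetical example, not the real API schema.
    """
    errors = []
    if len(recognizers) > MAX_RECOGNIZERS_PER_REQUEST:
        errors.append(
            f"{len(recognizers)} recognizers exceeds the limit of {MAX_RECOGNIZERS_PER_REQUEST}"
        )
    for recognizer in recognizers:
        count = len(recognizer["patterns"])
        if count > MAX_PATTERNS_PER_RECOGNIZER:
            errors.append(
                f"{recognizer['entity_type']} has {count} patterns, "
                f"limit is {MAX_PATTERNS_PER_RECOGNIZER}"
            )
    return errors


request = [{"entity_type": "INVOICE_NUMBER", "patterns": [r"\bINV-\d{4}-\d{5}\b"]}]
print(check_ad_hoc_limits(request))  # []
```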
Document maintained by cloak.business