AI-Assisted Entity Creation

Last Updated: 2026-02-09


Create custom detection patterns using plain language. Describe the type of data you want to detect, and the AI generates an optimized regex pattern for you -- no regex knowledge required.


What It Is

cloak.business includes over 290 built-in entity types covering common PII categories like names, emails, phone numbers, and national IDs across 75+ countries. But every organization has unique identifiers -- internal codes, proprietary reference numbers, custom formats -- that are not covered by default.

AI-Assisted Entity Creation lets you describe these custom patterns in natural language and automatically generates a regex pattern to detect them.


How It Works

  1. You describe the pattern you want to detect in plain language.
  2. The AI analyzes your description and generates an optimized regex pattern.
  3. You test the pattern against sample text.
  4. You refine the description or edit the regex directly if needed.
  5. You save the custom entity for use in all future analyses.

Example

Your description:

"I want to detect German license plate numbers like B-AB 1234 or M-XY 567"

AI generates:

  • Entity type: DE_LICENSE_PLATE
  • Regex pattern: A pattern matching 1-3 letter city codes, a hyphen, 1-2 letters, a space, and 1-4 digits
  • Confidence score: 0.85

You can then test it against sample text to verify detections before saving.
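As a rough illustration, the pattern described above could look like the following in Python. This document does not show the exact regex the AI produces, so the expression below is an assumption written to fit the stated description (a 1-3 letter city code, a hyphen, 1-2 letters, a space, and 1-4 digits). Allowing umlauts in the city code is also an assumption.

```python
import re

# Hypothetical regex built from the plain-language description above;
# the pattern the product actually generates may differ.
DE_LICENSE_PLATE = re.compile(r"\b[A-ZÄÖÜ]{1,3}-[A-Z]{1,2} \d{1,4}\b")

sample = "Seen near the depot: B-AB 1234 and M-XY 567, but not ABCD-E 12."
matches = DE_LICENSE_PLATE.findall(sample)
print(matches)  # ['B-AB 1234', 'M-XY 567']
```

The word boundaries (`\b`) keep the pattern from matching inside longer strings such as `ABCD-E 12`, which has a four-letter prefix.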


Use Cases

Use Case                       | Example Format
Employee IDs                   | EMP-12345, STAFF/2026/001
Internal project codes         | PRJ-ALPHA-0042
Custom reference numbers       | REF:2026-02-00123
Proprietary account IDs        | ACCT-US-987654
Industry-specific identifiers  | Insurance policy numbers, medical record codes
Tracking numbers               | TRK-EU-20260209-5A3B
Hardware serial numbers        | SN:XK-449912-B
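
To give a feel for what such custom patterns might look like, here is a small Python sketch with one candidate regex for several of the example formats above. The entity names and the regexes are hypothetical: each is inferred from a single example, so the real format in your organization may allow more (or fewer) variations.

```python
import re

# Hypothetical entity names and patterns, each guessed from one example
# value in the use case list. Adjust them to your real formats.
CANDIDATE_PATTERNS = {
    "EMPLOYEE_ID": (r"EMP-\d{5}", "EMP-12345"),
    "PROJECT_CODE": (r"PRJ-[A-Z]+-\d{4}", "PRJ-ALPHA-0042"),
    "REFERENCE_NUMBER": (r"REF:\d{4}-\d{2}-\d{5}", "REF:2026-02-00123"),
    "ACCOUNT_ID": (r"ACCT-[A-Z]{2}-\d{6}", "ACCT-US-987654"),
    "TRACKING_NUMBER": (r"TRK-[A-Z]{2}-\d{8}-[0-9A-F]{4}", "TRK-EU-20260209-5A3B"),
    "SERIAL_NUMBER": (r"SN:[A-Z]{2}-\d{6}-[A-Z]", "SN:XK-449912-B"),
}

# Check that every candidate pattern fully matches its own example.
results = {
    name: bool(re.fullmatch(pattern, example))
    for name, (pattern, example) in CANDIDATE_PATTERNS.items()
}
print(results)
```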

Step-by-Step Guide

1. Open Custom Entities

Navigate to Settings > Custom Entities.

2. Start AI Creation

Click Create with AI.

3. Describe Your Pattern

In the description field, write a plain-language explanation of what you want to detect. Be specific and include examples:

"Our invoice numbers follow the format INV-YYYY-NNNNN, where YYYY is the year and NNNNN is a 5-digit sequence. For example: INV-2026-00142, INV-2025-98001."

Include examples that cover the variations you expect, such as different years or sequence values. More varied examples generally lead to a more accurate generated pattern.
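
For reference, a regex that fits the invoice description above might look like the following Python sketch. The pattern is an assumption derived from the stated format (INV, a 4-digit year, a 5-digit sequence), not the product's actual output.

```python
import re

# Hypothetical regex for the INV-YYYY-NNNNN format described above.
INVOICE_NUMBER = re.compile(r"\bINV-(\d{4})-(\d{5})\b")

text = "Paid INV-2026-00142 and INV-2025-98001; INV-26-1 is malformed."
found = [m.group(0) for m in INVOICE_NUMBER.finditer(text)]
print(found)  # ['INV-2026-00142', 'INV-2025-98001']
```

The two capture groups expose the year and the sequence number separately, which can be handy when checking the matches by hand.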

4. Review the Generated Regex

The AI presents the generated regex pattern along with:

  • Entity type name -- automatically suggested based on your description.
  • Pattern explanation -- a plain-language breakdown of what the regex matches.
  • Confidence score -- the default detection confidence (adjustable).

Review the pattern and make edits if needed.

5. Test with Sample Text

Paste sample text containing your identifier into the test area. The system highlights all matches so you can verify:

  • True positives -- correctly detected instances.
  • False positives -- incorrectly matched text.
  • False negatives -- missed instances.

Refine your description or edit the regex directly until results are satisfactory.
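
The same check can be reproduced outside the product with a few lines of Python. The sketch below is only an illustration of the three categories above; the `evaluate` helper and the sample employee IDs are hypothetical, and it compares matches by their text rather than by position, which is a simplification.

```python
import re

def evaluate(pattern: str, text: str, expected: list[str]) -> dict[str, list[str]]:
    """Sort regex matches into true positives, false positives, false negatives."""
    found = [m.group(0) for m in re.finditer(pattern, text)]
    return {
        "true_positives": [f for f in found if f in expected],
        "false_positives": [f for f in found if f not in expected],
        "false_negatives": [e for e in expected if e not in found],
    }

text = "Tickets EMP-12345 and EMP-7654321 were filed by EMP-9876."
expected = ["EMP-12345", "EMP-9876"]  # the IDs we consider valid in this sample

loose = evaluate(r"EMP-\d{4,5}", text, expected)       # no word boundaries
narrow = evaluate(r"\bEMP-\d{5}\b", text, expected)    # only 5-digit IDs
tuned = evaluate(r"\bEMP-\d{4,5}\b", text, expected)   # 4 or 5 digits, bounded

print(loose["false_positives"])   # ['EMP-76543']
print(narrow["false_negatives"])  # ['EMP-9876']
print(tuned)
```

The loose pattern matches the first five digits of a longer number, the narrow pattern misses the 4-digit ID, and the tuned pattern returns only the two expected IDs.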

6. Save

Click Save. Your custom entity is now available in all analysis and anonymization operations, alongside the built-in entity types.


Limitations

AI-Assisted Entity Creation works best for structured, predictable formats:

Works Well                                        | Less Effective
Fixed-format codes and IDs                        | Freeform natural language descriptions
Numeric sequences with known delimiters           | Highly variable formats with no clear structure
Alphanumeric patterns with consistent structure   | Context-dependent detection (e.g., "the project name")
Patterns with known prefixes or suffixes          | Ambiguous short strings that match common words

For freeform or context-dependent detection, consider using the built-in NLP-based entity types (such as PERSON or ORGANIZATION), which rely on contextual analysis rather than pattern matching.


Manual Alternative

If you prefer writing regex patterns directly, you can skip the AI step:

  1. Go to Settings > Custom Entities.
  2. Click Create Manually.
  3. Enter the entity type name, regex pattern, and confidence score.
  4. Optionally add context words -- terms that, when found near the pattern, increase detection confidence (e.g., "invoice", "ref", "order" for an invoice number pattern).
  5. Test and save.
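
To make the idea of context words concrete, here is a minimal Python sketch of one way a nearby context word could raise a match's score. This document does not describe how the product computes the increase, so the `score_matches` helper, the 20-character window, and the 0.2 boost are all assumptions chosen for illustration.

```python
import re

def score_matches(text, pattern, base_score, context_words, window=20, boost=0.2):
    """Return (match, score) pairs, boosting matches that have a context word nearby.

    Hypothetical scoring rule: the window size and boost value are
    illustrative assumptions, not the product's documented behavior.
    """
    wanted = {w.lower() for w in context_words}
    results = []
    for m in re.finditer(pattern, text):
        before = text[max(0, m.start() - window):m.start()]
        after = text[m.end():m.end() + window]
        nearby_words = set(re.findall(r"\w+", (before + " " + after).lower()))
        score = base_score
        if wanted & nearby_words:
            score = min(1.0, base_score + boost)
        results.append((m.group(0), round(score, 2)))
    return results

text = "Please pay invoice 2026-00142 by Friday. Call 2025-98001 for help."
scored = score_matches(text, r"\b\d{4}-\d{5}\b", 0.5, ["invoice", "ref", "order"])
print(scored)  # [('2026-00142', 0.7), ('2025-98001', 0.5)]
```

Only the first number sits next to the word "invoice", so only its score is raised. This is why context words help with formats that are too generic to trust on their own.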

Managing Custom Entities

  • Edit -- update the regex, confidence score, or context words at any time from Settings > Custom Entities.
  • Disable -- temporarily turn off a custom entity without deleting it.
  • Delete -- permanently remove a custom entity.
  • Export/Import -- share custom entity configurations across accounts.

Custom entities are processed alongside the 317 built-in pattern recognizers during every analysis. The backend enforces limits on ad-hoc recognizers: a maximum of 50 recognizers per request, with up to 10 patterns each.
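
If you assemble ad-hoc recognizers programmatically, a quick pre-flight check against these limits can catch oversized requests early. The Python sketch below assumes a hypothetical payload shape (a list of dictionaries with `entity_type` and `patterns` keys); only the two numeric limits come from this document.

```python
MAX_RECOGNIZERS_PER_REQUEST = 50   # limit stated above
MAX_PATTERNS_PER_RECOGNIZER = 10   # limit stated above

def check_limits(recognizers):
    """Return a list of human-readable problems; an empty list means OK.

    The recognizer dictionaries use a hypothetical shape for illustration.
    """
    problems = []
    if len(recognizers) > MAX_RECOGNIZERS_PER_REQUEST:
        problems.append(
            f"{len(recognizers)} recognizers exceed the limit of "
            f"{MAX_RECOGNIZERS_PER_REQUEST} per request"
        )
    for rec in recognizers:
        count = len(rec.get("patterns", []))
        if count > MAX_PATTERNS_PER_RECOGNIZER:
            problems.append(
                f"{rec.get('entity_type', '?')}: {count} patterns exceed the "
                f"limit of {MAX_PATTERNS_PER_RECOGNIZER}"
            )
    return problems

ok = [{"entity_type": "INVOICE_NUMBER", "patterns": [r"INV-\d{4}-\d{5}"]}]
too_many_patterns = [
    {"entity_type": "EMPLOYEE_ID", "patterns": [rf"EMP{i}-\d+" for i in range(11)]}
]

print(check_limits(ok))                 # []
print(check_limits(too_many_patterns))
```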


Document maintained by cloak.business