Enable anonymization to automatically remove personally identifiable information (PII) from your text before analysis, keeping you compliant with GDPR and other privacy regulations.
Anonymization cannot be undone. Only the anonymized version is stored in Caplena. The original data is automatically and permanently deleted from Caplena’s servers within 7 days of upload.

Enabling Anonymization

Anonymization is configured when creating a new project:
  1. Click New Project and proceed through the upload flow
  2. At the Anonymization step, toggle it on
  3. Select which PII types to anonymize (see below)
  4. Click Continue to proceed with your import

Choosing What to Anonymize

Once enabled, a settings panel lets you select the PII types to mask:
[Image: Anonymization settings panel]
For more granular control, click Advanced Settings to include additional PII types (ZIP code, religion, gender, and more), add custom sensitive data fields, or tailor anonymization to industry-specific compliance requirements.
[Image: Advanced anonymization settings]

Allow-list & Block-list

Fine-tune anonymization behavior with two optional lists:
  - Allow-list: terms that should never be anonymized, even if they look like names (e.g. “Smith”, “John Doe”).
  - Block-list: terms that should always be anonymized, even if they aren’t names (e.g. “is”, “very curious”).
To add terms, click “Add term” or paste a list directly from Excel. Matching is case-insensitive and whole-word only; partial matches are ignored.
Allow-list example: “Smith” and “John Doe” configured as allowed terms:
| Input | Output | Notes |
| --- | --- | --- |
| Hello Mr Smith | Hello Mr Smith | Exact match; preserved due to allow-list |
| Hello Mr smith | Hello Mr smith | Case-insensitive match; preserved |
| Hello Mr Smithson | Hello Mr [NAME_FAMILY_1] | Not an exact match; anonymized |
| I am John Doe | I am John Doe | Exact match; preserved |
| Doe | [NAME_1] | Not a full match; anonymized |
Block-list example: “is” and “very curious” configured as blocked terms:
| Input | Output | Notes |
| --- | --- | --- |
| This is the Smith family | This [CUSTOM_1] the Smith family | “is” is in the block-list; anonymized |
| I am very curious | I am [CUSTOM_1] | Exact phrase match; anonymized |
| Just curious | Just curious | Not a full match; not anonymized |
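The matching rule described above (case-insensitive, whole-word only) can be approximated with a regular expression. The following is a minimal sketch, not Caplena's implementation: the function names `contains_term` and `mask_blocked` are hypothetical, the fixed `[CUSTOM_1]` placeholder ignores however Caplena numbers its placeholders, and word boundaries are assumed to behave like the regex `\b` anchor.

```python
import re


def _term_pattern(terms):
    """Compile one case-insensitive, whole-word pattern for a list of terms."""
    # Longest terms first so multi-word phrases win over shorter overlaps.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    # \b keeps "Smith" from matching inside "Smithson" and "is" inside "This".
    return re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)


def contains_term(text, terms):
    """Return True if any term occurs in text as a whole word or phrase."""
    if not terms:
        return False
    return _term_pattern(terms).search(text) is not None


def mask_blocked(text, blocked_terms, placeholder="[CUSTOM_1]"):
    """Replace every whole-word occurrence of a blocked term (assumed placeholder)."""
    if not blocked_terms:
        return text
    return _term_pattern(blocked_terms).sub(lambda _match: placeholder, text)


if __name__ == "__main__":
    allow = ["Smith", "John Doe"]
    block = ["is", "very curious"]
    print(contains_term("Hello Mr smith", allow))
    print(contains_term("Hello Mr Smithson", allow))
    print(mask_blocked("This is the Smith family", block))
```

A sketch like this can be useful for checking a term list locally before pasting it into the Allow-list or Block-list, for instance to confirm that a short term such as “is” will not be matched inside longer words.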

Address vs. Location

Caplena distinguishes between two related but different PII types:
| Type | What it captures | Example |
| --- | --- | --- |
| Address | Structured location formats: street, number, ZIP, city | 25 Oxford Street, London W1D 2LF |
| Location | General geographic mentions or landmarks | Central Park, Northern California |
“I visited Lake Victoria.” → with Location enabled → “I visited [location].”
“I live at 12 Abbey Road, 23783 London.” → with Address enabled → “I live at [address].”
How Address, Street, City, and ZIP interact:
| Option | Example (Original Text) | Output |
| --- | --- | --- |
| Address only | “Anna Smith, 742 Evergreen Terrace, Springfield, IL 62704, anna@example.com” | “Anna Smith, [LOCATION_ADDRESS_1], anna@example.com” |
| ZIP/Postcode only | “742 Evergreen Terrace, IL 62704” | “742 Evergreen Terrace, [postal code]” |
| City only | “I moved to London last year.” | “I moved to [city] last year.” |
| Street only | “She lives at Oxford Street.” | “She lives at [street].” |
If you anonymize too much or something goes wrong, you’ll need to re-upload the data in a new project. Reach out to support — we’re happy to help and will reimburse credits if needed.

Anonymization & Translation

Anonymization runs before translation. Anonymized source text will also produce anonymized translations — the two features work seamlessly together.
Not all languages are currently supported for anonymization. Texts in unsupported languages may be only partially anonymized.
Supported languages:
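| Language | ISO Code |
| --- | --- |
| Afrikaans | af |
| Arabic | ar |
| Bambara | bm |
| Belarusian | be |
| Bengali | bn |
| Bulgarian | bg |
| Burmese | my |
| Cantonese (Traditional) | zh-TW |
| Catalan | ca |
| Croatian | hr |
| Czech | cs |
| Danish | da |
| Dutch | nl |
| English | en |
| Estonian | et |
| Finnish | fi |
| French | fr |
| Georgian | ka |
| German | de |
| Greek | el |
| Hebrew | he |
| Hindi | hi |
| Hungarian | hu |
| Icelandic | is |
| Indonesian | id |
| Italian | it |
| Japanese | ja |
| Khmer | km |
| Korean | ko |
| Latvian | lv |
| Lithuanian | lt |
| Luxembourgish | lb |
| Malay | ms |
| Mandarin (Simplified) | zh-CN |
| Moldovan | ro |
| Norwegian (Bokmål) | nb |
| Persian (Farsi) | fa |
| Polish | pl |
| Portuguese | pt |
| Punjabi | pa |
| Romanian | ro |
| Russian | ru |
| Slovak | sk |
| Slovenian | sl |
| Spanish | es |
| Swahili | sw |
| Swedish | sv |
| Tagalog | tl |
| Tamil | ta |
| Thai | th |
| Turkish | tr |
| Ukrainian | uk |
| Vietnamese | vi |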
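Because texts in unsupported languages may be only partially anonymized, it can help to check language codes before upload. The sketch below is one possible pre-upload check, not part of Caplena: the names `SUPPORTED_ANONYMIZATION_LANGS`, `is_supported`, and `flag_unsupported` are hypothetical, the codes are copied from the list above, and it assumes you already know each row's language code (for example from a column in your own data).

```python
# Codes copied from the supported-languages list, stored lower-case.
SUPPORTED_ANONYMIZATION_LANGS = frozenset({
    "af", "ar", "bm", "be", "bn", "bg", "my", "zh-tw", "ca", "hr", "cs",
    "da", "nl", "en", "et", "fi", "fr", "ka", "de", "el", "he", "hi", "hu",
    "is", "id", "it", "ja", "km", "ko", "lv", "lt", "lb", "ms", "zh-cn",
    "ro", "nb", "fa", "pl", "pt", "pa", "ru", "sk", "sl", "es", "sw", "sv",
    "tl", "ta", "th", "tr", "uk", "vi",
})


def is_supported(lang_code):
    """Return True if the code appears in the supported list (case-insensitive)."""
    return lang_code.strip().lower() in SUPPORTED_ANONYMIZATION_LANGS


def flag_unsupported(rows):
    """Return the (text, lang_code) rows whose language is not in the list.

    These rows may be only partially anonymized and deserve a manual review.
    """
    return [(text, code) for text, code in rows if not is_supported(code)]


if __name__ == "__main__":
    sample = [
        ("I live in London.", "en"),
        ("Vivo en Madrid.", "es"),
        ("Ngihlala eGoli.", "zu"),  # Zulu is not in the list above
    ]
    for text, code in flag_unsupported(sample):
        print(f"Review before upload ({code}): {text}")
```

Rows flagged this way could be reviewed manually or uploaded to a separate project, since anonymization cannot be changed after import.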
Last modified on May 18, 2026