Juditha

A super-fast in-process lookup service for canonical names, backed by tantivy.

juditha exists to tame the noise that follows from Named Entity Recognition: given a huge list of known names (company registries, persons of interest, sanctions lists), it tells you whether a span produced by your NER pipeline corresponds to one of them, even when the casing, accents, token order, or spelling differs.

The implementation uses a pre-populated names database and index. The data is either FollowTheMoney entities or simply a list of names.
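
To make the kinds of variation mentioned above (casing, accents, token order, small spelling differences) concrete, here is a minimal sketch in plain Python using only the standard library. It is not juditha's API and not its actual matching logic, which relies on a tantivy index; the function names `normalize_name` and `is_known` and the `threshold` value are hypothetical and chosen for illustration only.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Fold case, strip accents and sort tokens (illustrative only)."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Case-fold and sort tokens so that token order no longer matters.
    tokens = stripped.casefold().split()
    return " ".join(sorted(tokens))


def is_known(span: str, known_names: list[str], threshold: float = 0.9) -> bool:
    """Return True if the span is close enough to any known name.

    The similarity ratio tolerates small spelling differences; the
    threshold is an assumption, not a value taken from juditha.
    """
    target = normalize_name(span)
    return any(
        SequenceMatcher(None, target, normalize_name(known)).ratio() >= threshold
        for known in known_names
    )


known = ["Juditha Dommer", "Johann Pachelbel"]
print(is_known("dommer JUDITHA", known))    # different casing and token order
print(is_known("Pachelbel JÓHANN", known))  # stray accent
print(is_known("Juditha Domer", known))     # small spelling difference
print(is_known("Anna Magdalena", known))    # unrelated name
```

A linear scan like this does not scale to a huge list of names, which is why a real lookup service would use a prebuilt index instead.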

What you can do with it

Where to go next

The name

Juditha Dommer was the daughter of a coppersmith and raised seven children, while her husband Johann Pachelbel wrote a canon.

Versioning

To signal compatibility with followthemoney, juditha follows the same major version, currently 4.x.x.

juditha, (C) 2024 investigativedata.io. (C) 2025, 2026 Data and Research Center – DARC. Licensed under AGPLv3 or later. See NOTICE and LICENSE.