Query
One of the main features of ftmq
is a high-level query interface for Follow The Money data that is stored in a file or a statement-based store powered by nomenklatura.
To get familiar with the Follow The Money ecosystem, you can have a look at this pad here.
Working with Query
The Query
instance can be used to filter a stream of entities or to lookup entities from a store. The object itself acts independently and can be used in other applications as well.
Query
objects can be chained:
Filter lookups
The Query.where
function can take filters for datasets, schema, properties (values) or entity ids.
Dataset
Schema
Property
Any valid property from the model can be queried.
Entity ID
Combining
These filters can be combined in a single where
call:
Or chained:
Value comparators
Lookups not only filter for equal value lookup, but as well provide these comparators that can be appended with __<comp>
to the property lookup.
eq
/=
- equalsnot
- not equalsgt
- greater thanlt
- lower thangte
- greater or equallte
- lower or equallike
- SQLishLIKE
(use%
placeholders)ilike
- SQLishILIKE
, case-insensitive (use%
placeholders)in
- test if the value is in the given array of filter valuesnot_in
- test if the value is not in the given array of filter valuesnull
- testing forNULL
values
# Payments with a value >= 1000 €
q = Query().where(schema="Payment", amountEur__gte=1000)
# Events before october 2022
q = Query().where(schema="Event", date__lt="2022-10")
# Persons starting with "J"
q = Query().where(schema="Person", name__startswith="J")
# the same could be accomplished:
q = Query().where(schema="Person", name__ilike="j%")
# All Janes and Joes
q = Query().where(firstName__in=["Jane", "Joe"])
# ID prefixing
q = Query().where(id__startswith="de-")
# Exclude a specific legal form
q = Query().where(legalForm__not="gGmbH")
# Filter for null (empty) properties
q = Query().where(startDate__null=True)
Sorting
A Query
result can be sorted by properties in ascending or descending order. Subsequent calls of Query.order_by
override a previous order_by
definition.
# sort by name
q = Query().order_by("name")
# sort by last name in descending order
q = q.order_by("lastName", ascending=False)
Slicing
Slicing can be used to get the top first results:
q = Query()[:100]
# get ten results starting from the 5th
q = q[5:15]
# Get only the 2nd result
q = q[1] # 0-index
Putting it all together
Get the 10 highest Payments
of a specific dataset within october 2024:
q = Query().where(dataset="my_dataset") \
.where(schema="Payment") \
.where(date__gte="2024-10", date__lt="2024-11") \
.order_by("amountEur", ascending=False)
q = q[:10]
Using a Query
instance
The query object can be passed to smart_read_proxies
:
from ftmq.io import smart_read_proxies
from ftmq import Query
q = Query().where(dataset="my_dataset", schema="Event")
for proxy in smart_read_proxies("s3://data/entities.ftm.json", query=q):
assert proxy.schema.name == "Event"
Use for a store view:
from ftmq.store import get_store
from ftmq import Query
q = Query().where(dataset="my_dataset", schema="Event")
store = get_store("sqlite:///followthemoney.store")
view = store.query()
for proxy in view.entities(q):
assert proxy.schema.name == "Event"