Skip to content

Query

One of the main features of ftmq is a high-level query interface for Follow The Money data that is stored in a file or a statement-based store powered by nomenklatura.

To get familiar with the Follow The Money ecosystem, you can have a look at this pad here.

Working with Query

The Query instance can be used to filter a stream of entities or to lookup entities from a store. The object itself acts independently and can be used in other applications as well.

from ftmq import Query

# a basic query object to filter for a schema
q = Query(schema="Person")

Query objects can be chained:

q = Query()
q = q.where(dataset="my_dataset").where(schema="Person")

Filter lookups

The Query.where function can take filters for datasets, schema, properties (values) or entity ids.

Dataset

q = Query().where(dataset="my_dataset")

Schema

q = Query().where(schema="Person")

Property

Any valid property from the model can be queried.

q = Query().where(name="Jane")

Entity ID

q = Query().where(id="id-jane")

Combining

These filters can be combined in a single where call:

q = Query().where(dataset="my_dataset", schema="Person", name="Jane")

Or chained:

q = Query().where(dataset="my_dataset").where(schema="Person").where(name="Jane")

Value comparators

Lookups not only filter for equal value lookup, but as well provide these comparators that can be appended with __<comp> to the property lookup.

  • eq / = - equals
  • not - not equals
  • gt - greater than
  • lt - lower than
  • gte - greater or equal
  • lte - lower or equal
  • like - SQLish LIKE (use % placeholders)
  • ilike - SQLish ILIKE, case-insensitive (use % placeholders)
  • in - test if the value is in the given array of filter values
  • not_in - test if the value is not in the given array of filter values
  • null - testing for NULL values
# Payments with a value >= 1000 €
q = Query().where(schema="Payment", amountEur__gte=1000)

# Events before october 2022
q = Query().where(schema="Event", date__lt="2022-10")

# Persons starting with "J"
q = Query().where(schema="Person", name__startswith="J")
# the same could be accomplished:
q = Query().where(schema="Person", name__ilike="j%")

# All Janes and Joes
q = Query().where(firstName__in=["Jane", "Joe"])

# ID prefixing
q = Query().where(id__startswith="de-")

# Exclude a specific legal form
q = Query().where(legalForm__not="gGmbH")

# Filter for null (empty) properties
q = Query().where(startDate__null=True)

Sorting

A Query result can be sorted by properties in ascending or descending order. Subsequent calls of Query.order_by override a previous order_by definition.

# sort by name
q = Query().order_by("name")

# sort by last name in descending order
q = q.order_by("lastName", ascending=False)

Slicing

Slicing can be used to get the top first results:

q = Query()[:100]

# get ten results starting from the 5th
q = q[5:15]

# Get only the 2nd result
q = q[1] # 0-index

Putting it all together

Get the 10 highest Payments of a specific dataset within october 2024:

q = Query().where(dataset="my_dataset") \
    .where(schema="Payment") \
    .where(date__gte="2024-10", date__lt="2024-11") \
    .order_by("amountEur", ascending=False)
q = q[:10]

Using a Query instance

The query object can be passed to smart_read_proxies:

from ftmq.io import smart_read_proxies
from ftmq import Query

q = Query().where(dataset="my_dataset", schema="Event")

for proxy in smart_read_proxies("s3://data/entities.ftm.json", query=q):
    assert proxy.schema.name == "Event"

Use for a store view:

from ftmq.store import get_store
from ftmq import Query

q = Query().where(dataset="my_dataset", schema="Event")
store = get_store("sqlite:///followthemoney.store")
view = store.query()

for proxy in view.entities(q):
    assert proxy.schema.name == "Event"

See stores documentation

Reference

Full reference