Export
aggregate fragments to export
handle(ctx, *args, **kwargs)
The default handler of the export stage. It iterates through the entities store, calculates dataset statistics and writes the entities and dataset index to json files.
If neither entities_uri nor index_uri is set, no statistics are computed for the Dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ctx | DatasetContext | The current runtime | required |
Returns:
| Type | Description |
|---|---|
| Dataset | The |
Source code in investigraph/logic/export.py
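The behaviour described for the default handler can be illustrated with a minimal, standard-library-only sketch. It is not investigraph's implementation: the function name `export_entities`, its parameters, and the shape of the index dictionary are all hypothetical, and entities are assumed to be plain dictionaries with a `schema` key.

```python
import json
from collections import Counter
from pathlib import Path


def export_entities(entities, entities_path=None, index_path=None, name="example"):
    """Write entities line by line and a small index with statistics.

    All names here are hypothetical; this is not investigraph's API.
    """
    # Mirror the documented behaviour: without any target, skip the statistics.
    if entities_path is None and index_path is None:
        return {"name": name}

    schemata = Counter()
    out = open(entities_path, "w") if entities_path else None
    try:
        for entity in entities:
            # Count entities per schema while iterating only once.
            schemata[entity["schema"]] += 1
            if out is not None:
                out.write(json.dumps(entity, sort_keys=True) + "\n")
    finally:
        if out is not None:
            out.close()

    index = {
        "name": name,
        "entity_count": sum(schemata.values()),
        "schemata": dict(schemata),
    }
    if index_path:
        Path(index_path).write_text(json.dumps(index, indent=2))
    return index
```

Collecting the statistics in the same loop that writes the entities means the store is only traversed once, which matters for large datasets.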
proxy_merge(self, other)
Used to override EntityProxy.merge in investigraph.__init__.py
Source code in investigraph/logic/export.py
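Overriding a method on an existing class, as described for proxy_merge, generally works by assigning a plain function to the class attribute. The sketch below shows that mechanism only. `Proxy` is a hypothetical stand-in class, not followthemoney's EntityProxy, and the merge logic is an assumption made for illustration rather than what investigraph actually does.

```python
class Proxy:
    """Stand-in for an entity proxy class; not the followthemoney implementation."""

    def __init__(self, id, properties):
        self.id = id
        self.properties = properties

    def merge(self, other):
        # Original behaviour in this sketch: take the other proxy's values only.
        self.properties = dict(other.properties)
        return self


def proxy_merge(self, other):
    # Replacement behaviour (assumed): union the values of both proxies per property.
    for prop, values in other.properties.items():
        merged = set(self.properties.get(prop, [])) | set(values)
        self.properties[prop] = sorted(merged)
    return self


# Assigning the function to the class attribute replaces the method
# for every existing and future instance.
Proxy.merge = proxy_merge
```

Because the assignment happens at import time of the module that performs it, every later call to `merge` on any instance uses the replacement.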
Metadata
Location of the resulting dataset metadata file, typically called index.json. Because investigraph uses fsspec (see above), this can be any location that fsspec can write to:
config.yml
command line
investigraph run ... --index-uri sftp://username:password@host/<dataset>/index.json
command line
investigraph run ... --entities-uri ...
export.index_uri
URI for the dataset metadata output. Can be anything that fsspec understands.
Example: s3://<bucket-name>/<dataset-name>/index.json
Default: ./data/<dataset-name>/index.json
export.entities_uri
URI for the transformed entities output. Can be anything that fsspec understands, or a SQL endpoint (for use with followthemoney-store).
Examples:
s3://<bucket-name>/<dataset-name>/entities.ftm.json
postgresql://user:password@host:port/database
Default: ./data/<dataset-name>/entities.ftm.json
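The two settings above can be illustrated with a small standard-library sketch that builds the documented default locations and distinguishes file-like targets from SQL endpoints by URI scheme. The helper names `default_uris` and `target_kind` and the `SQL_SCHEMES` set are hypothetical; how investigraph itself decides between fsspec and a SQL store is not shown in this section.

```python
from urllib.parse import urlsplit

# Hypothetical set of schemes treated as SQL endpoints in this sketch.
SQL_SCHEMES = {"postgresql", "sqlite", "mysql"}


def default_uris(dataset):
    """Build the documented default locations for a dataset name."""
    base = f"./data/{dataset}"
    return f"{base}/entities.ftm.json", f"{base}/index.json"


def target_kind(uri):
    """Classify a URI as a file-like target or a SQL endpoint."""
    # Drop a driver suffix such as "postgresql+psycopg2" before comparing.
    scheme = urlsplit(uri).scheme.split("+")[0]
    return "sql" if scheme in SQL_SCHEMES else "file"
```

For example, `target_kind("s3://bucket/ds/entities.ftm.json")` yields `"file"`, while a `postgresql://` URI yields `"sql"`.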