leakrfc
Usage:
Options:
--version / --no-version Show version [default: no-version]
-d TEXT Dataset foreign_id
--install-completion Install completion for the current shell.
--show-completion Show completion for the current shell, to copy it or customize the installation.
aleph
Aleph related operations
Usage:
load-catalog
Load entities into an Aleph instance
Usage:
Options:
URI Catalog index.json URI [required]
--host TEXT Aleph host
--api-key TEXT Aleph API key
--metadata / --no-metadata Update collection metadata [default: metadata]
load-dataset
Load entities into an Aleph instance
Usage:
Options:
URI Dataset index.json URI [required]
--host TEXT Aleph host
--api-key TEXT Aleph API key
--foreign-id TEXT Aleph foreign_id (if different from dataset)
--metadata / --no-metadata Update collection metadata [default: metadata]
sync
Sync a leakrfc dataset to Aleph
Usage:
Options:
--host TEXT Aleph host
--api-key TEXT Aleph API key
--folder TEXT Base folder path
--foreign-id TEXT Aleph foreign_id (if different from dataset)
--metadata / --no-metadata Update collection metadata [default: metadata]
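The options above can be combined into a single invocation. The following is a minimal sketch that only assembles an argument list and does not run anything. `build_aleph_sync_command` is a hypothetical helper, the host and dataset names are placeholders, and placing the top-level `-d` option before the `aleph sync` subcommand is an assumption based on how group-level options usually work in Click/Typer applications.

```python
import shlex


def build_aleph_sync_command(dataset, host=None, api_key=None,
                             folder=None, foreign_id=None, metadata=True):
    """Assemble an argv list for `leakrfc aleph sync` (hypothetical helper)."""
    # Assumption: the top-level -d option goes before the subcommand.
    argv = ["leakrfc", "-d", dataset, "aleph", "sync"]
    if host:
        argv += ["--host", host]
    if api_key:
        argv += ["--api-key", api_key]
    if folder:
        argv += ["--folder", folder]
    if foreign_id:
        argv += ["--foreign-id", foreign_id]
    if not metadata:
        # Metadata updates are on by default, so only the negative flag is emitted.
        argv.append("--no-metadata")
    return argv


cmd = build_aleph_sync_command(
    "my_dataset",                       # placeholder dataset foreign_id
    host="https://aleph.example.org",   # placeholder Aleph host
    foreign_id="my_collection",         # placeholder Aleph foreign_id
    metadata=False,
)
# Print a shell-quoted command line for inspection.
print(shlex.join(cmd))
```

Building the list first and printing it makes it easy to review the command before passing it to `subprocess.run`.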
catalog
Show catalog for all existing datasets
Usage:
Options:
-o TEXT [default: -]
--collect-stats / --no-collect-stats Collect document statistics [default: no-collect-stats]
--names-only / --no-names-only Only show dataset names (`foreign_id`) [default: no-names-only]
config
Print current runtime configuration for base archive or given dataset
Usage:
crawl
Crawl documents from local or remote sources
Usage:
Options:
URI [required]
-o TEXT Write results to this destination [default: -]
--skip-existing / --no-skip-existing Skip already existing files (doesn't check actual similarity) [default: skip-existing]
--extract / --no-extract Extract archives via `patool` [default: no-extract]
--extract-keep-source / --no-extract-keep-source Keep the source archive when extracting [default: no-extract-keep-source]
--extract-ensure-subdir / --no-extract-ensure-subdir Ensure a subdirectory with the package filename when extracting [default: no-extract-ensure-subdir]
--exclude TEXT Exclude paths glob pattern
--include TEXT Include paths glob pattern
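To illustrate how `--include` and `--exclude` glob patterns can narrow a crawl, here is a small sketch using Python's standard `fnmatch` module. It is not leakrfc's own matching code: `is_selected` is a hypothetical helper, the paths are made up, and the exact glob semantics the tool applies (for example, whether `*` crosses directory separators) are not specified in this reference.

```python
from fnmatch import fnmatchcase


def is_selected(path, include=None, exclude=None):
    """Return True if the path passes the include and exclude globs."""
    # If an include pattern is given, the path must match it.
    if include is not None and not fnmatchcase(path, include):
        return False
    # If an exclude pattern is given, the path must not match it.
    if exclude is not None and fnmatchcase(path, exclude):
        return False
    return True


# Made-up example paths.
paths = ["reports/2023/a.pdf", "reports/2023/a.tmp", "images/logo.png"]
selected = [p for p in paths
            if is_selected(p, include="reports/*", exclude="*.tmp")]
print(selected)  # ['reports/2023/a.pdf']
```

Note that in `fnmatch`, `*` also matches `/`, which is why `reports/*` selects files in nested folders here.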
diff
Show documents diff for given version
Usage:
Options:
export
Export a complete dataset in LeakRFC format
Usage:
Options:
get
Retrieve a file from the dataset archive and write it to the output URI (default: stdout)
Usage:
Options:
head
Retrieve file info from the dataset archive and write it to the output URI (default: stdout)
Usage:
Options:
ls
List all files in dataset archive
Usage:
Options:
-o TEXT [default: -]
--keys / --no-keys Show only keys [default: no-keys]
--checksums / --no-checksums Show only checksums [default: no-checksums]
make
Make or update a leakrfc dataset and check integrity
Usage:
Options:
-o TEXT [default: -]
--check-integrity / --no-check-integrity Check checksums [default: check-integrity]
--cleanup / --no-cleanup Cleanup (delete) unreferenced metadata [default: cleanup]
--metadata-only / --no-metadata-only Check document metadata only [default: no-metadata-only]
--dataset-metadata-only / --no-dataset-metadata-only Compute dataset metadata only [default: no-dataset-metadata-only]
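The idea behind `--check-integrity` (comparing stored checksums against file contents) can be sketched as follows. This is an illustration only, not leakrfc's implementation: the helper names are hypothetical, the data is held in memory rather than read from an archive, and SHA-1 is an assumed algorithm, since this reference does not state which hash the tool uses.

```python
import hashlib


def file_checksum(data: bytes, algorithm: str = "sha1") -> str:
    """Hex digest of the given bytes (the algorithm is an assumption)."""
    return hashlib.new(algorithm, data).hexdigest()


def find_mismatches(files: dict, recorded: dict) -> list:
    """Return the keys whose content does not match the recorded checksum."""
    return [key for key, data in files.items()
            if file_checksum(data) != recorded.get(key)]


# Made-up in-memory "archive" and the checksums recorded for it.
files = {"a.txt": b"hello", "b.txt": b"world"}
recorded = {key: file_checksum(data) for key, data in files.items()}

# Simulate a file that changed after its checksum was recorded.
files["b.txt"] = b"tampered"
mismatches = find_mismatches(files, recorded)
print(mismatches)  # ['b.txt']
```

A key missing from `recorded` is also reported, because `recorded.get(key)` returns `None` and never equals a digest.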
memorious
Memorious related operations
Usage:
sync
Sync a memorious data store into a leakrfc dataset
Usage:
Options:
-i TEXT [required]
--name-only / --no-name-only Use only file name as key [default: no-name-only]
--strip-prefix TEXT Strip from file key prefix
--key-template TEXT Template to generate key
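The effect of `--name-only` and `--strip-prefix` on a file key can be pictured with the sketch below. It is an assumption-laden illustration rather than the tool's logic: `derive_key` is a hypothetical helper, the example key is made up, and the precedence between the two options (name-only wins here) is a guess. `--key-template` is left out because its template syntax is not described in this reference.

```python
from pathlib import PurePosixPath


def derive_key(original_key, name_only=False, strip_prefix=None):
    """Derive a dataset key from a source key (hypothetical helper)."""
    if name_only:
        # Keep only the final path component, e.g. "report.pdf".
        return PurePosixPath(original_key).name
    if strip_prefix and original_key.startswith(strip_prefix):
        # Drop the prefix and any slash left at the start.
        return original_key[len(strip_prefix):].lstrip("/")
    return original_key


key = "crawler/2024/report.pdf"  # made-up source key
print(derive_key(key))                          # crawler/2024/report.pdf
print(derive_key(key, name_only=True))          # report.pdf
print(derive_key(key, strip_prefix="crawler"))  # 2024/report.pdf
```

Using only the file name as the key can cause collisions when different folders contain files with the same name, so stripping a prefix is the more conservative choice.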
versions
Show versions of dataset
Usage: