## Install & run
The easiest way to run this project is with Docker Compose:

```shell
git clone https://gitlab.com/panoramax/server/meta-catalog.git
cd meta-catalog/
docker compose -p meta-catalog up --build
```
Then, the meta-catalog is accessible at `http://localhost:9000` (without any data for the moment).
You can also run each service without Docker, as explained in each service's documentation.
## Database & migrations
Each service needs access to a PostGIS database.
If you use the Docker Compose approach, the database is provided and its schema is kept up to date; otherwise, you need to provide one yourself.
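If you are not using Docker Compose, one way to get such a database is the official PostGIS image (a sketch; the container name is arbitrary and the credentials match the examples used throughout this documentation):

```shell
# Start a throwaway PostGIS database matching the example
# connection string postgresql://username:password@localhost:5432/panoramax
docker run --name meta-catalog-db -d \
  -e POSTGRES_USER=username \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_DB=panoramax \
  -p 5432:5432 \
  postgis/postgis
```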
Note: in all the following examples, the database is on `localhost:5432`, named `panoramax`, and accessed with `username:password` (so the connection string is `postgresql://username:password@localhost:5432/panoramax`).
When using Docker Compose, the database schema is updated automatically by the `migrations` service.
## Loading data
The `harvester` directory contains the code needed to harvest the data from several instances into the meta-catalog.

The harvester depends on a simple CSV file describing the instances to crawl. The CSV should have the columns:

- `id`: identifier of the instance
- `api_url`: root URL of the STAC catalog. Note: for GeoVisio instances, don't forget the trailing `/api`
You can check the config file of the French public Panoramax instances for an example.
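For illustration, a minimal instances file could be written like this (the second instance name and URL are hypothetical examples, not real endpoints):

```shell
# Write a minimal instances CSV with the two expected columns.
cat > instances.csv <<'EOF'
id,api_url
ign,https://panoramax.ign.fr/api
example,https://panoramax.example.org/api
EOF
```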
Provide the path to the CSV file in the `INSTANCES_FILE` environment variable and run the additional Docker Compose file containing the harvester and a scheduler (similar to a cron job):

```shell
INSTANCES_FILE=<CSV_FILE> docker compose -p meta-catalog -f docker-compose.yml -f docker-compose-harvester.yml up -d
```

This will crawl the requested instances every 5 minutes.
You can also run the Python code directly:

```shell
cd harvester
python -m pip install --upgrade virtualenv
virtualenv .venv
source .venv/bin/activate
pip install -e .
stac-harvester harvest --db "postgresql://username:password@localhost:5432/panoramax" --instances-files ../panoramax-fr-instances.csv --target-hostname=https://panoramax.fr/api --collections-limit=5
```
This will import 5 collections from the panoramax.ign.fr instance into your database.
Note: the `--target-hostname` parameter is important, since all STAC links from the crawled API will be rewritten with this hostname. It should be the public hostname of your meta-catalog API. Your meta-catalog can still be served behind several URLs or behind a proxy, since the API respects the `Host` header it receives for all its STAC links.
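This behavior can be sketched with a quick check (the `/api/collections` path and the local port are assumptions based on the examples above; the links in the response should use whatever `Host` the client sent, not the harvested `--target-hostname`):

```shell
# Hypothetical check: send a custom Host header and inspect the links
# the API returns; they should be built from "other-domain.example.org".
curl -s -H "Host: other-domain.example.org" http://localhost:9000/api/collections | head
```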
You can also give the parameters without a configuration file:

```shell
stac-harvester harvest --db "postgresql://username:password@localhost:5432/panoramax" --instance-name ign --instance-url https://panoramax.ign.fr/api --target-hostname=https://panoramax.fr/api
```
All CLI parameters can be listed with `stac-harvester --help`.
If you want to run a systemd service to crawl automatically, you can check the detailed documentation section.
## Instance configuration
The instance configuration (from the instance's `/configuration` endpoint) is synchronized into the database every day.
It can also be updated manually by running:
```shell
cd harvester
python -m pip install --upgrade virtualenv
virtualenv .venv
source .venv/bin/activate
pip install -e .
stac-harvester sync-configuration --db "postgresql://username:password@localhost:5432/panoramax" --instance instance_name --instance another_instance_name
```
## Install standalone API
The API is written in Rust. It is started as the `api` service in the Docker Compose file.
Info: you need Rust to build the API. The best way is to follow the official documentation.
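If you want a quick way to do it, the standard rustup installer works (review the script before piping it to a shell in a production setup):

```shell
# Install the Rust toolchain via rustup (official installer).
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```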
Note: you should have an up-to-date Rust version, at least 1.81.0.
To build and run the API:
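A minimal sketch, assuming a standard Cargo project in the `api` directory and that the database connection string is passed through an environment variable (both the directory and the variable name are assumptions; check the service's own documentation for the actual configuration):

```shell
# Build an optimized binary, then run it against the example database.
cd api
cargo build --release
DB_URL="postgresql://username:password@localhost:5432/panoramax" cargo run --release
```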
If you want to run the API as a systemd service, you can use and adapt the example configuration file.