This is an automated email from the ASF dual-hosted git repository.
eamonford pushed a commit to branch update-docs
in repository https://gitbox.apache.org/repos/asf/incubator-sdap-ingester.git
commit ca3898e14aab479c5578b5f8e2a1a72183436c60
Author: Eamon Ford <eamonford@gmail.com>
AuthorDate: Mon Dec 7 13:58:59 2020 -0800
Update Collection Manager readme
---
collection_manager/README.md | 80 ++++++++++++++++++++++++++++++++------------
1 file changed, 59 insertions(+), 21 deletions(-)
diff --git a/collection_manager/README.md b/collection_manager/README.md
index 84df468..bc630cd 100644
--- a/collection_manager/README.md
+++ b/collection_manager/README.md
@@ -26,7 +26,7 @@ From `incubator-sdap-ingester`, run:
A path to a collections configuration file must be passed in to the Collection Manager
at startup via the `--collections-path` parameter. Below is an example of what the
-collections configuration file should look like:
+collections configuration file could look like:
```yaml
# collections.yaml
@@ -34,35 +34,73 @@ collections configuration file should look like:
collections:
# The identifier for the dataset as it will appear in NEXUS.
- - id: TELLUS_GRACE_MASCON_CRI_GRID_RL05_V2_LAND
+ - id: "CSR-RL06-Mascons_LAND"
- # The local path to watch for NetCDF granule files to be associated with this dataset.
- # Supports glob-style patterns.
- path: /opt/data/grace/*land*.nc
-
- # The name of the NetCDF variable to read when ingesting granules into NEXUS for this dataset.
- variable: lwe_thickness
+ # The path to watch for NetCDF granule files to be associated with this dataset.
+ # This can also be an S3 path prefix, for example "s3://my-bucket/path/to/granules/"
+ path: "/data/CSR-RL06-Mascons-land/"
# An integer priority level to use when publishing messages to RabbitMQ for historical data.
- # Higher number = higher priority.
- priority: 1
+ # Higher number = higher priority. Scale is 1-10.
+ priority: 1
# An integer priority level to use when publishing messages to RabbitMQ for forward-processing data.
- # Higher number = higher priority.
+ # Higher number = higher priority. Scale is 1-10.
forward-processing-priority: 5
- - id: TELLUS_GRACE_MASCON_CRI_GRID_RL05_V2_OCEAN
- path: /opt/data/grace/*ocean*.nc
- variable: lwe_thickness
- priority: 2
- forward-processing-priority: 6
+ # The type of projection to use when processing granules in this collection.
+ # Accepted values are Grid, ECCO, TimeSeries, or Swath.
+ projection: Grid
+
+ dimensionNames:
+ # The name of the primary variable
+ variable: lwe_thickness
+
+ # The name of the latitude variable
+ latitude: lat
+
+ # The name of the longitude variable
+ longitude: lon
+
+ # The name of the depth variable (only include if depth variable exists)
+ depth: Z
+
+ # The name of the time variable (only include if time variable exists)
+ time: Time
+
+ # This section is an index of each dimension on which the primary variable is dependent, mapped to their desired slice sizes.
+ slices:
+ Z: 1
+ Time: 1
+ lat: 60
+ lon: 60
+
+ - id: ocean-bottom-pressure
+ path: /data/OBP/
+ priority: 6
+ forward-processing-priority: 7
+ projection: ECCO
+ dimensionNames:
+ latitude: YC
+ longitude: XC
+ time: time
+ # "tile" is required when using the ECCO projection. This refers to the name of the dimension containing the ECCO tile index.
+ tile: tile
+ variable: OBP
+ slices:
+ time: 1
+ tile: 1
+ i: 30
+ j: 30
+```
- - id: AVHRR_OI-NCEI-L4-GLOB-v2.0
- path: /opt/data/avhrr/*.nc
- variable: analysed_sst
- priority: 1
+Note that the dimensions listed under `slices` will not necessarily match those under `dimensionNames`. This is because sometimes
+the actual dimensions are referenced by index variables.
+> **Tip:** An easy way to determine which variables go under `dimensionNames` and which ones go under `slices` is that the variables
+> on which the primary variable is dependent should go under `slices`, and the variables on which _those_ variables are dependent
+> (which could be themselves, as in the case of the first collection in the above example) should go under `dimensionNames`. The exception
+> to this is that the primary variable is always listed under `dimensionNames.variable`.
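
The constraints stated in the comments of the example configuration (priorities on a 1-10 scale, the four accepted `projection` values, `tile` required for the ECCO projection, and the primary variable always listed under `dimensionNames.variable`) can be sketched as a small check. This is a minimal illustration and not part of the Collection Manager: `check_collection` is a hypothetical helper, it takes one already-parsed collection entry as a plain dict, and it assumes that every entry sets `projection`.

```python
# Hypothetical helper for illustration only; not part of the Collection Manager.
ACCEPTED_PROJECTIONS = {"Grid", "ECCO", "TimeSeries", "Swath"}
PRIORITY_KEYS = ("priority", "forward-processing-priority")


def check_collection(collection):
    """Return a list of problems found in one parsed collection entry (a dict)."""
    problems = []

    # Priorities are integers on a 1-10 scale; only check the keys that are present.
    for key in PRIORITY_KEYS:
        if key not in collection:
            continue
        value = collection[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not 1 <= value <= 10:
            problems.append(f"{key} must be an integer from 1 to 10, got {value!r}")

    # Assumption: a missing projection is reported as a problem.
    projection = collection.get("projection")
    if projection not in ACCEPTED_PROJECTIONS:
        problems.append(f"projection must be one of {sorted(ACCEPTED_PROJECTIONS)}")

    names = collection.get("dimensionNames") or {}
    # The primary variable is always listed under dimensionNames.variable.
    if "variable" not in names:
        problems.append("dimensionNames.variable is missing")
    # "tile" is required when using the ECCO projection.
    if projection == "ECCO" and "tile" not in names:
        problems.append("dimensionNames.tile is required for the ECCO projection")

    return problems


# The second collection from the example above, as it would look once parsed.
entry = {
    "id": "ocean-bottom-pressure",
    "path": "/data/OBP/",
    "priority": 6,
    "forward-processing-priority": 7,
    "projection": "ECCO",
    "dimensionNames": {
        "latitude": "YC",
        "longitude": "XC",
        "time": "time",
        "tile": "tile",
        "variable": "OBP",
    },
    "slices": {"time": 1, "tile": 1, "i": 30, "j": 30},
}

for problem in check_collection(entry):
    print(problem)
```

The sketch deliberately does not compare the keys of `slices` with the values of `dimensionNames`, since, as noted above, the two are not expected to match.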
-```
## Running the tests
From `incubator-sdap-ingester/`, run: