diff --git a/docs/maintenance_guide.md b/docs/maintenance_guide.md index 557878c6..e5664af2 100644 --- a/docs/maintenance_guide.md +++ b/docs/maintenance_guide.md @@ -113,7 +113,7 @@ Options for the server configuration are: |----------------|-----------|----------|---------------------------------------------------------------| | authentication | ad_server | yes | Active directory server used for user authentication. | | authentication | ad_domain | yes | Active directory domain used for user authentication. | -| authentication | ad_cert | yes | Path to the root ca certificate used for user authentication. | +| authentication | ad_cert | no | Path to the root ca certificate used for user authentication. | ### LDAP authentication options diff --git a/docs/tutorial.md b/docs/tutorial.md index 67135a0c..5de45e84 100644 --- a/docs/tutorial.md +++ b/docs/tutorial.md @@ -14,7 +14,7 @@ simdb --version This should return something similar to: ``` -simdb, version 0.4.0 +simdb, version x.y.z ``` This indicates the CLI is available and shows what version has been installed. 
@@ -123,16 +123,14 @@ metadata: - description: |- Baseline H-mode scenario simulation for ITER 15MA plasma current with Q=10 target - - reference_name: ITER_Baseline_2024 - ids_properties: - creation_date: '2024-12-05 10:30:00' + creation_date: "2024-12-05 10:30:00" ``` Metadata Best Practices: - **machine**: Always specify the tokamak or device name - **code**: Include both name and version for reproducibility - **description**: Provide context about the simulation purpose and key features -- **reference_name**: Use a human-readable reference identifier - **ids_properties**: Include creation date if not available in IDS data ### Validating a Manifest File @@ -167,33 +165,74 @@ simdb simulation list And the simulation you have just ingested with: ```bash -simdb simulation info test +simdb simulation info ``` -## Pushing the simulation to remote server +## Pushing the simulation to a remote server The SimDB client is able to communication with multiple remote servers. You can see which remote servers are available on your local client using: ```bash -simdb remote --list +simdb remote config list ``` -First, you will need to add the remote server and set it as default: +First, you will need to add the remote server and set it as default (name and url may differ): ```bash -simdb remote --new test https://simdb.iter.org/scenarios/api -simdb remote --set-default test +simdb remote --new iter https://simdb.iter.org/scenarios/api +simdb remote --set-default iter ``` -You can now list the simulations available on the remote server: +You can test that the remote server is valid and also list the simulations available on it: ```bash -simdb remote list +simdb remote test +simdb remote list ``` +You can now check that your simulation is valid for a given remote server, as different servers may have different +rules and required fields: + +```bash +simdb simulation validate +``` + +Typical validation issues are: +- one of the data sources (input or output) being absent or not 
verifying the checksum (i.e. something changed since the ingestion); +- failing to comply with the list of mandatory metadata for the targeted remote. + +It is possible to see which validation schema applies to a given remote: + +```bash +simdb remote schema -d 10 +``` + +If the `validate` command results in a `validation successful` message, then you can push your simulation: + +```bash +simdb simulation push +``` + +If the simulation is expected to replace another one already present in the remote server: + +```bash +simdb simulation push --replaces +``` + +The previous simulation will be marked as deprecated and will contain a new `replaced_by` metadata field that points to +``. It's also possible to see the chained history of older versions if they exist: + +```bash +simdb remote trace +``` + + +## Authentication + Whenever you run a remote command you will notice that you have to authenticate against the remote server. This can be -avoided by creating an authentication token using: +avoided by creating an authentication token for servers that allow such a method (not applicable to simdb.iter.org +which uses an F5 firewall as its authentication layer): ```bash simdb remote token new diff --git a/docs/user_guide.md b/docs/user_guide.md index 8e6f3f88..f8da81d3 100644 --- a/docs/user_guide.md +++ b/docs/user_guide.md @@ -44,6 +44,9 @@ Before diving into SimDB functionality, it's important to understand these key t **Workflow**: Typically, you create and manage simulations locally, then push them to a remote SimDB server for sharing. The data referenced by your simulation can be either local (on your machine) or remote (on a data server). +**IMAS Access Layer compatibility**: +SimDB uses imas-python to read IMAS data. [imas-python](https://pypi.org/project/imas-python/) requires Access Layer 5 (AL5) or later and does not support the older Access Layer 4 (AL4). 
If your IMAS data was written using AL4 (e.g., MDSplus-based AL4 databases), you must convert it to AL5 format before use. See [AL4 MDSplus data migration](user_guide.md#al4-mdsplus-data-migration) below. + ## Local simulation management In order to ingest a local simulation you need a manifest file. This is a `yaml` file which contains details about the simulation and what data is associated with it. See the [Tutorial - Creating a simulation manifest](tutorial.md#creating-a-simulation-manifest) for detailed guidelines on how to create a well-formed manifest. @@ -57,14 +60,15 @@ inputs: - uri: file:///my/input/file - uri: imas:hdf5?path=/path/to/imas/data outputs: +- uri: file:///my/output/file - uri: imas:hdf5?path=/path/to/more/data metadata: - machine: name of machine i.e. ITER. - code: - name: code name i.e. ASTRA, JETTO, DINA, CORSICA, MITES, SOLPS, JINTRAC etc. -- description: |- + name: code name i.e. ASTRA, JETTO, DINA, CORSICA, METIS, SOLPS, JINTRAC etc. + version: code version +- description: | Sample plasma physics simulation for ITER tokamak modeling -- reference_name: ITER simulation - ids_properties: creation_date: 'YYYY-MM-DD HH:mm:ss' ``` @@ -76,7 +80,17 @@ metadata: | inputs/outputs | Lists of simulation input and output files. Supported URI schemes:
• file - Standard file system paths
• imas - IMAS entry URIs (see IMAS URI schema below) | | metadata | Contains simulation metadata and properties. The metadata section associates information with the summary IDS data:
• summary - A hierarchical dictionary structure containing key-value pairs that provide summary information extracted from IDS datasets. This includes condensed representations of simulation results, computed quantities, free descriptions, any references, and creation dates if not available in summary IDS. -### IMAS URI schema +## Alias Naming Rules +
  • Must be unique within the SimDB
+
  • Cannot start with a digit (0-9) or forward slash (/)
+
  • Cannot end with a forward slash (/)
+
  • Should be descriptive and meaningful for easy identification
+ +Examples of valid aliases: +
  • iter-baseline-scenario
+
  • 100001/1 (pulse_number/run_number)
+ +## IMAS URI schema IMAS URIs specified in the manifest can either be in the form of remote data URIs or local data URIs. @@ -128,6 +142,26 @@ Without port (uses default): **Note:** Ensure that the specified port is accessible through your network firewall. Contact your system administrator if you experience connectivity issues. +## AL4 MDSplus data migration +SimDB uses [imas-python](https://pypi.org/project/imas-python/) to read IMAS data. `imas-python` requires Access Layer 5 (AL5) or later and **does not support the older Access Layer 4 (AL4).** + +If you have existing IMAS data stored in an AL4 MDSplus database, you must migrate it to the AL5 directory layout before referencing it in a SimDB manifest. This can be done using the `mdsplusIMASDB4to5` tool provided by IMAS-Core, which creates the new AL5 directory layout with links to the original data files (the original data is not removed). +```mdsplusIMASDB4to5 [-h] [--dry-run] [-p PATH] [-d DATABASE] [-f]``` + +| Options | Description | +|--------------------------------------|------------------------------------------------------------| +| `--dry-run` | Print actions but do not perform them | +| `-p PATH`, `--path PATH` | Specify the path of the imasdb to map (default: $HOME/public) | +| `-d DATABASE`, `--database DATABASE` | Specify a database to map (default: all) | +| `-f`, `--force` | Force the creation of symlinks even if the file exists | + +Once the migration is complete, reference the new AL5 path in your manifest using the mdsplus backend: +``` +outputs: +- uri: imas:mdsplus?path= +``` +For further details on the `mdsplusIMASDB4to5` tool, refer to the IMAS-Core documentation. + ## Remote SimDB servers The SimDB CLI is able to interact with remote SimDB servers to push local simulations or to query existing simulations. 
This is done via the simdb remote command: diff --git a/src/simdb/database/models/file.py b/src/simdb/database/models/file.py index 1d60646d..e7543ee4 100644 --- a/src/simdb/database/models/file.py +++ b/src/simdb/database/models/file.py @@ -24,14 +24,9 @@ class File(Base): __tablename__ = "files" id = Column(sql_types.Integer, primary_key=True) uuid = Column(UUID, nullable=False, unique=True, index=True) - usage = Column(sql_types.String(250), nullable=True) uri: urilib.URI = Column(URI(1024), nullable=True) checksum = Column(sql_types.String(64), nullable=True) type: DataObject.Type = Column(sql_types.Enum(DataObject.Type), nullable=True) - purpose = Column(sql_types.String(250), nullable=True) - sensitivity = Column(sql_types.String(20), nullable=True) - access = Column(sql_types.String(20), nullable=True) - embargo = Column(sql_types.String(20), nullable=True) datetime = Column(sql_types.DateTime, nullable=False) def __init__( @@ -54,14 +49,9 @@ def __str__(self): result = "" for name in ( "uuid", - "usage", "uri", "checksum", "type", - "purpose", - "sensitivity", - "access", - "embargo", "datetime", ): result += " %s:%s%s\n" % ( @@ -114,26 +104,16 @@ def from_data(cls, data: Dict) -> "File": DataObject.Type[data_type], urilib.URI(uri), perform_integrity_check=False ) file.uuid = checked_get(data, "uuid", uuid.UUID) - file.usage = checked_get(data, "usage", str, optional=True) file.checksum = checked_get(data, "checksum", str) - file.purpose = checked_get(data, "purpose", str, optional=True) - file.sensitivity = checked_get(data, "sensitivity", str, optional=True) - file.access = checked_get(data, "access", str, optional=True) - file.embargo = checked_get(data, "embargo", str, optional=True) file.datetime = date_parser.parse(checked_get(data, "datetime", str)) return file def data(self, recurse: bool = False) -> Dict[str, str]: data = dict( uuid=self.uuid, - usage=self.usage, uri=str(self.uri), checksum=self.checksum, type=self.type.name, - 
purpose=self.purpose, - sensitivity=self.sensitivity, - access=self.access, - embargo=self.embargo, datetime=self.datetime.isoformat(), ) return data diff --git a/src/simdb/database/models/simulation.py b/src/simdb/database/models/simulation.py index fcb3f5a6..452eef27 100644 --- a/src/simdb/database/models/simulation.py +++ b/src/simdb/database/models/simulation.py @@ -140,7 +140,7 @@ def __init__( for ids in idss: ids_name, occurrence = extract_ids_occurrence(ids) - check_time(entry, ids, occurrence) + check_time(entry, ids_name, occurrence) all_input_idss += idss diff --git a/src/simdb/remote/apis/v1/simulations.py b/src/simdb/remote/apis/v1/simulations.py index de987b05..2ded5694 100644 --- a/src/simdb/remote/apis/v1/simulations.py +++ b/src/simdb/remote/apis/v1/simulations.py @@ -144,7 +144,7 @@ class SimulationList(Resource): def get(self, user: User): from ....query import QueryType, parse_query_arg - limit = int(request.headers.get(SimulationList.LIMIT_HEADER, 100)) + limit = int(request.headers.get(APIConstants.LIMIT_HEADER, 100)) page = 1 names = [] constraints = [] diff --git a/src/simdb/remote/apis/v1_1/simulations.py b/src/simdb/remote/apis/v1_1/simulations.py index 276c62cc..8580b54d 100644 --- a/src/simdb/remote/apis/v1_1/simulations.py +++ b/src/simdb/remote/apis/v1_1/simulations.py @@ -159,12 +159,11 @@ class SimulationList(Resource): def get(self, user: User): from ....query import QueryType, parse_query_arg - limit = int(request.headers.get(SimulationList.LIMIT_HEADER, 100)) - page = int(request.headers.get(SimulationList.PAGE_HEADER, 1)) - sort_by = request.headers.get(SimulationList.SORT_BY_HEADER, "") + limit = int(request.headers.get(APIConstants.LIMIT_HEADER, 100)) + page = int(request.headers.get(APIConstants.PAGE_HEADER, 1)) + sort_by = request.headers.get(APIConstants.SORT_BY_HEADER, "") sort_asc = ( - request.headers.get(SimulationList.SORT_ASC_HEADER, "false").lower() - == "true" + request.headers.get(APIConstants.SORT_ASC_HEADER, 
"false").lower() == "true" ) names = [] constraints = [] diff --git a/src/simdb/remote/core/auth/active_directory.py b/src/simdb/remote/core/auth/active_directory.py index c1e8aecf..534ff0cd 100644 --- a/src/simdb/remote/core/auth/active_directory.py +++ b/src/simdb/remote/core/auth/active_directory.py @@ -27,7 +27,9 @@ def authenticate( ad_config = { "AD_SERVER": config.get_option("authentication.ad_server"), "AD_DOMAIN": config.get_option("authentication.ad_domain"), - "AD_CA_CERT_FILE": config.get_option("authentication.ad_cert"), + "AD_CA_CERT_FILE": config.get_option( + "authentication.ad_cert", default="" + ), } ad = EasyAD(ad_config) except (KeyError, ImportError): diff --git a/validation/iter_scenarios_validation.yaml b/validation/iter_scenarios_validation.yaml index 0001c4e9..eb5cdd57 100644 --- a/validation/iter_scenarios_validation.yaml +++ b/validation/iter_scenarios_validation.yaml @@ -14,11 +14,16 @@ code: schema: name: type: string + required: true time: type: numpy coerce: numpy +description: + type: string + required: true global_quantities: type: dict + required: true schema: b0: type: dict @@ -26,13 +31,16 @@ global_quantities: schema: value: type: numpy + required: true coerce: numpy lt: 0 r0: type: dict + required: true schema: value: type: float + required: true # coerce: numpy gt: 0 beta_pol: @@ -100,6 +108,7 @@ global_quantities: schema: value: type: numpy + required: true coerce: numpy ge: -17000000 le: 0
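
Reviewer note on the `required: true` additions in `validation/iter_scenarios_validation.yaml`: the sketch below illustrates how nested `required` flags translate into "missing mandatory metadata" failures. It is a minimal standard-library illustration, not SimDB's validator. The `SCHEMA` dictionary is a hypothetical, trimmed-down copy of the schema fragment, and `missing_required` is a helper invented for this example. It only looks at `required` and nested `schema` keys and ignores `type`, `coerce`, and range rules.

```python
from typing import Any, Dict, List

# Hypothetical, trimmed-down copy of the schema fragment touched by this diff.
SCHEMA: Dict[str, Any] = {
    "description": {"type": "string", "required": True},
    "global_quantities": {
        "type": "dict",
        "required": True,
        "schema": {
            "r0": {
                "type": "dict",
                "required": True,
                "schema": {"value": {"type": "float", "required": True}},
            },
        },
    },
}


def missing_required(
    schema: Dict[str, Any], data: Dict[str, Any], prefix: str = ""
) -> List[str]:
    """Return dotted paths of required keys that are absent from data."""
    missing: List[str] = []
    for key, rule in schema.items():
        path = f"{prefix}{key}"
        if key not in data:
            # Only keys flagged as required count as failures.
            if rule.get("required", False):
                missing.append(path)
            continue
        # Descend into nested schemas when the value is itself a mapping.
        sub_schema = rule.get("schema")
        if sub_schema is not None and isinstance(data[key], dict):
            missing.extend(
                missing_required(sub_schema, data[key], prefix=f"{path}.")
            )
    return missing


# Illustrative metadata: r0 is present but its required value is not.
metadata = {
    "description": "Baseline H-mode scenario",
    "global_quantities": {"r0": {}},
}
print(missing_required(SCHEMA, metadata))
```

With these changes, a simulation whose metadata lacks `description`, `global_quantities`, or the nested `value` entries would presumably be rejected by `simdb simulation validate` against this remote, which matches the "mandatory metadata" failure case described in the tutorial changes.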