2 changes: 1 addition & 1 deletion docs/maintenance_guide.md
Original file line number Diff line number Diff line change
@@ -113,7 +113,7 @@ Options for the server configuration are:
|----------------|-----------|----------|---------------------------------------------------------------|
| authentication | ad_server | yes | Active directory server used for user authentication. |
| authentication | ad_domain | yes | Active directory domain used for user authentication. |
| authentication | ad_cert | yes | Path to the root ca certificate used for user authentication. |
| authentication | ad_cert | no | Path to the root ca certificate used for user authentication. |
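
With `ad_cert` no longer required, a server configuration may leave it out and the lookup has to fall back to a default. A minimal sketch of that pattern, assuming an INI-style file read with Python's standard `configparser`; the `get_option` helper below is a hypothetical stand-in and not SimDB's own `config.get_option`, and the server and domain values are placeholders:

```python
from configparser import ConfigParser


def get_option(parser: ConfigParser, dotted_name: str, default=None):
    """Hypothetical lookup of 'section.option' with an optional default."""
    section, option = dotted_name.split(".", 1)
    if parser.has_option(section, option):
        return parser.get(section, option)
    if default is None:
        # No value and no default: treat the option as required.
        raise KeyError(dotted_name)
    return default


parser = ConfigParser()
parser.read_string("""
[authentication]
ad_server = ad.example.org
ad_domain = EXAMPLE
""")

ad_config = {
    "AD_SERVER": get_option(parser, "authentication.ad_server"),
    "AD_DOMAIN": get_option(parser, "authentication.ad_domain"),
    # ad_cert is absent from the file, so the empty-string default is used.
    "AD_CA_CERT_FILE": get_option(parser, "authentication.ad_cert", default=""),
}
```

The two required options still raise `KeyError` when missing, while the certificate path quietly becomes an empty string.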

### LDAP authentication options

65 changes: 52 additions & 13 deletions docs/tutorial.md
@@ -14,7 +14,7 @@ simdb --version
This should return something similar to:

```
simdb, version 0.4.0
simdb, version x.y.z
```

This indicates the CLI is available and shows what version has been installed.
@@ -123,16 +123,14 @@ metadata:
- description: |-
Baseline H-mode scenario simulation for ITER
15MA plasma current with Q=10 target
- reference_name: ITER_Baseline_2024
- ids_properties:
creation_date: '2024-12-05 10:30:00'
creation_date: "2024-12-05 10:30:00"
```

Metadata Best Practices:
- **machine**: Always specify the tokamak or device name
- **code**: Include both name and version for reproducibility
- **description**: Provide context about the simulation purpose and key features
- **reference_name**: Use a human-readable reference identifier
- **ids_properties**: Include creation date if not available in IDS data
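
The practices above can be checked before ingestion. A minimal sketch, assuming the manifest has already been parsed into Python objects (a list of one-key mappings, as in the YAML above); `missing_metadata` is a hypothetical helper and not part of the SimDB CLI:

```python
def missing_metadata(metadata: list) -> list:
    """Return the names of recommended entries absent from a manifest's metadata list."""
    merged = {}
    for entry in metadata:  # each list item is a one-key mapping
        merged.update(entry)
    missing = [key for key in ("machine", "description") if not merged.get(key)]
    code = merged.get("code") or {}
    missing += [f"code.{key}" for key in ("name", "version") if not code.get(key)]
    return missing


metadata = [
    {"machine": "ITER"},
    {"code": {"name": "JINTRAC"}},  # version left out on purpose
    {"description": "Baseline H-mode scenario simulation for ITER"},
]
print(missing_metadata(metadata))  # ['code.version']
```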

### Validating a Manifest File
@@ -167,33 +165,74 @@ simdb simulation list
And the simulation you have just ingested with:

```bash
simdb simulation info test
simdb simulation info <SIM_ID>
```

## Pushing the simulation to remote server
## Pushing the simulation to a remote server

The SimDB client is able to communicate with multiple remote servers. You can see which remote servers are available
on your local client using:

```bash
simdb remote --list
simdb remote config list
```

First, you will need to add the remote server and set it as default:
First, you will need to add the remote server and set it as default (the name and URL may differ):

```bash
simdb remote --new test https://simdb.iter.org/scenarios/api
simdb remote --set-default test
simdb remote --new iter https://simdb.iter.org/scenarios/api
simdb remote --set-default iter
```

You can now list the simulations available on the remote server:
You can test that the remote server is valid and also list the simulations available on it:

```bash
simdb remote list
simdb remote <REMOTE_ID> test
simdb remote <REMOTE_ID> list
```

You can now check that your simulation is valid for a given remote server, as different servers may have different
rules and required fields:

```bash
simdb simulation validate <REMOTE_ID> <SIM_ID>
```

Typical validation issues are:
- one of the data sources (input or output) is missing or fails checksum verification (i.e. something changed since ingestion);
- the simulation does not comply with the list of mandatory metadata for the targeted remote.
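
The first kind of failure can be reproduced locally by comparing a file's digest with the value recorded earlier. A minimal sketch, assuming SHA-256 purely for illustration (the algorithm SimDB actually uses is not stated here) and using a hypothetical `file_checksum` helper:

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex digest of a file, read in chunks. SHA-256 is an assumption."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    data_file = Path(tmp) / "output.dat"
    data_file.write_bytes(b"simulation output")
    recorded = file_checksum(data_file)  # value stored at ingestion time
    data_file.write_bytes(b"simulation output, edited")  # file changes afterwards
    unchanged = file_checksum(data_file) == recorded

print(unchanged)  # False
```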

It is possible to see which validation schema applies to a given remote:

```bash
simdb remote <REMOTE_ID> schema -d 10
```

If the `validate` command results in a `validation successful` message, then you can push your simulation:

```bash
simdb simulation push <REMOTE_ID> <SIM_ID>
```

If the simulation is expected to replace another one already present in the remote server:

```bash
simdb simulation push <REMOTE_ID> <SIM_ID> --replaces <PREVIOUS_SIM_ID>
```

The previous simulation will be marked as deprecated and will contain a new `replaced_by` metadata entry that points to
`<SIM_ID>`. It's also possible to see the chained history of older versions if they exist:

```bash
simdb remote <REMOTE_ID> trace <SIM_ID>
```
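
The idea of a chained history can be pictured with plain dictionaries. A minimal sketch, assuming each deprecated simulation carries a `replaced_by` entry as described above; the simulation IDs and the `older_versions` helper are hypothetical, and this is not how the `trace` command is implemented:

```python
def older_versions(metadata_by_id: dict, sim_id: str) -> list:
    """Walk back from sim_id through the simulations it replaced, newest first."""
    # Invert the replaced_by links: newer id -> the id it replaced.
    replaced = {
        meta["replaced_by"]: old_id
        for old_id, meta in metadata_by_id.items()
        if "replaced_by" in meta
    }
    chain = []
    current = replaced.get(sim_id)
    while current is not None and current not in chain:  # guard against cycles
        chain.append(current)
        current = replaced.get(current)
    return chain


history = older_versions(
    {
        "sim-v1": {"replaced_by": "sim-v2"},
        "sim-v2": {"replaced_by": "sim-v3"},
        "sim-v3": {},
    },
    "sim-v3",
)
print(history)  # ['sim-v2', 'sim-v1']
```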


## Authentication

Whenever you run a remote command you will notice that you have to authenticate against the remote server. This can be
avoided by creating an authentication token using:
avoided by creating an authentication token for servers that allow such a method (not applicable to simdb.iter.org,
which uses an F5 firewall as its authentication layer):

```bash
simdb remote token new
42 changes: 38 additions & 4 deletions docs/user_guide.md
@@ -44,6 +44,9 @@ Before diving into SimDB functionality, it's important to understand these key t

**Workflow**: Typically, you create and manage simulations locally, then push them to a remote SimDB server for sharing. The data referenced by your simulation can be either local (on your machine) or remote (on a data server).

**IMAS Access Layer compatibility**:
SimDB uses imas-python to read IMAS data. [imas-python](https://pypi.org/project/imas-python/) requires Access Layer 5 (AL5) or later and does not support the older Access Layer 4 (AL4). If your IMAS data was written using AL4 (e.g., MDSplus-based AL4 databases), you must convert it to AL5 format before use. See [AL4 MDSplus data migration](user_guide.md#al4-mdsplus-data-migration) below.

## Local simulation management

In order to ingest a local simulation you need a manifest file. This is a `yaml` file which contains details about the simulation and what data is associated with it. See the [Tutorial - Creating a simulation manifest](tutorial.md#creating-a-simulation-manifest) for detailed guidelines on how to create a well-formed manifest.
@@ -57,14 +60,15 @@ inputs:
- uri: file:///my/input/file
- uri: imas:hdf5?path=/path/to/imas/data
outputs:
- uri: file:///my/output/file
- uri: imas:hdf5?path=/path/to/more/data
metadata:
- machine: name of machine e.g. ITER.
- code:
name: code name i.e. ASTRA, JETTO, DINA, CORSICA, MITES, SOLPS, JINTRAC etc.
- description: |-
name: code name e.g. ASTRA, JETTO, DINA, CORSICA, METIS, SOLPS, JINTRAC etc.
version: code version
- description: |
Sample plasma physics simulation for ITER tokamak modeling
- reference_name: ITER simulation
- ids_properties:
creation_date: 'YYYY-MM-DD HH:mm:ss'
```
@@ -76,7 +80,17 @@ metadata:
| inputs/outputs | Lists of simulation input and output files. Supported URI schemes:<br/>• file - Standard file system paths<br/>• imas - IMAS entry URIs (see IMAS URI schema below) |
| metadata | Contains simulation metadata and properties. The metadata section associates information with the summary IDS data:<br/>• summary - A hierarchical dictionary structure containing key-value pairs that provide summary information extracted from IDS datasets. This includes condensed representations of simulation results, computed quantities, free descriptions, any references, and creation dates if not available in summary IDS. |

### IMAS URI schema
## Alias Naming Rules
<ul>
<li>Must be unique within the SimDB</li>
<li>Cannot start with a digit (0-9) or forward slash (/)</li>
<li>Cannot end with a forward slash (/)</li>
<li>Should be descriptive and meaningful for easy identification</li>
</ul>

Examples of valid aliases:
<ul>
<li>iter-baseline-scenario</li>
<li>100001/1 (pulse_number/run_number)</li>
</ul>
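
The slash rules can be checked mechanically. A minimal sketch with a hypothetical `has_valid_slashes` helper, which is not SimDB's own validation: uniqueness needs the database, and the leading-digit rule is left out of this sketch because the pulse/run example above begins with digits, so how that rule is applied is not clear from this text alone.

```python
def has_valid_slashes(alias: str) -> bool:
    """True if the alias is non-empty and neither starts nor ends with '/'."""
    return bool(alias) and not alias.startswith("/") and not alias.endswith("/")


for alias in ("iter-baseline-scenario", "100001/1", "/leading", "trailing/"):
    print(alias, has_valid_slashes(alias))
```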

## IMAS URI schema

IMAS URIs specified in the manifest can either be in the form of remote data URIs or local data URIs.
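
To see how such a URI breaks down, the `imas:hdf5?path=...` form from the manifest example can be split with Python's standard `urllib.parse`. This is only an illustrative sketch of the URI's parts, not the parser SimDB or imas-python uses:

```python
from urllib.parse import parse_qs, urlsplit

parts = urlsplit("imas:hdf5?path=/path/to/imas/data")
backend = parts.path            # text between 'imas:' and '?'
query = parse_qs(parts.query)   # query parameters as lists of values

print(parts.scheme, backend, query["path"][0])
# imas hdf5 /path/to/imas/data
```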

@@ -128,6 +142,26 @@ Without port (uses default):

**Note:** Ensure that the specified port is accessible through your network firewall. Contact your system administrator if you experience connectivity issues.

## AL4 MDSplus data migration
SimDB uses [imas-python](https://pypi.org/project/imas-python/) to read IMAS data. `imas-python` requires Access Layer 5 (AL5) or later and **does not support the older Access Layer 4 (AL4).**

If you have existing IMAS data stored in an AL4 MDSplus database, you must migrate it to the AL5 directory layout before referencing it in a SimDB manifest. This can be done using the `mdsplusIMASDB4to5` tool provided by IMAS-Core, which creates the new AL5 directory layout with links to the original data files (the original data is not removed).

```
mdsplusIMASDB4to5 [-h] [--dry-run] [-p PATH] [-d DATABASE] [-f]
```

| Options | Description |
|--------------------------------------|------------------------------------------------------------|
| `--dry-run` | Print actions but do not perform them |
| `-p PATH`, `--path PATH` | Specify the path of the imasdb to map (default: $HOME/public) |
| `-d DATABASE`, `--database DATABASE` | Specify a database to map (default: all) |
| `-f`, `--force` | Force the creation of symlink even if the file exists |

Once the migration is complete, reference the new AL5 path in your manifest using the mdsplus backend:
```
outputs:
- uri: imas:mdsplus?path=<destination_path>
```
For further details on the `mdsplusIMASDB4to5` tool, refer to the IMAS-Core documentation.
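
When the migration is scripted, it can help to assemble the command from the options in the table and start with a dry run. A minimal sketch using only the flags listed above; the `migration_command` helper, the path and the database name are hypothetical:

```python
import shlex


def migration_command(path=None, database=None, dry_run=True, force=False) -> list:
    """Build an mdsplusIMASDB4to5 argument list from the documented options."""
    command = ["mdsplusIMASDB4to5"]
    if dry_run:
        command.append("--dry-run")
    if path is not None:
        command += ["--path", str(path)]
    if database is not None:
        command += ["--database", database]
    if force:
        command.append("--force")
    return command


command = migration_command(path="/home/user/public", database="iter")
print(shlex.join(command))
# mdsplusIMASDB4to5 --dry-run --path /home/user/public --database iter
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the tool is installed; calling the helper with `dry_run=False` drops the `--dry-run` flag.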

## Remote SimDB servers

The SimDB CLI is able to interact with remote SimDB servers to push local simulations or to query existing simulations. This is done via the simdb remote command:
20 changes: 0 additions & 20 deletions src/simdb/database/models/file.py
@@ -24,14 +24,9 @@ class File(Base):
__tablename__ = "files"
id = Column(sql_types.Integer, primary_key=True)
uuid = Column(UUID, nullable=False, unique=True, index=True)
usage = Column(sql_types.String(250), nullable=True)
uri: urilib.URI = Column(URI(1024), nullable=True)
checksum = Column(sql_types.String(64), nullable=True)
type: DataObject.Type = Column(sql_types.Enum(DataObject.Type), nullable=True)
purpose = Column(sql_types.String(250), nullable=True)
sensitivity = Column(sql_types.String(20), nullable=True)
access = Column(sql_types.String(20), nullable=True)
embargo = Column(sql_types.String(20), nullable=True)
datetime = Column(sql_types.DateTime, nullable=False)

def __init__(
@@ -54,14 +49,9 @@ def __str__(self):
result = ""
for name in (
"uuid",
"usage",
"uri",
"checksum",
"type",
"purpose",
"sensitivity",
"access",
"embargo",
"datetime",
):
result += " %s:%s%s\n" % (
@@ -114,26 +104,16 @@ def from_data(cls, data: Dict) -> "File":
DataObject.Type[data_type], urilib.URI(uri), perform_integrity_check=False
)
file.uuid = checked_get(data, "uuid", uuid.UUID)
file.usage = checked_get(data, "usage", str, optional=True)
file.checksum = checked_get(data, "checksum", str)
file.purpose = checked_get(data, "purpose", str, optional=True)
file.sensitivity = checked_get(data, "sensitivity", str, optional=True)
file.access = checked_get(data, "access", str, optional=True)
file.embargo = checked_get(data, "embargo", str, optional=True)
file.datetime = date_parser.parse(checked_get(data, "datetime", str))
return file

def data(self, recurse: bool = False) -> Dict[str, str]:
data = dict(
uuid=self.uuid,
usage=self.usage,
uri=str(self.uri),
checksum=self.checksum,
type=self.type.name,
purpose=self.purpose,
sensitivity=self.sensitivity,
access=self.access,
embargo=self.embargo,
datetime=self.datetime.isoformat(),
)
return data
2 changes: 1 addition & 1 deletion src/simdb/database/models/simulation.py
@@ -140,7 +140,7 @@ def __init__(

for ids in idss:
ids_name, occurrence = extract_ids_occurrence(ids)
check_time(entry, ids, occurrence)
check_time(entry, ids_name, occurrence)

all_input_idss += idss

2 changes: 1 addition & 1 deletion src/simdb/remote/apis/v1/simulations.py
@@ -144,7 +144,7 @@ class SimulationList(Resource):
def get(self, user: User):
from ....query import QueryType, parse_query_arg

limit = int(request.headers.get(SimulationList.LIMIT_HEADER, 100))
limit = int(request.headers.get(APIConstants.LIMIT_HEADER, 100))
page = 1
names = []
constraints = []
9 changes: 4 additions & 5 deletions src/simdb/remote/apis/v1_1/simulations.py
@@ -159,12 +159,11 @@ class SimulationList(Resource):
def get(self, user: User):
from ....query import QueryType, parse_query_arg

limit = int(request.headers.get(SimulationList.LIMIT_HEADER, 100))
page = int(request.headers.get(SimulationList.PAGE_HEADER, 1))
sort_by = request.headers.get(SimulationList.SORT_BY_HEADER, "")
limit = int(request.headers.get(APIConstants.LIMIT_HEADER, 100))
page = int(request.headers.get(APIConstants.PAGE_HEADER, 1))
sort_by = request.headers.get(APIConstants.SORT_BY_HEADER, "")
sort_asc = (
request.headers.get(SimulationList.SORT_ASC_HEADER, "false").lower()
== "true"
request.headers.get(APIConstants.SORT_ASC_HEADER, "false").lower() == "true"
)
names = []
constraints = []
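
The header handling in this hunk follows a common pattern: read each paging header with a default, then coerce it. A minimal standalone sketch of that pattern, with hypothetical header names (the real values live in `APIConstants` and are not shown in this diff) and a plain mapping in place of Flask's `request.headers`:

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical header names, for illustration only.
LIMIT_HEADER = "X-Example-Limit"
PAGE_HEADER = "X-Example-Page"
SORT_BY_HEADER = "X-Example-Sort-By"
SORT_ASC_HEADER = "X-Example-Sort-Asc"


@dataclass
class Paging:
    limit: int
    page: int
    sort_by: str
    sort_asc: bool


def parse_paging(headers: Mapping[str, str]) -> Paging:
    """Read paging options from headers, falling back to the defaults used in the hunk."""
    return Paging(
        limit=int(headers.get(LIMIT_HEADER, 100)),
        page=int(headers.get(PAGE_HEADER, 1)),
        sort_by=headers.get(SORT_BY_HEADER, ""),
        sort_asc=headers.get(SORT_ASC_HEADER, "false").lower() == "true",
    )


print(parse_paging({LIMIT_HEADER: "25", SORT_ASC_HEADER: "True"}))
```

Keeping the names in one shared constants object, as the change does, means client and server code cannot drift apart on the spelling of a header.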
4 changes: 3 additions & 1 deletion src/simdb/remote/core/auth/active_directory.py
@@ -27,7 +27,9 @@ def authenticate(
ad_config = {
"AD_SERVER": config.get_option("authentication.ad_server"),
"AD_DOMAIN": config.get_option("authentication.ad_domain"),
"AD_CA_CERT_FILE": config.get_option("authentication.ad_cert"),
"AD_CA_CERT_FILE": config.get_option(
"authentication.ad_cert", default=""
),
}
ad = EasyAD(ad_config)
except (KeyError, ImportError):
9 changes: 9 additions & 0 deletions validation/iter_scenarios_validation.yaml
@@ -14,25 +14,33 @@ code:
schema:
name:
type: string
required: true
time:
type: numpy
coerce: numpy
description:
type: string
required: true
global_quantities:
type: dict
required: true
schema:
b0:
type: dict
required: true
schema:
value:
type: numpy
required: true
coerce: numpy
lt: 0
r0:
type: dict
required: true
schema:
value:
type: float
required: true
# coerce: numpy
gt: 0
beta_pol:
Expand Down Expand Up @@ -100,6 +108,7 @@ global_quantities:
schema:
value:
type: numpy
required: true
coerce: numpy
ge: -17000000
le: 0
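
To make the effect of the added `required: true` lines concrete, here is a minimal pure-Python sketch that interprets only the `required`, `gt`, `ge`, `lt`, `le` and nested `schema` rules on a simplified fragment of the schema. The `check` function is hypothetical and is not the validation library SimDB uses; `type` and `coerce` are ignored, and the sample document is made up.

```python
import operator

BOUNDS = {"gt": operator.gt, "ge": operator.ge, "lt": operator.lt, "le": operator.le}


def check(document: dict, schema: dict, prefix: str = "") -> list:
    """Collect error messages for missing required fields and violated bounds."""
    errors = []
    for field, rules in schema.items():
        name = prefix + field
        if field not in document:
            if rules.get("required"):
                errors.append(f"{name}: required field")
            continue
        value = document[field]
        if "schema" in rules:  # descend into nested mappings
            errors += check(value, rules["schema"], name + ".")
        for rule, compare in BOUNDS.items():
            if rule in rules and not compare(value, rules[rule]):
                errors.append(f"{name}: must be {rule} {rules[rule]}")
    return errors


schema = {
    "r0": {
        "type": "dict",
        "required": True,
        "schema": {"value": {"type": "float", "required": True, "gt": 0}},
    },
    "b0": {
        "type": "dict",
        "required": True,
        "schema": {"value": {"required": True, "lt": 0}},
    },
}

errors = check({"r0": {"value": -6.2}}, schema)
print(errors)  # ['r0.value: must be gt 0', 'b0: required field']
```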