32 commits
8533e68
Configurable host and paths via environment
andaBarbu Apr 22, 2026
c1f6c5b
Configurable host and paths via environment
andaBarbu Apr 23, 2026
6cc66e6
h
andaBarbu May 7, 2026
556263b
j
andaBarbu May 8, 2026
b91d97c
remote-missing_shuttingdown
andaBarbu May 12, 2026
f0717f1
remote_mode_before_testing
andaBarbu May 15, 2026
3a2d52e
add coments to the remote
andaBarbu May 18, 2026
98c100d
README file change
andaBarbu May 18, 2026
bcd146f
Update DitributeOrgestor.py
andaBarbu May 25, 2026
547760e
test remote_distribution
andaBarbu Jun 11, 2026
bb067cd
Validation_of_the_setUp
andaBarbu Jun 14, 2026
f2886b2
energyvalidator
andaBarbu Jun 15, 2026
d258f41
test_energyValidator
andaBarbu Jun 15, 2026
bc96684
clean
andaBarbu Jun 16, 2026
72d033a
clean
andaBarbu Jun 16, 2026
4febc50
clean
andaBarbu Jun 16, 2026
6d486e9
ADB_VALIDATION_REQUIRMENTSCHECKING
andaBarbu Jun 18, 2026
e8e61cc
Old_code
andaBarbu Jun 19, 2026
d2f0ffb
Checking for anomalies feature *improved*
andaBarbu Jun 22, 2026
84ba8e5
update
andaBarbu Jun 22, 2026
44d8f62
integarte the result validator into the reomote distribution feature
andaBarbu Jun 22, 2026
4efc4f3
chanege in the exeperiments exemples
andaBarbu Jun 22, 2026
c7fafd8
inished adb_no tests
andaBarbu Jun 23, 2026
0ed985e
finsied adb - no tests
andaBarbu Jun 23, 2026
23e4442
redone_testings
andaBarbu Jun 23, 2026
311cef2
comments
andaBarbu Jun 23, 2026
0aff279
Anomalies small look waise change
andaBarbu Jun 23, 2026
e03f401
Merge pull request #1 from AndaGB/Anda_Gabriela_Barbu
andaBarbu Jun 24, 2026
cb099d5
I changed where Validation_EXPERIMENT is Calleds
andaBarbu Jun 24, 2026
c493d6d
Added the CONTINUE hook
andaBarbu Jun 24, 2026
04a7b88
chnage in anomalies checker
andaBarbu Jun 24, 2026
f6b56b5
Merge pull request #2 from AndaGB/Anda_Gabriela_Barbu
andaBarbu Jun 24, 2026
Binary file added .coverage
Binary file not shown.
62 changes: 62 additions & 0 deletions README.md
@@ -76,8 +76,70 @@ python experiment-runner/ <MyRunnerConfig.py>

The results of the experiment will be stored in the directory `RunnerConfig.results_output_path/RunnerConfig.name` as defined by your config variables.

### Portability Across Users and Machines

When sharing experiments across different users or machines, hardcoded paths in configuration files can cause issues. Experiment Runner supports **environment variables** to make your experiments portable without code changes:

#### Available Environment Variables

- **`EXPERIMENT_RUNNER_OUTPUT_PATH`**: Directory where experiment results are stored
  - Default: `<config-directory>/experiments`
  - Example: `export EXPERIMENT_RUNNER_OUTPUT_PATH="/path/to/results"`

- **`ENERGIBRIDGE_PATH`**: Path to the EnergiBridge executable (for energy measurements)
  - Default: `/usr/local/bin/energibridge`
  - Example: `export ENERGIBRIDGE_PATH="/usr/local/bin/energibridge"`

- **`EXAMPLES_PATH`**: Directory for generating new config templates
  - Default: `<project-root>/examples`
  - Example: `export EXAMPLES_PATH="/home/user/my-experiments"`

#### Using Environment Variables

Set environment variables before running your experiment:

```bash
export EXPERIMENT_RUNNER_OUTPUT_PATH="/data/experiments"
export ENERGIBRIDGE_PATH="/opt/energibridge/bin/energibridge"
python experiment-runner/ MyRunnerConfig.py
```

Your configuration files automatically use these variables if set, with sensible defaults when they are not. This allows the same experiment to run on different machines without any code modifications.
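
The fallback behaviour described above can be sketched in a few lines. This is a minimal illustration, not Experiment Runner's actual implementation: the helper `path_from_env` and the `CONFIG_DIR` stand-in are hypothetical, and only the variable names and defaults come from the list above.

```python
import os
from pathlib import Path


def path_from_env(var_name: str, default: Path) -> Path:
    """Return the path stored in `var_name`, or `default` if it is unset or empty.

    Hypothetical helper, for illustration only.
    """
    value = os.environ.get(var_name)
    return Path(value).expanduser() if value else default


# CONFIG_DIR stands in for the directory that holds the config file (assumption).
CONFIG_DIR = Path.cwd()

results_output_path = path_from_env(
    "EXPERIMENT_RUNNER_OUTPUT_PATH", CONFIG_DIR / "experiments"
)
energibridge_path = path_from_env(
    "ENERGIBRIDGE_PATH", Path("/usr/local/bin/energibridge")
)
```

Treating an empty value like an unset one avoids accidentally writing results into the current directory when a variable is exported without a value.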

**More information about the profilers and use cases can be found in the [Wiki tab](https://github.com/S2-group/experiment-runner/wiki).**

---
## Remote distribution

Experiment Runner supports **distributed execution across multiple machines** using a master–worker architecture.

### Architecture Overview

- One machine acts as the **Master (Orchestrator)**
  - Owns the experiment `run_table`
  - Assigns runs to workers via a REST API
  - Tracks progress and persists experiment state
  - Triggers lifecycle events (e.g. `AFTER_EXPERIMENT`) when finished

- Multiple machines act as **Workers**
  - Request tasks from the master
  - Execute runs locally using the configured experiment
  - Submit results back to the master

- Communication between master and workers is handled via a lightweight **Flask-based HTTP API**
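
The master's bookkeeping from the overview above might look roughly like the sketch below. It is an in-memory stand-in under stated assumptions: the `Master` class, its method names, and the `run_id` key are hypothetical, and the HTTP layer is left out entirely. The real orchestrator exposes this logic through its Flask API and also persists state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Master:
    """Hypothetical stand-in for the orchestrator's run bookkeeping."""

    run_table: list                                # one dict per run, each with a unique "run_id"
    assigned: dict = field(default_factory=dict)   # run_id -> worker name
    results: dict = field(default_factory=dict)    # run_id -> submitted data

    def next_run(self, worker: str) -> Optional[dict]:
        """Hand out the first run that is neither assigned nor finished."""
        for run in self.run_table:
            run_id = run["run_id"]
            if run_id not in self.assigned and run_id not in self.results:
                self.assigned[run_id] = worker
                return run
        return None  # nothing left to hand out

    def submit(self, run_id: str, data: dict) -> bool:
        """Store a result; return True once every run has reported back."""
        self.results[run_id] = data
        self.assigned.pop(run_id, None)
        return len(self.results) == len(self.run_table)


master = Master(run_table=[{"run_id": "run_0"}, {"run_id": "run_1"}])
first = master.next_run("worker-a")    # worker-a receives run_0
second = master.next_run("worker-b")   # worker-b receives run_1
```

The point at which `submit` returns `True` is where a lifecycle event such as `AFTER_EXPERIMENT` would be triggered.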

### How to run it
Start the orchestrator on the master machine:
```bash
python experiment-runner/ examples/<example-dir>/<RunnerConfig*.py> --distribute master <host host_nr --port port_nr>
```
On each worker machine, connect to the master:
```bash
python experiment-runner/ examples/<example-dir>/<RunnerConfig*.py> --distribute worker --master <orchestrator_address>
```
When the experiment finishes, the master shuts down automatically. The workers do not shut down with it: close them manually, or wait for them to exit on their own after 120 s.
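
One way to picture the 120 s worker shutdown is as an idle timeout on the polling loop. The sketch below assumes that reading; `worker_loop`, `fetch_task`, and `execute` are hypothetical names, and the actual worker may implement the timeout differently.

```python
import time
from typing import Callable, Optional

IDLE_TIMEOUT_S = 120  # the 120 s mentioned above


def worker_loop(
    fetch_task: Callable[[], Optional[dict]],
    execute: Callable[[dict], None],
    idle_timeout_s: float = IDLE_TIMEOUT_S,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run tasks until none has arrived for `idle_timeout_s`; return how many ran."""
    executed = 0
    last_activity = clock()
    while clock() - last_activity < idle_timeout_s:
        task = fetch_task()
        if task is None:
            sleep(poll_interval_s)   # nothing to do yet, wait before asking again
            continue
        execute(task)
        executed += 1
        last_activity = clock()      # a completed task resets the idle timer
    return executed
```

The `clock` and `sleep` parameters exist only so the loop can be exercised without waiting in real time.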


## How to cite Experiment Runner

If Experiment Runner is helping your research, please consider citing it as follows. Thank you!
139 changes: 139 additions & 0 deletions Troubleshoating.md
@@ -0,0 +1,139 @@
# Troubleshooting

## 1. Python Package Installation Error

When installing and setting up `experiment-runner`, one common issue is running:

```bash
pip3 install -r requirements.txt
```

and getting the following error:

```text
error: externally-managed-environment

× This environment is externally managed
╰─> To install Python packages system-wide, try apt install
python3-xyz
```

Some Linux distributions (especially Ubuntu 24+, Debian, and Fedora) mark the system Python installation as externally managed (PEP 668). `pip` then refuses to install into it, to avoid breaking packages that the system package manager owns.

### Solution

Run:

```bash
pip3 install -r requirements.txt --break-system-packages
```

### Alternative

Use a Python virtual environment:

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```
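
To see which of the two situations applies before running `pip`, a small check along these lines can help. It is a sketch: the function names are hypothetical, while the `EXTERNALLY-MANAGED` marker file in the standard-library directory is the mechanism defined by PEP 668.

```python
import sys
import sysconfig
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (prefix differs from the base install)."""
    return sys.prefix != sys.base_prefix


def is_externally_managed() -> bool:
    """True when the PEP 668 marker file exists for this interpreter."""
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.is_file()


def pip_advice(in_venv: bool, externally_managed: bool) -> str:
    """Turn the two checks into a short recommendation."""
    if in_venv:
        return "venv active: pip installs stay inside it"
    if externally_managed:
        return "system Python is externally managed: create a venv first"
    return "no PEP 668 marker found: pip can install directly"


print(pip_advice(in_virtualenv(), is_externally_managed()))
```

The venv check comes first because the marker file belongs to the base installation and is still visible from inside a virtual environment.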

---

## 2. EnergiBridge / JoularCore Permission Error

When using EnergiBridge or JoularCore on Linux systems (especially on AMD CPUs), you may encounter the following error when running the experiment:

```text
thread 'main' (33575) panicked at src/cpu/amd.rs:20:76:
called `Result::unwrap()` on an `Err` value: Os { code: 13, kind: PermissionDenied, message: "Permission denied" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

The Rust profiler is trying to access low-level CPU energy counters (MSR / RAPL interfaces), but Linux blocks access for normal users.

### Solution

#### 1. Load the MSR Kernel Module

Run:

```bash
sudo modprobe msr
```

Then verify the device exists:

```bash
ls /dev/cpu/0/msr
```

Expected output:

```text
/dev/cpu/0/msr
```

If the file does not exist, the kernel module did not load correctly.

---

#### 2. Check MSR Permissions

Run:

```bash
ls -l /dev/cpu/0/msr
```

If you see something similar to:

```text
crw------- 1 root root
```

then only the root user can access the CPU energy counters.

---

#### 3. Grant Read Permissions

Run:

```bash
sudo chmod o+r /dev/cpu/*/msr
```

This allows non-root users to read the MSR registers. The change is temporary: it is lost on reboot or when the `msr` module is reloaded.
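
The three checks above can also be run from Python before starting an experiment. This is a minimal sketch with a hypothetical helper name. Note that `os.access` only evaluates the file permission bits for the current user; the kernel can apply further checks when the device is actually opened.

```python
import os
import stat


def describe_msr_access(path: str = "/dev/cpu/0/msr") -> str:
    """Classify the MSR device as 'missing', 'readable', or 'no-read-permission'."""
    if not os.path.exists(path):
        return "missing"             # module not loaded: run `sudo modprobe msr`
    if os.access(path, os.R_OK):
        return "readable"
    return "no-read-permission"      # see the chmod step above


def mode_string(path: str) -> str:
    """Return the permission string that `ls -l` would show, e.g. 'crw-------'."""
    return stat.filemode(os.stat(path).st_mode)


print(describe_msr_access())
```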

---

#### If Nothing Works

Some Linux systems completely block low-level profiling access.

Run:

```bash
cat /proc/sys/kernel/perf_event_paranoid
```

If the value is any of the following:

```text
2
3
4
```

then Linux is restricting access to low-level performance counters for unprivileged users.

#### Temporary Fix

Run:

```bash
echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
```

This lowers the kernel restrictions and allows profiling tools to access hardware counters. The change is temporary: the value returns to the system default at the next reboot.
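
The same check can be scripted. The sketch below reads the value and applies the rule used in this section (2 or higher counts as restricted); the function names are hypothetical.

```python
from pathlib import Path
from typing import Optional

PARANOID_FILE = Path("/proc/sys/kernel/perf_event_paranoid")


def parse_paranoid_level(text: str) -> int:
    """Convert the file content (e.g. '2\\n') into an integer level."""
    return int(text.strip())


def is_restricted(level: int, threshold: int = 2) -> bool:
    """Treat levels at or above `threshold` as restricted, following the values listed above."""
    return level >= threshold


def read_paranoid_level(path: Path = PARANOID_FILE) -> Optional[int]:
    """Return the current level, or None where the file does not exist (non-Linux systems)."""
    if not path.exists():
        return None
    return parse_paranoid_level(path.read_text())
```

Calling `is_restricted(read_paranoid_level())` on a Linux machine then tells you whether the temporary fix is needed; handle the `None` case first on other systems.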
2 changes: 1 addition & 1 deletion examples/hello-world-fibonacci/README.md
@@ -18,6 +18,6 @@ python experiment-runner/ examples/hello-world-fibonacci/RunnerConfig.py
## Results

The results are generated in the `examples/hello-world-fibonacci/experiments` folder.

In case there are anomalies such as null, absent, or negative values, a report will be generated in the `examples/hello-world-fibonacci/experiments` folder.
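
As an illustration of what such a check involves, the sketch below flags the three anomaly kinds named above in rows read from a CSV file. It is not the project's validator: the function name, the message format, and the sample columns are hypothetical.

```python
import csv
import io
from typing import Dict, List

NULL_MARKERS = {"", "null", "nan", "none"}


def find_anomalies(rows: List[Dict[str, str]], required: List[str]) -> List[str]:
    """Report absent, null, and negative values in the required columns."""
    problems = []
    for index, row in enumerate(rows):
        for column in required:
            value = row.get(column)
            if value is None:
                problems.append(f"row {index}: '{column}' is absent")
            elif value.strip().lower() in NULL_MARKERS:
                problems.append(f"row {index}: '{column}' is null")
            else:
                try:
                    number = float(value)
                except ValueError:
                    continue  # non-numeric text is not checked in this sketch
                if number < 0:
                    problems.append(f"row {index}: '{column}' is negative")
    return problems


# Made-up sample data, only to exercise the function.
sample = "run_id,energy\nrun_0,12.5\nrun_1,\nrun_2,-3.0\n"
rows = list(csv.DictReader(io.StringIO(sample)))
report = find_anomalies(rows, ["energy"])
```
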
**!!! WARNING !!!**: COLUMNS IN THE `energibridge.csv` FILES CAN BE DIFFERENT ACROSS MACHINES.
ADJUST THE DATAFRAME COLUMN NAMES ACCORDINGLY.
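
One way to cope with differing column names is to look the column up from a list of candidates instead of hardcoding a single name. The sketch below uses the standard `csv` module to stay dependency-free; the same lookup works on a DataFrame's column list. The candidate names and the sample header are assumptions, so check the header of your own `energibridge.csv`.

```python
import csv
import io
from typing import Iterable, Optional

# Assumed candidate names; adjust them to what your machine actually produces.
ENERGY_COLUMN_CANDIDATES = ("PACKAGE_ENERGY (J)", "CPU_ENERGY (J)")


def pick_column(header: Iterable[str], candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate present in the header, or None if none match."""
    available = set(header)
    for candidate in candidates:
        if candidate in available:
            return candidate
    return None


# Made-up sample file content, only to exercise the lookup.
sample = "Time,CPU_ENERGY (J)\n0,1.5\n1,3.0\n"
reader = csv.DictReader(io.StringIO(sample))
energy_column = pick_column(reader.fieldnames or [], ENERGY_COLUMN_CANDIDATES)
```

Returning `None` rather than raising lets the caller decide whether a missing energy column should abort the analysis or be reported as an anomaly.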