
3D Soft-Arm Tracking with Reservoir Computing

Supplementary code for:

Neural reservoir control of a bio-hybrid soft arm
Naughton N, Tekinalp A, Shivam K, Kim SH, Khairnar A, Kindratenko V, and Gazzola M, Proceedings of the National Academy of Sciences, 2026.
DOI: 10.1073/pnas.2522094123

This repository contains the simulation and reinforcement learning code used to train and evaluate two reservoir computing neural network architectures for 3D soft-arm target tracking tasks.


Repository Structure

3D_tracking/
├── core/                          # Shared simulation framework
│   ├── elastica/                  # PyElastica soft-body physics package
│   ├── ArmMuscleModel/            # Soft arm model with pressure-driven muscles
│   ├── MuscleTorquesWithBspline/  # Muscle torque model
│   ├── core_functions.py          # Shared RL environment logic
│   └── set_environment_muscles.py # Elastica environment setup
│
├── rc_ann/                        # Reservoir Computing (ANN readout) policy
│   ├── policy_train.py
│   ├── policy_evaluate.py
│   ├── input_parameters.json
│   ├── weights.zip                # Reservoir weight matrices (W_reservoir.npy, W_in.npy)
│   └── POLICY.zip                 # Pre-trained policy weights
│
├── rc_spiking/                    # Reservoir Computing (Spiking NN readout) policy
│   ├── policy_train.py
│   ├── policy_evaluate.py
│   ├── run_eval.py
│   ├── core_functions.py          # Extends shared core with Nengo/spiking logic
│   ├── input_parameters.json
│   ├── nengo_loihi/               # Nengo-loihi library
│   ├── weights.zip                # Spiking reservoir weight matrices (pkl files)
│   └── POLICY.zip
│
├── requirements.txt
├── LICENSE
└── README.md

Both versions use the same arm stiffness (Young's modulus scaling E = 0.125) and differ only in the neural network architecture used for the RL policy readout.
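
For intuition, the sketch below shows the shared reservoir-computing structure in a few lines of NumPy. It is illustrative only, not the repository's implementation; the dimensions and scaling constants are invented for the example. The point is that W_reservoir and W_in are fixed and random, and only the readout is trained.

    # Illustrative reservoir-computing skeleton (not the repository's code).
    # W_reservoir and W_in are fixed random matrices; only the readout
    # (an ANN in rc_ann, a spiking network in rc_spiking) is trained by PPO.
    import numpy as np

    def reservoir_step(x, u, W_reservoir, W_in):
        """One reservoir update: fixed random weights, tanh nonlinearity."""
        return np.tanh(W_reservoir @ x + W_in @ u)

    n_neurons, n_inputs, n_actions = 512, 11, 6   # hypothetical sizes
    rng = np.random.default_rng(0)
    W_reservoir = 0.1 * rng.standard_normal((n_neurons, n_neurons))  # fixed
    W_in = 0.1 * rng.standard_normal((n_neurons, n_inputs))          # fixed
    W_out = np.zeros((n_actions, n_neurons))                         # trained

    x = np.zeros(n_neurons)
    u = rng.standard_normal(n_inputs)   # observation (arm state + target)
    x = reservoir_step(x, u, W_reservoir, W_in)
    action = W_out @ x                  # linear stand-in for the trained readout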


Installation

Requirements

  • Python 3.8 (other versions not tested)

Steps

  1. Create a Python 3.8 environment (conda recommended):

    conda create -n 3d_tracking python=3.8
    conda activate 3d_tracking
  2. Install dependencies:

    pip install -r requirements.txt

Weight files: Each directory includes a weights.zip with the fixed reservoir weight matrices, which are pre-generated rather than trained. They are extracted automatically the first time you run policy_evaluate.py or policy_train.py; no manual step is needed.
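
The scripts handle extraction for you, but if you want to inspect the rc_ann reservoir matrices yourself, here is a minimal sketch (assuming only the .npy file names listed in the tree above, that the archive unpacks them at the top level, and that you run it from inside rc_ann):

    # Manually extract and inspect the pre-generated rc_ann reservoir weights.
    # Illustrative; policy_evaluate.py / policy_train.py do this automatically.
    import zipfile
    import numpy as np

    with zipfile.ZipFile("weights.zip") as zf:
        zf.extractall(".")

    W_reservoir = np.load("W_reservoir.npy")  # fixed recurrent weights
    W_in = np.load("W_in.npy")                # fixed input weights
    print(W_reservoir.shape, W_in.shape)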


Usage

All scripts must be run from inside the respective directory so relative paths (to input_parameters.json, weight files, etc.) resolve correctly.

Evaluate a pre-trained policy

Each directory contains a POLICY.zip file with the pre-trained PPO weights reported in the paper. These are loaded automatically by policy_evaluate.py. To evaluate a pre-trained policy, run:

cd rc_ann        # or rc_spiking
python policy_evaluate.py

Train from scratch

To train your own policy from scratch, run:

cd rc_ann        # or rc_spiking
python policy_train.py

Note that the current training setup has "num_procs": 120 in input_parameters.json, meaning that 120 Elastica environments are launched in parallel during training. This will overwhelm most desktop CPUs, so only use that setting on a server node with enough cores. As a rule of thumb, set num_procs to the number of CPU cores on your machine, as sketched below.
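
A minimal way to do this programmatically (illustrative; editing the file by hand works just as well):

    # Set num_procs to this machine's CPU core count before training.
    import json
    import multiprocessing

    with open("input_parameters.json") as f:
        params = json.load(f)

    params["num_procs"] = multiprocessing.cpu_count()

    with open("input_parameters.json", "w") as f:
        json.dump(params, f, indent=4)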


Configuration (input_parameters.json)

Each directory contains an input_parameters.json file with hyperparameters:

| Key | Description |
| --- | --- |
| network_arch | Architecture type ("RC") |
| seed | Random seed |
| RC_network_dim | Number of reservoir neurons |
| arm_dim | Spatial dimension of the arm (3 for 3D) |
| num_ctrl_pts | Number of muscle control points |
| target_traj_type | Target trajectory type ("non_random") |
| speed_coefficient | Target movement speed coefficient |
| E | Young's modulus scaling (arm stiffness) |
| num_procs | Number of parallel simulation instances |
| episode_length | Length of each training episode (seconds) |
| policy_updates | Total simulated training time (seconds) |
| RL_update_time | RL policy update interval (seconds) |
| reservoir_density | Reservoir connectivity density |
| reservoir_spectral_radius | Reservoir spectral radius |
| input_density | Input weight matrix density |
| n_epochs | PPO training epochs per update |
| learning_rate | PPO learning rate |
| minibatch_size | PPO minibatch size |
| gae_lambda | GAE lambda for advantage estimation |
| gamma | Discount factor |
| clip_range | PPO clip range |
| ent_coef | Entropy coefficient |
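
For intuition about the reservoir hyperparameters, the sketch below shows a common recipe for constructing such a weight matrix: draw random weights, sparsify to the target density, then rescale to the target spectral radius. It is illustrative only; the shipped weights.zip matrices are pre-generated, and this is not necessarily how the repository built them.

    # Common recipe for generating a reservoir weight matrix from
    # RC_network_dim, reservoir_density, and reservoir_spectral_radius.
    # Illustrative only; the repository ships pre-generated weights.
    import numpy as np

    def make_reservoir(n, density, spectral_radius, seed=0):
        rng = np.random.default_rng(seed)
        W = rng.uniform(-1.0, 1.0, size=(n, n))
        W *= rng.random((n, n)) < density          # keep ~density of connections
        radius = np.max(np.abs(np.linalg.eigvals(W)))
        return W * (spectral_radius / radius)      # rescale largest |eigenvalue|

    W = make_reservoir(n=512, density=0.1, spectral_radius=0.9)  # example values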

Dependencies

An older version of stable-baselines3 was used during development. Key package versions needed to run this code are listed below (see requirements.txt for the full pinned versions):

| Package | Version | Purpose |
| --- | --- | --- |
| stable-baselines3 | 1.5.0 | PPO reinforcement learning |
| torch | 2.1.2 | Deep learning backend |
| nengo | 3.1.0 | Spiking neural network framework |
| snntorch | 0.9.1 | Spiking neuron models |
| gym | 0.21.0 | RL environment interface |
| numpy | 1.22.3 | Numerical computing |
| numba | 0.55.2 | JIT compilation for PyElastica |

Local versions of pyelastica and nengo-loihi are provided in this repository, as minor custom changes were made to each to ensure compatibility.
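
To confirm the pinned packages resolved correctly in your environment, a quick check (illustrative, not part of the repository's scripts):

    # Print installed versions of the key dependencies (Python 3.8+).
    from importlib.metadata import version

    for pkg in ["stable-baselines3", "torch", "nengo", "snntorch",
                "gym", "numpy", "numba"]:
        print(pkg, version(pkg))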


Citation

If you use this code, please cite:

@article{Naughton.2026,
  author = {Noel Naughton and Arman Tekinalp and Keshav Shivam and Seung Hyun Kim and Apoorva Khairnar and Volodymyr Kindratenko and Mattia Gazzola},
  title = {Neural reservoir control of a bio-hybrid soft arm},
  journal = {Proceedings of the National Academy of Sciences},
  volume = {123},
  number = {17},
  pages = {e2522094123},
  year = {2026},
  doi = {10.1073/pnas.2522094123},
  URL = {https://www.pnas.org/doi/abs/10.1073/pnas.2522094123},
  eprint = {https://www.pnas.org/doi/pdf/10.1073/pnas.2522094123},
}

License

MIT License — see LICENSE for details.

Code refactored for publication using the GitHub Copilot CLI tool.
