From 0deb4b73b6db6ab4ec91bf3ac88411f8ffa14523 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Louis-F=C3=A9lix=20Nothias?= Date: Tue, 31 Mar 2026 15:36:11 +0200 Subject: [PATCH 1/4] Update co-authorship list in licensing notes --- docs/licensing-notes.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/licensing-notes.md b/docs/licensing-notes.md index 06b58f4..85e7fda 100644 --- a/docs/licensing-notes.md +++ b/docs/licensing-notes.md @@ -25,6 +25,6 @@ This wording is intended for repository governance documentation and remains sub ## Co-authorship context (informational) -Current named research co-authors: Martin Legrand, Tao Jiang, Matthieu Feraud, Benjamin Navet, and Louis-Felix Nothias. +Current named manuscript co-authors: Martin Legrand, Tao Jiang, Matthieu Feraud, Benjamin Navet, Yousouf Taghzouti, Fabien Gandon, Elise Dumont, and Louis-Felix Nothias. This co-authorship note is informational and does not itself determine legal ownership or licensing authority. From 33d74c0fb684c5b64673d451871b1b3c27f83500 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Louis-F=C3=A9lix=20Nothias?= Date: Tue, 31 Mar 2026 15:37:02 +0200 Subject: [PATCH 2/4] Update README.md to enhance project description and clarify usage instructions --- README.md | 148 ++++++++++++++++++++++++++++++++++-------------------- 1 file changed, 93 insertions(+), 55 deletions(-) diff --git a/README.md b/README.md index 893a9be..cbfa5e2 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@

Toolomics

- A suite of MCP-based Tools from the HolobiomicsLab. Used by AI-Agents such as **Mimosa-AI** + Companion platform for MCP server management and workspace-isolated scientific tool execution for Mimosa and other MCP-compatible agents.

@@ -21,52 +21,73 @@ --- -> ***Toolomics*** — deploys containerized tools, manages isolated instances, and enables file sharing across AI agents for bioinformatics, metabolomics, molecular docking, and beyond. +> ***Toolomics*** exposes computational tools as discoverable MCP services, manages isolated multi-instance workspaces, and lets agents share files across scientific workflows. -**Use cases:** -- Deploy MCP servers for browser automation, PDF processing, and data extraction -- Run isolated, multi-instance agent workspaces with automatic resource management -- Orchestrate containerized bioinformatics pipelines (XCMS, RStudio, Redis) with zero config +Toolomics is the companion MCP server management project described alongside Mimosa in [`main_arxiv.tex`](main_arxiv.tex). In this repository, that means: +- discovering MCP services from `server.py` and `docker-compose.yml` definitions under `mcp_host/` +- assigning ports and recording them in instance-specific `config_.json` files +- isolating workspaces, Docker projects, volumes, and auxiliary services per deployment instance +- making files created by one MCP server immediately available to other MCP servers through a shared workspace -## Install & deploy tools +## Quick Start -### Deploy all tools automatically +Run Toolomics against a workspace and a port range: ```bash ./start.sh ``` -### Deploy using python script +Example: -***Not recommanded, start.sh will handle python, requirements and workpsace installation automatically.*** +```bash +./start.sh 5000 5099 workspace_mimosa +``` + +On first run, Toolomics will: +1. check Python and `pip` +2. optionally install `requirements.txt` +3. create or reuse the requested workspace +4. derive an instance ID from the workspace path +5. 
create or update `config_.json` with discovered services and assigned ports -First, install the required dependencies, you can use either pip or the faster UV package manager: +Newly discovered services are added with `"enabled": false` by default. Enable the MCP servers you want in the generated config file, then rerun `./start.sh`. -**1. Install dependencies:** +### Manual Deployment + +If you prefer to run the deployment script directly: + +**1. Install dependencies** ```bash python3.10 -m pip install -r requirements.txt -# or using UV +# or uv pip install -r requirements.txt ``` -**2. Run script:** +**2. Run the deployment manager** ```bash -python3.10 deploy.py --config config.json --workspace --host_port_min --host_port_max +python3.10 deploy.py --config config.json --mcp-dir mcp_host --workspace --host_port_min --host_port_max ``` -## Centralized File Management +Passing `--config config.json` is supported, but `deploy.py` will automatically expand it to an instance-specific file such as `config_86517947.json` based on the workspace path. + +## Centralized Workspace -All MCP servers execute in a centralized **workspace directory** (default: `workspace/`). This means: +All MCP servers execute against a centralized workspace directory (default: `workspace/`). This means: -- **Browser MCP** downloads files → `workspace/downloaded_file.pdf` -- **PDF MCP** processes files → `workspace/extracted_text.txt` -- **Any MCP** creates files → `workspace/output_file.json` +- Browser MCP downloads files to the workspace +- PDF MCP processes files already present in the workspace +- Other MCP servers can consume the same files without copying them between tool-specific directories + +Example paths: +- `workspace/downloaded_file.pdf` +- `workspace/extracted_text.txt` +- `workspace/output_file.json` This centralized approach ensures that AI agents can easily find and work with files across different MCP tools without needing to track file locations. 
## Multi-Instance Deployment -Toolomics supports running **multiple independent instances simultaneously**, each with its own workspace and Docker service isolation. +Toolomics supports running multiple independent instances simultaneously, each with its own workspace and Docker service isolation. ### How It Works @@ -76,14 +97,14 @@ Each instance is automatically assigned a unique **instance ID** (8-character ha This means each instance has its own configuration and doesn't interfere with others. -**Example: Deploy two instances concurrently** +**Example: deploy two instances concurrently** ```bash # Terminal 1: Instance for user Martin -start.sh 5000 5100 workspace_martin +./start.sh 5000 5099 workspace_martin # Terminal 2: Instance for user John (simultaneous) -start.sh 5100 5200 workspace_john +./start.sh 5100 5199 workspace_john ``` ### Automatic Resource Isolation @@ -95,31 +116,49 @@ Each instance automatically gets isolated resources: | **Workspace** | Separate directory (`workspace_martin/`, `workspace_john/`) | | **Docker Containers** | Suffixed with instance ID (`xcmsrocker_a3f2b1c9`, `xcmsrocker_f7e2d4a1`) | | **Data Volumes** | Instance-specific names (`rstudio_data_a3f2b1c9`, `redis-data_f7e2d4a1`) | -| **MCP Server Ports** | Different port ranges (5000-5100 vs 5100-5200) | +| **MCP Server Ports** | Different port ranges (5000-5099 vs 5100-5199) | | **Auxiliary Ports** | Dynamic allocation (8787→9537, 8080→9037, etc.) | -## Using MCP with Your Client +This multi-tenant, workspace-isolated design is the same property referenced in the manuscript when Toolomics is described as the companion discovery and execution layer for Mimosa. + +## Discovering And Using MCP Services -To interact with the tools using a client (e.g., for your AI agent), you can use the `fastmcp` library. +To interact with the tools using a client such as Mimosa or another MCP-compatible agent, you can use the generated config file directly or scan a predefined local port range. 
### Finding the MCP Port -Each MCP server is assigned a port, which is recorded in the `config.json` file. For example: +Each MCP server is assigned a port, which is recorded in the instance-specific config file. For example: ```json [ - { - "mcp_host/browser/server.py": 5002 - }, - { - "mcp_host/Rscript/server.py": 5001 - }, - { - "mcp_host/files/csv/server.py": 5101 - } + { + "path": "mcp_host/pdf/server.py", + "port": 5002, + "enabled": true + }, + { + "path": "mcp_host/image_analysis/server.py", + "port": 5006, + "enabled": true + }, + { + "path": "mcp_host/shell/docker-compose.yml", + "port": 5012, + "enabled": true + } ] ``` +### Scanning A Predefined Port Range + +Toolomics includes a helper script that scans `localhost:5000-5200` and enumerates active MCP tools: + +```bash +python3 discover_mcp.py +``` + +This mirrors the local port-range discovery pattern described in the manuscript for Mimosa's tool discovery layer. + ### Example Client Code Here is an example of how to use a client to interact with an MCP server running on port `5002`: @@ -146,17 +185,16 @@ async def main(): # Other MCP tools can access it from the same location ``` -## Adding a New MCP +## Adding A New MCP You can easily add a new tool as an MCP server. ### Steps to Add a New MCP -1. Create a `server.py` file with your MCP implementation, it should take the port number as first argument (eg: `server.py 5003`). -2. Place the file in a subfolder of the `mcp_host` directory. For example, to add a metabolomics-related tool, create a subfolder like `mcp_host/your_tool_name`. - -The `deploy.py` script will look for new `server.py` file, attribute a port for your script and add it to `config.json` (unless you manually did by modifying the config.json), finally it will run your script with the assigned port as first argument. - +1. Create a `server.py` file with your MCP implementation. It should accept the assigned port as either an environment variable or the first command-line argument. +2. 
Place the file in a subfolder of `mcp_host/`, for example `mcp_host/your_tool_name/server.py`. +3. Run `./start.sh` or `deploy.py` to let Toolomics discover the service and assign it a port. +4. Set `"enabled": true` for the new service in the generated `config_.json`, then rerun deployment. ### Example MCP Implementation @@ -165,16 +203,17 @@ The `fastmcp` library simplifies the creation of MCP servers. Here's a basic exa ```python #!/usr/bin/env python3 -from fastmcp import FastMCP +import os +import sys from pathlib import Path +from fastmcp import FastMCP + project_root = Path(__file__).resolve().parent.parent.parent -sys.path.append(str(project_root)) # Add 'a/' to Python's search path +sys.path.append(str(project_root)) from shared import CommandResult, run_bash_subprocess, return_as_dict -description = """ -a calculator that ... -""" +description = "A calculator MCP." mcp = FastMCP( name="calculator", @@ -200,13 +239,9 @@ if __name__ == "__main__": mcp.run(transport="streamable-http", port=port, host="0.0.0.0") ``` -### Automatic Port Assignment - -When you run the `start.sh` or `deploy.py` script for the first time, it will automatically assign a port to your new MCP server and save the mapping in the `config.json` file. - ## Dockerizing an MCP Server -For MCP servers that require isolated dependencies or need to run in a containerized environment (e.g., for ML models, system tools, or heavy dependencies), you can deploy them using Docker. +For MCP servers that require isolated dependencies or need to run in a containerized environment, you can deploy them with Docker. This is one of the main ways Toolomics keeps tool dependencies isolated across concurrent scientific workflows. 
### How It Works @@ -214,6 +249,7 @@ If a `docker-compose.yml` file exists in the same directory as your `server.py`, - **Automatically deploy the server in Docker** instead of running it directly on the host - **Skip the standalone Python execution** to avoid duplicate deployments - **Pass the assigned port** to the Docker container via the `MCP_PORT` environment variable +- **Pass instance isolation metadata** such as `INSTANCE_ID` and `WORKSPACE_PATH` to the container ### Steps to Dockerize an MCP @@ -276,8 +312,9 @@ services: - "${MCP_PORT}:${MCP_PORT}" environment: - MCP_PORT=${MCP_PORT} + - INSTANCE_ID=${INSTANCE_ID} volumes: - - ../../workspace:/workspace + - ../../${WORKSPACE_PATH}:/app/workspace:rw ``` **Important**: The build context must be set to the project root (`../..`) to allow the Dockerfile to access `shared.py` and other project files. @@ -286,8 +323,9 @@ services: The deployment script will automatically: - Detect the `docker-compose.yml` -- Assign a port (5000-5099 range for mcp_host) +- Assign a port in the selected host range - Set the `MCP_PORT` environment variable +- Set `INSTANCE_ID` and `WORKSPACE_PATH` - Build and start the Docker container - Skip running `server.py` directly on the host From eaa2c0c276e699dc960da18b4f22398202a42304 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Louis-F=C3=A9lix=20Nothias?= Date: Tue, 31 Mar 2026 15:37:51 +0200 Subject: [PATCH 3/4] Enhance error handling in deployment script and update MCP server discovery documentation --- deploy.py | 28 ++++++++++++++++++++-------- discover_mcp.py | 4 ++-- start.sh | 13 +++++++++++++ 3 files changed, 35 insertions(+), 10 deletions(-) diff --git a/deploy.py b/deploy.py index ee3cd78..90c4a7a 100644 --- a/deploy.py +++ b/deploy.py @@ -510,7 +510,7 @@ def load_config(self) -> Dict[str, dict]: 'enabled': item.get('enabled', True) # Default to enabled } else: - raise ValueError("Can't parse config.json file") + raise ValueError(f"Can't parse config file: {self.config_path}") 
logger.info(f"Successfully loaded {len(config_dict)} items from config (enabled: {sum(1 for v in config_dict.values() if v.get('enabled'))})") return config_dict @@ -620,7 +620,10 @@ def assign_ports(self, server_files: List[Path], compose_files: List[Path] = Non config[server_str] = {'port': next_host_port, 'enabled': False} used_ports.add(next_host_port) - logger.info(f"Assigned host port {next_host_port} to {server_str} (disabled - edit config to enable)") + logger.info( + f"Assigned host port {next_host_port} to {server_str} " + f"(disabled - edit {self.config_path} to enable)" + ) next_host_port += 1 # Assign ports to docker-compose files @@ -638,7 +641,10 @@ def assign_ports(self, server_files: List[Path], compose_files: List[Path] = Non config[compose_str] = {'port': next_host_port, 'enabled': False} used_ports.add(next_host_port) - logger.info(f"Assigned host port {next_host_port} to {compose_str} (disabled - edit config to enable)") + logger.info( + f"Assigned host port {next_host_port} to {compose_str} " + f"(disabled - edit {self.config_path} to enable)" + ) next_host_port += 1 self.save_config(config) @@ -751,7 +757,7 @@ def _deploy_docker_services(self, compose_files: List[Path], port_config: Dict[s # Save config if any ports were reassigned if config_updated: - logger.info("Updating config.json with new port assignments") + logger.info(f"Updating {self.config_manager.config_path} with new port assignments") self.config_manager.save_config(port_config) if started_count > 0: @@ -759,7 +765,10 @@ def _deploy_docker_services(self, compose_files: List[Path], port_config: Dict[s logger.info("Waiting for Docker services to start...") time.sleep(3) elif disabled_count > 0: - logger.info(f"⚠️ All {disabled_count} Docker services are disabled. Change config.json to enable.") + logger.info( + f"⚠️ All {disabled_count} Docker services are disabled. " + f"Change {self.config_manager.config_path} to enable them." 
+ ) def _deploy_mcp_servers(self, server_files: List[Path], port_config: Dict[str, dict], host_port_min: int = HOST_PORT_MIN, host_port_max: int = HOST_PORT_MAX): @@ -811,12 +820,15 @@ def _deploy_mcp_servers(self, server_files: List[Path], port_config: Dict[str, d # Save config if any ports were reassigned if config_updated: - logger.info("Updating config.json with new port assignments") + logger.info(f"Updating {self.config_manager.config_path} with new port assignments") self.config_manager.save_config(port_config) logger.info(f"Started {started_count} MCP servers ({disabled_count} disabled)") if started_count == 0: - raise Exception("⚠️ No MCP server enabled, change config.json and select MCP servers to enable.") + raise Exception( + f"⚠️ No MCP server enabled. Change {self.config_manager.config_path} " + "and select MCP servers to enable." + ) def main(): parser = argparse.ArgumentParser(description="Deploy MCP servers with centralized workspace file management") @@ -852,4 +864,4 @@ def main(): sys.exit(1) if __name__ == "__main__": - main() \ No newline at end of file + main() diff --git a/discover_mcp.py b/discover_mcp.py index d4c4c0a..f829faa 100644 --- a/discover_mcp.py +++ b/discover_mcp.py @@ -7,7 +7,7 @@ from fastmcp import Client async def discover_mcp_servers(): - """Discover MCP servers on ports 5000-5050 and list their tools.""" + """Discover MCP servers on ports 5000-5200 and list their tools.""" print("🔍 Discovering MCP servers on ports 5000-5200...") for port in range(5000, 5201): @@ -27,4 +27,4 @@ async def discover_mcp_servers(): if __name__ == "__main__": print("🧪 Starting MCP Server Discovery") - asyncio.run(discover_mcp_servers()) \ No newline at end of file + asyncio.run(discover_mcp_servers()) diff --git a/start.sh b/start.sh index fc3e468..634efe2 100755 --- a/start.sh +++ b/start.sh @@ -194,6 +194,19 @@ echo "Deploying MCP servers..." 
$PYTHON deploy.py --config config.json --mcp-dir mcp_host --host_port_min "$START_PORT" --host_port_max "$END_PORT" --workspace $WORKSPACE & HOST_PID=$! wait $HOST_PID +DEPLOY_EXIT_CODE=$? + +echo "" +if [ $DEPLOY_EXIT_CODE -ne 0 ]; then + echo "=== DEPLOYMENT FAILED ===" + echo "deploy.py exited with status $DEPLOY_EXIT_CODE" + echo "" + echo "Instance-specific config file: $INSTANCE_CONFIG" + echo "If this was a first run, enable the MCP services you want in that file" + echo "and rerun this command." + echo "" + exit $DEPLOY_EXIT_CODE +fi # After deployment, show the config file location echo "" From beb77512757b095d9e45b4e0193e7b891590da26 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Louis-F=C3=A9lix=20Nothias?= Date: Tue, 31 Mar 2026 16:13:39 +0200 Subject: [PATCH 4/4] Update README.md to clarify Toolomics project description and its functionalities --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index cbfa5e2..68f7ea6 100644 --- a/README.md +++ b/README.md @@ -23,7 +23,7 @@ > ***Toolomics*** exposes computational tools as discoverable MCP services, manages isolated multi-instance workspaces, and lets agents share files across scientific workflows. -Toolomics is the companion MCP server management project described alongside Mimosa in [`main_arxiv.tex`](main_arxiv.tex). In this repository, that means: +In this repository, Toolomics is responsible for: - discovering MCP services from `server.py` and `docker-compose.yml` definitions under `mcp_host/` - assigning ports and recording them in instance-specific `config_.json` files - isolating workspaces, Docker projects, volumes, and auxiliary services per deployment instance
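
A minimal sketch of how a client might read the instance-specific config described in PATCH 2/4, assuming the file is a JSON list of objects with `path`, `port`, and `enabled` keys as in the README example. The helper name `enabled_services` is hypothetical and is not part of `deploy.py`. Treating a missing `enabled` key as enabled follows the `item.get('enabled', True)` line visible in the `load_config` hunk.

```python
import json
from pathlib import Path


def enabled_services(config_path):
    """Return {service path: port} for config entries marked as enabled.

    Hypothetical helper for illustration; assumes the list-of-objects
    layout shown in the README hunk of PATCH 2/4.
    """
    entries = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return {
        entry["path"]: entry["port"]
        for entry in entries
        # A missing "enabled" key counts as enabled, matching the default
        # used by load_config in the deploy.py hunk.
        if entry.get("enabled", True)
    }
```

A caller could pass the path of the generated instance config and connect only to the returned ports, rather than scanning the whole port range.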