This repository is for testing methods to reduce LLM context occupied by MCP tool descriptions.
First, install pre-commit hooks with:
```sh
uv run pre-commit install
```

This prevents you from making a git commit that fails the checks from .pre-commit-config.yaml, namely when:
- Large files are added
- Formatting is bad
- Ruff's linter fails
- Pyrefly's type checking fails
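For reference, a .pre-commit-config.yaml covering these checks might look like the sketch below. The revisions are illustrative, and the Pyrefly hook is assumed to be a local hook wrapping `uv run` — check the actual file in this repo for the real configuration:

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # illustrative revision
    hooks:
      - id: check-added-large-files  # rejects large files
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.9  # illustrative revision
    hooks:
      - id: ruff          # linter
      - id: ruff-format   # formatting
  - repo: local
    hooks:
      - id: pyrefly       # type checking; assumed to run via uv
        name: pyrefly
        entry: uv run pyrefly check
        language: system
        types: [python]
```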
Next, you need to add LLM credentials. For that, copy the .env.example file
to .env:
```sh
cp .env.example .env
```

and fill out all the variables.
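As a sketch, the file includes at least the LLM credentials used by the tools in this repo; your .env.example may list more variables, and the values here are placeholders:

```
LLM_BASE_URL=https://api.example.com/v1
LLM_API_KEY=your-api-key
```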
Finally, run the benchmark with:
```sh
uv run --env-file .env benchmark.py
```

If you want to chat with the agent, run the web client via:
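Since the benchmark is launched with `--env-file`, the credentials arrive as ordinary environment variables. A minimal sketch of reading and validating them at startup (the helper name is hypothetical; the variable names are the ones this README asks you to set in .env):

```python
import os

def load_llm_credentials() -> dict[str, str]:
    """Hypothetical helper: read the LLM credentials loaded via --env-file."""
    creds = {
        "LLM_BASE_URL": os.environ.get("LLM_BASE_URL", ""),
        "LLM_API_KEY": os.environ.get("LLM_API_KEY", ""),
    }
    # Fail fast with a clear message instead of a confusing API error later.
    missing = [name for name, value in creds.items() if not value]
    if missing:
        raise RuntimeError(f"Missing variables in .env: {', '.join(missing)}")
    return creds
```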
```sh
uv run --env-file .env uvicorn partial_mcp.web:app --port 8000
```

To use the Spider benchmark, you first need to install the public Spider database (1.8 GB).
You can either download it manually from this link; in that case, remember to place it into the data directory.
Or you can run the following script in the data directory (curl didn't work for me, since it's Google Drive):

```sh
pip install gdown
gdown https://drive.google.com/uc?id=1403EGqzIDoHMdQF4c9Bkyl7dZLZ5Wt6J
```

Then unzip the downloaded archive:
```sh
unzip spider_data.zip
```

You can remove the archive to save some space:
```sh
rm spider_data.zip
```

Before launching the text2sql_tool that is already in this repo, you need to set LLM credentials in the .env file, namely LLM_BASE_URL and LLM_API_KEY.
Dev note: It's a working MCP tool, but proper integration with the PydanticAI Agent isn't done yet. For now, it will be called as a normal Python function.
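Until that integration lands, calling the tool directly looks roughly like the sketch below. The signature and return value are assumptions for illustration, and a stand-in body replaces the real Spider-backed implementation:

```python
# Stand-in for the real tool; the actual signature in this repo may differ.
def text2sql_tool(question: str, db_id: str) -> str:
    """Translate a natural-language question into SQL for a Spider database."""
    # The real implementation would prompt the LLM with the schema of db_id.
    return f"SELECT count(*) FROM singer  -- generated for: {question!r}"

# Called as a normal Python function, with no MCP transport involved.
sql = text2sql_tool("How many singers do we have?", db_id="concert_singer")
```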
The baseline tool can be launched via the spider_benchmark.py file. Instructions on how to run your own tool instead will be provided shortly.
This repo uses the Spider 1.0 benchmark. A migration to UNITE or Spider 2.0 is possible later if we want a more robust benchmark; the APIs are mostly similar.