ClimSight is an advanced tool that integrates Large Language Models (LLMs) with climate data to provide localized climate insights for decision-making. ClimSight transforms complex climate data into actionable insights for agriculture, urban planning, disaster management, and policy development.
The target audience includes researchers, providers of climate services, policymakers, agricultural planners, urban developers, and other stakeholders who require detailed climate information to support decision-making. ClimSight is designed to democratize access to climate data, empowering users with insights relevant to their specific contexts.
ClimSight distinguishes itself through several key advancements:
- Integration of LLMs: ClimSight leverages state-of-the-art LLMs to interpret complex climate-related queries, synthesizing information from diverse data sources.
- Multi-Source Data Integration: Unlike conventional systems that rely solely on structured climate data, ClimSight integrates information from multiple sources.
- Evidence-Based Approach: ClimSight ensures contextually accurate answers by retrieving relevant knowledge from scientific reports, IPCC documents, and geographical databases.
- Modular Architecture: Specialized components handle distinct tasks, such as data retrieval, contextual understanding, and result synthesis, leading to more accurate outputs.
- Real-World Applications: ClimSight is validated through practical examples, such as assessing climate risks for specific agricultural activities and urban planning scenarios.
This is the recommended installation method to get the latest features and updates.
# Clone the repository
git clone https://github.com/CliDyn/climsight.git
cd climsight
# Create and activate the environment
mamba env create -f environment.yml
conda activate climsight
# Download required data
python download_data.py
# Optional: download DestinE data (large ~12 GB, not downloaded by default)
python download_data.py DestinE

# Clone the repository
git clone https://github.com/CliDyn/climsight.git
cd climsight
# Create and activate a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Download required data
python download_data.py
# Optional: download DestinE data (large ~12 GB, not downloaded by default)
python download_data.py DestinE

The Docker container provides a stable release (v1.0.0) of ClimSight. For the latest features, please install from source as described above.
# Make sure your OpenAI API key is set as an environment variable
export OPENAI_API_KEY="your-api-key-here"
# Pull and run the container
docker pull koldunovn/climsight:stable
docker run -p 8501:8501 -e OPENAI_API_KEY=$OPENAI_API_KEY koldunovn/climsight:stable

Then open http://localhost:8501/ in your browser.
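The container can take a moment to start serving after `docker run`. The following is a minimal sketch, not part of ClimSight, that polls the URL above until it answers with HTTP 200. The function name `wait_until_up` and its timeout defaults are illustrative assumptions.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_up(url: str = "http://localhost:8501/",
                  timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Poll `url` until it returns HTTP 200 or `timeout_s` elapses.

    Hypothetical helper for illustration; ClimSight does not ship it.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Server not reachable yet (connection refused, reset, etc.).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_until_up(timeout_s=120)` right after starting the container returns `True` once the page is reachable and `False` if the timeout is hit first.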
The PyPI package provides a stable release (v1.0.0) of ClimSight. For the latest features, please install from source as described above.
pip install climsight

ClimSight will automatically use a config.yml file from the current directory. You can modify this file to customize settings:
# Key settings you can modify in config.yml:
# - LLM model (gpt-4, gpt-5, ...)
# - Climate data sources
# - RAG database configuration
# - Agent parameters
# - ERA5 data retrieval settings

ClimSight requires an OpenAI API key for LLM functionality. You can set it as an environment variable:
export OPENAI_API_KEY="your-api-key-here"

Alternatively, you can enter your API key directly in the browser interface when prompted.
If you want to use ERA5 time series data retrieval (enabled via the "Enable ERA5 data" toggle in the UI), you need an Arraylake API key from Earthmover. This allows downloading ERA5 reanalysis data for detailed historical climate analysis.
export ARRAYLAKE_API_KEY="your-arraylake-api-key-here"

You can also enter the Arraylake API key in the browser interface when the ERA5 data option is enabled.
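To see at a glance which keys are already exported, a small pre-flight check can help. This is a sketch under stated assumptions: the helper `missing_api_keys` is hypothetical and not part of ClimSight; only the two variable names come from the instructions above.

```python
import os
from typing import List, Mapping


def missing_api_keys(env: Mapping[str, str], era5_enabled: bool = False) -> List[str]:
    """Return the names of required API key variables that are unset or blank.

    Hypothetical helper for illustration; ClimSight does not ship it.
    """
    required = ["OPENAI_API_KEY"]
    if era5_enabled:
        # Only needed when ERA5 time series retrieval is switched on.
        required.append("ARRAYLAKE_API_KEY")
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_api_keys(os.environ, era5_enabled=True)
    if missing:
        print("Not set (enter in the browser interface instead): " + ", ".join(missing))
    else:
        print("All API keys found in the environment.")
```

A missing key is not fatal here, since both keys can still be entered in the browser interface.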
# Run from the repository root
streamlit run src/climsight/climsight.py

The application will open in your browser automatically. Just type your climate-related questions and press "Generate" to get insights.
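Because the command above only works from the repository root, a launcher script can check the path first. The sketch below is an assumption-labeled illustration: `streamlit_command` is a hypothetical helper, and it relies on Streamlit being invocable as a module with `python -m streamlit run`.

```python
import sys
from pathlib import Path
from typing import List

# Script path taken from the run command above, relative to the repository root.
APP_SCRIPT = Path("src") / "climsight" / "climsight.py"


def streamlit_command(repo_root: Path) -> List[str]:
    """Build the launch command, failing early if the app script is missing.

    Hypothetical helper for illustration; ClimSight does not ship it.
    """
    script = repo_root / APP_SCRIPT
    if not script.is_file():
        raise FileNotFoundError(f"{script} not found; run this from the repository root")
    # Use the current interpreter so the active environment's Streamlit is picked up.
    return [sys.executable, "-m", "streamlit", "run", str(script)]
```

To launch, pass the result to `subprocess.run(streamlit_command(Path.cwd()), check=True)`.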
For batch processing of climate questions, the sequential directory contains specialized tools for generating, validating, and processing questions in bulk. These tools are particularly useful for research and analysis requiring multiple climate queries. See the sequential/README.md for detailed usage instructions.
If you use or refer to ClimSight in your work, please cite:
Kuznetsov, I., Jost, A.A., Pantiukhin, D. et al. Transforming climate services with LLMs and multi-source data integration. npj Clim. Action 4, 97 (2025). https://doi.org/10.1038/s44168-025-00300-y
Koldunov, N., Jung, T. Local climate services for all, courtesy of large language models. Commun Earth Environ 5, 13 (2024). https://doi.org/10.1038/s43247-023-01199-1
