Bug: string.Template.substitute() crashes on $ in config values like regex file_pattern #2349

@Lewis-404

Description

The config loader in graphrag-common uses string.Template.substitute() to process environment variables in settings.yaml. However, substitute() treats every $ character as the start of a placeholder, including a $ that is part of a regex pattern or any other string value. When the $ is not followed by a valid identifier, this causes a hard crash with an unhelpful ValueError.

Reproduction

Create a settings.yaml with:

input:
  type: markitdown
  file_pattern: ".*\\.md$"

Run graphrag index --root <dir>.

Error

ValueError: Invalid placeholder in string: line 50, col 25

This happens because $ at the end of .*\.md$ (a valid regex anchor) is treated by string.Template as a template placeholder, and a bare $ without a valid identifier causes substitute() to raise ValueError.
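The failure can be reproduced with the standard library alone, without running graphrag. This is a minimal sketch: `yaml_text` is an illustrative stand-in for the settings.yaml contents above, `try_substitute` is a hypothetical helper, and an empty mapping is used in place of os.environ so the result does not depend on the machine.

```python
from string import Template

# Illustrative stand-in for the settings.yaml from the reproduction steps.
# A raw string keeps the two backslashes exactly as they appear in the file.
yaml_text = r'''input:
  type: markitdown
  file_pattern: ".*\\.md$"
'''


def try_substitute(text: str, env: dict[str, str]) -> str:
    """Return the substituted text, or a description of the failure."""
    try:
        return Template(text).substitute(env)
    except ValueError as error:
        # Raised for the trailing "$", which is not followed by an identifier.
        return f"ValueError: {error}"


result = try_substitute(yaml_text, {})
print(result)
```

The reported line and column point at the $ character inside the whole config text, which is why the message gives no hint about which setting is responsible.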

Root Cause

File: graphrag-common/graphrag_common/config/load_config.py, the _parse_env_variables function:

def _parse_env_variables(text: str) -> str:
    """Parse environment variables in the configuration text."""
    try:
        return Template(text).substitute(os.environ)
    except KeyError as error:
        msg = f"Environment variable not found: {error}"
        raise ConfigParsingError(msg) from error

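substitute() raises ValueError for any invalid placeholder, in addition to KeyError for missing env vars, and the function above only catches KeyError. So a bare $ in a regex pattern, a $ followed by a character that cannot start an identifier (for example $1), or an unclosed ${ all crash the config loader.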

This affects:

  • file_pattern with regex ending in $ (very common, e.g. .*\.md$)
  • Any other config value that happens to contain $
  • prompt paths containing $ characters
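To make the two failure modes concrete, the sketch below sorts a few sample values by the exception that substitute() raises for them. The `classify` helper, the sample strings, and the `PRESENT` variable are illustrative and not part of graphrag; a small dict stands in for os.environ.

```python
from string import Template


def classify(text: str, env: dict[str, str]) -> str:
    """Report which exception, if any, substitute() raises for text."""
    try:
        Template(text).substitute(env)
    except KeyError:
        # Well-formed placeholder, but the name is not in the mapping.
        return "KeyError"
    except ValueError:
        # "$" that does not start a valid placeholder.
        return "ValueError"
    return "ok"


env = {"PRESENT": "value"}  # stand-in for os.environ
samples = [
    r".*\.md$",      # regex anchor: bare "$" at the end
    "cost: $5",      # "$" followed by a digit
    "${unclosed",    # opening brace without a closing one
    "$MISSING_VAR",  # valid placeholder, variable not set
    "price: $$5",    # "$$" is the escape for a literal "$"
    "$PRESENT",      # valid placeholder, variable set
]
outcomes = {sample: classify(sample, env) for sample in samples}

for sample, outcome in outcomes.items():
    print(f"{sample!r}: {outcome}")
```

Only the `$MISSING_VAR` case reaches the existing `except KeyError` branch; the first three samples escape it as ValueError.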

Proposed Fix

Replace substitute() with safe_substitute(), which silently leaves unrecognized placeholders as literal text instead of raising an error:

def _parse_env_variables(text: str) -> str:
    """Parse environment variables in the configuration text."""
    return Template(text).safe_substitute(os.environ)

safe_substitute() still handles the $$ escape sequence, but it does not raise an error on a bare $ or on missing keys; it leaves them unchanged as literal text. This is the expected behavior: a $ in a regex pattern should stay as $, not crash the config loader.
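The behavior described above can be checked on a small config-like string. This is a sketch under assumptions: the keys, the `API_KEY` variable, and its value are made up for illustration, and a dict stands in for os.environ.

```python
from string import Template

env = {"API_KEY": "secret"}  # stand-in for os.environ

text = "\n".join([
    "api_key: ${API_KEY}",          # set variable: substituted
    r'file_pattern: ".*\\.md$"',    # bare "$": left as-is
    "missing: $NOT_SET",            # unset variable: left as-is
    "escaped: $$5",                 # "$$" escape: collapses to "$"
])

result = Template(text).safe_substitute(env)
print(result)
```

Two consequences may be worth weighing when adopting this fix. A placeholder for an unset variable stays in the text as a literal such as `$NOT_SET` rather than producing an error, and `$$` is still collapsed to a single `$`, so a value that genuinely needs two consecutive dollar signs would have to be written as `$$$$`.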

A KeyError or ValueError from substitute() is the wrong place to validate config values anyway — Pydantic schema validation in the config models already handles that properly.

Environment

  • graphrag version: 3.0.9
  • Python: 3.12
