
Handle max tokens properly #20

Merged
jacsamell merged 1 commit into main from cursorrules
Aug 11, 2025
Conversation

@jacsamell
Owner

@jacsamell jacsamell commented Aug 11, 2025

PR Type

Enhancement, Configuration changes


Description

• Implement dynamic token budgeting for AI model calls
• Add auto-selection of default models based on API keys
• Enhance handling of large PR diffs with chunking
• Update dependencies including LiteLLM and Pydantic versions


Changes walkthrough 📝

Relevant files

Enhancement (5 files)
• litellm_ai_handler.py (+45/-4): Dynamic token budgeting and extended thinking improvements
• token_handler.py (+9/-1): Include repository rules in token accounting
• config_loader.py (+20/-0): Auto-select default models based on available API keys
• utils.py (+25/-0): Add token budgeting for cursor rules content
• pr_reviewer.py (+22/-7): Enhanced large PR handling with chunked processing

Configuration changes (1 file)
• configuration.toml (+10/-2): Update token limits and cursor rules configuration

Dependencies (1 file)
• requirements.txt (+4/-2): Bump LiteLLM, Pydantic and add httpx dependency

  • @github-actions
    Contributor

    github-actions Bot commented Aug 11, 2025

    PR Reviewer Guide 🔍

    (Review updated until commit 295b080)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
     AI Review Score: 75%
    🧪 No relevant tests
     Confidence score [1-100]: 78
     Complexity score [1-10]: 7
     Security score [1-10]: 8
     Auto approve recommendation: false
     Auto approve reasoning: This PR involves complex AI model integration changes with dynamic token budgeting, chunked processing for large diffs, and auto-model selection. The complexity and lack of visible tests, combined with potential error handling issues, require human review to ensure correctness and stability.
     Requires human approval: Complex Logic
    💡 Code suggestions

    Method Extraction

    The complex token budgeting logic for cursor rules should be extracted into a separate method to improve readability and testability.

    Existing code:

    # Optionally clip rules to a max token budget to avoid exceeding context
    try:
        from pr_agent.algo.token_handler import TokenHandler
        from pr_agent.algo.utils import get_max_tokens
        model = get_settings().config.model
        token_handler = TokenHandler()
        rules_tokens = token_handler.count_tokens(rules_content)
    
        max_rules_tokens = int(get_settings().config.get('max_cursor_rules_tokens', 20000))
        hard_cap_ratio = float(get_settings().config.get('cursor_rules_context_ratio', 0.25))
        model_ctx = get_max_tokens(model)
        hard_cap_tokens = max(2000, int(model_ctx * hard_cap_ratio))
        allowed_tokens = min(max_rules_tokens, hard_cap_tokens)
    
        if rules_tokens > allowed_tokens:
            from pr_agent.algo.utils import clip_tokens
            clipped = clip_tokens(rules_content, allowed_tokens, add_three_dots=True)
            get_logger().warning(
                f"Cursor rules too large ({rules_tokens} tokens). Clipped to {allowed_tokens} tokens for prompting."
            )
            rules_content = clipped
    except Exception as e:
        get_logger().debug(f"Failed to apply cursor rules token budget: {e}")
    

    Improved code:

    # Call site:
    rules_content = self._apply_cursor_rules_token_budget(rules_content)
    
    # Extracted helper, defined as a method on the same class:
    def _apply_cursor_rules_token_budget(self, rules_content: str) -> str:
        """Apply token budgeting to cursor rules content."""
        try:
            from pr_agent.algo.token_handler import TokenHandler
            from pr_agent.algo.utils import get_max_tokens
            model = get_settings().config.model
            token_handler = TokenHandler()
            rules_tokens = token_handler.count_tokens(rules_content)
    
            max_rules_tokens = int(get_settings().config.get('max_cursor_rules_tokens', 20000))
            hard_cap_ratio = float(get_settings().config.get('cursor_rules_context_ratio', 0.25))
            model_ctx = get_max_tokens(model)
            hard_cap_tokens = max(2000, int(model_ctx * hard_cap_ratio))
            allowed_tokens = min(max_rules_tokens, hard_cap_tokens)
    
            if rules_tokens > allowed_tokens:
                from pr_agent.algo.utils import clip_tokens
                clipped = clip_tokens(rules_content, allowed_tokens, add_three_dots=True)
                get_logger().warning(
                    f"Cursor rules too large ({rules_tokens} tokens). Clipped to {allowed_tokens} tokens for prompting."
                )
                return clipped
            return rules_content
        except Exception as e:
            get_logger().debug(f"Failed to apply cursor rules token budget: {e}")
            return rules_content
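
To make the budget arithmetic in the snippet above concrete, here is a minimal standalone sketch. The function name `allowed_rules_tokens` and the `floor_tokens` parameter are hypothetical and only used for illustration; the default values (20000, 0.25, 2000) are the ones that appear in the code under review.

```python
def allowed_rules_tokens(model_ctx: int,
                         max_rules_tokens: int = 20000,
                         hard_cap_ratio: float = 0.25,
                         floor_tokens: int = 2000) -> int:
    """Return the cursor-rules token budget, mirroring the arithmetic in the PR snippet."""
    # Ratio-based cap, never lower than the fixed floor.
    hard_cap_tokens = max(floor_tokens, int(model_ctx * hard_cap_ratio))
    # The absolute cap and the ratio-based cap both apply; the smaller one wins.
    return min(max_rules_tokens, hard_cap_tokens)


# 200,000-token context: the ratio cap (50,000) exceeds the absolute cap.
print(allowed_rules_tokens(200_000))  # 20000
# 32,000-token context: the ratio cap (8,000) is the binding limit.
print(allowed_rules_tokens(32_000))   # 8000
# 4,000-token context: the 2,000-token floor applies.
print(allowed_rules_tokens(4_000))    # 2000
```

Keeping the arithmetic in a pure function like this would also make it easy to unit test, which matters given the "No relevant tests" note. The last case shows that on a small context the floor lets the rules take half of the window, which may be worth a second look.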
    
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Silent Failures

    The auto-model selection logic wraps the whole block in a broad except Exception: pass, so any failure is swallowed without a log entry. This could hide configuration errors that should at least be logged.

    try:
        # Only set defaults if not explicitly set by repo/global settings
        configured_model = get_settings().get('config.model', '').strip()
        if not configured_model:
            openai_key = get_settings().get('openai.key') or os.environ.get('OPENAI_API_KEY') or os.environ.get('OPENAI__KEY')
            anthropic_key = get_settings().get('anthropic.key') or os.environ.get('ANTHROPIC_API_KEY') or os.environ.get('ANTHROPIC__KEY')
    
            if openai_key:
                # Prefer GPT-5 if OpenAI key is present
                get_settings().set('config.model', 'gpt-5')
                get_settings().set('config.fallback_models', ['anthropic/claude-sonnet-4-20250514'] if anthropic_key else [])
            elif anthropic_key:
                # Fall back to Claude Sonnet 4 if only Anthropic is present
                get_settings().set('config.model', 'anthropic/claude-sonnet-4-20250514')
            # else: leave as-is, user must set
    except Exception:
        # Silent: don’t block initialization if auto-detect fails
        pass
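
One possible way to address this, sketched under assumptions: move the selection into a pure function and log a warning when it fails, while still not blocking initialization. The names `pick_default_models` and `apply_default_models` are hypothetical, and a plain dict stands in for the real settings object; only the model identifiers and environment variable names come from the snippet above.

```python
import logging
from typing import List, Mapping, Optional, Tuple

logger = logging.getLogger(__name__)

OPENAI_DEFAULT = "gpt-5"
ANTHROPIC_DEFAULT = "anthropic/claude-sonnet-4-20250514"


def pick_default_models(
    configured_model: str,
    openai_key: Optional[str],
    anthropic_key: Optional[str],
) -> Optional[Tuple[str, Optional[List[str]]]]:
    """Return (model, fallback_models), or None to leave settings untouched.

    A fallback value of None means "do not change the fallback list".
    """
    if configured_model.strip():
        return None  # an explicitly configured model always wins
    if openai_key:
        fallbacks = [ANTHROPIC_DEFAULT] if anthropic_key else []
        return OPENAI_DEFAULT, fallbacks
    if anthropic_key:
        return ANTHROPIC_DEFAULT, None
    return None  # no keys found: the user must set a model


def apply_default_models(settings: dict, env: Mapping[str, str]) -> None:
    """Apply defaults to a plain dict that stands in for the settings object."""
    try:
        choice = pick_default_models(
            settings.get("config.model", ""),
            settings.get("openai.key") or env.get("OPENAI_API_KEY"),
            settings.get("anthropic.key") or env.get("ANTHROPIC_API_KEY"),
        )
    except Exception:
        # Still non-blocking, but the failure is now visible in the logs.
        logger.warning("Model auto-selection failed; keeping existing settings",
                       exc_info=True)
        return
    if choice is None:
        return
    model, fallbacks = choice
    settings["config.model"] = model
    if fallbacks is not None:
        settings["config.fallback_models"] = fallbacks
```

With this shape the decision table can be tested without touching environment variables, and a malformed setting (for example a non-string model value) produces a warning with a traceback instead of disappearing.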

    Error Suppression

    The dynamic max_tokens calculation catches all exceptions and only logs at debug level, which could hide important errors that affect token budgeting functionality.

    except Exception as e:
        get_logger().debug(f"Unable to set dynamic max_tokens: {e}")
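
The PR's actual max_tokens calculation is not shown in this review, so the following is only a sketch of the logging pattern the comment asks for. The function `dynamic_max_tokens`, its parameters, and the budget formula (context window minus prompt tokens minus a safety margin) are all assumptions made for illustration.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def dynamic_max_tokens(context_window: int, prompt_tokens: int,
                       requested: int, safety_margin: int = 256) -> Optional[int]:
    """Hypothetical budget: cap the completion at what the context window leaves free.

    Returns None when no budget can be computed, so the caller can fall back
    to its static default.
    """
    try:
        available = context_window - prompt_tokens - safety_margin
        if available <= 0:
            raise ValueError(
                f"prompt ({prompt_tokens} tokens) leaves no room in a "
                f"{context_window}-token context"
            )
        return min(requested, available)
    except (TypeError, ValueError) as e:
        # Warning rather than debug: a failed budget changes model behaviour,
        # so it should show up at default log levels.
        logger.warning("Unable to set dynamic max_tokens: %s", e)
        return None


print(dynamic_max_tokens(8000, 6000, 4000))  # 1744
print(dynamic_max_tokens(8000, 1000, 2000))  # 2000
print(dynamic_max_tokens(8000, 9000, 2000))  # None (and a warning is logged)
```

Catching only the exception types the calculation can plausibly raise keeps unrelated bugs from being absorbed by the same handler.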

    Context Loss

    The chunked processing concatenates diff chunks with simple separators which may lose important context between code sections, potentially affecting review quality.

    self.patches_diff = "\n\n---\n\n".join(chunks)
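
A small mitigation, sketched here as an assumption rather than as what the PR does: label each chunk before joining, so the model can at least tell where one chunk ends and the next begins. The helper `join_chunks` and the label format are hypothetical; the separator is the one used in the line above.

```python
from typing import List


def join_chunks(chunks: List[str]) -> str:
    """Join diff chunks with the PR's separator, prefixing each with a position label."""
    total = len(chunks)
    labelled = [
        # Strip only surrounding newlines; leading spaces are meaningful in diffs.
        f"### Diff chunk {i} of {total}\n" + chunk.strip("\n")
        for i, chunk in enumerate(chunks, start=1)
    ]
    return "\n\n---\n\n".join(labelled)


print(join_chunks(["+added line\n", " context line\n-removed line"]))
```

This does not restore context that was lost when the diff was split; it only makes the boundaries explicit. Carrying the file path or hunk header into each chunk would be a stronger fix, but that depends on how the chunks are produced, which this review does not show.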

    🔍 Manual review required

    Reason: Complex Logic

    @jacsamell jacsamell merged commit e6f687d into main Aug 11, 2025
    2 checks passed