refactor: align python code analysis contracts #398
**Summary of Changes** (Gemini Code Assist)

This pull request refactors the code analysis pipeline to establish clear, stable data contracts between the agents and the workflow coordinator. The changes ensure that analysis results are consistently structured, metadata is accurately mapped, and the quality assessment logic is more robust. This improves the reliability and predictability of the code analysis process and makes it easier to integrate and extend in the future.
Code Review
This pull request is a significant and well-executed refactoring that aligns the data contracts across the code analysis components. The introduction of stable schemas for analysis results, especially in failure cases, greatly improves the robustness of the pipeline. The expanded analysis capabilities in `CodeAnalyzer` and the improved logic in `QualityAgent` are excellent additions. The new contract tests provide good coverage for these critical changes.

I have one suggestion regarding a potential circular dependency in the fallback logic for calculating the maintainability score in `QualityAgent`, which could lead to incorrect scoring.
```python
overall = float(quality_metrics.get('overall_score', 0.0) or 0.0)
documentation = float(quality_metrics.get('documentation_score', 0.0) or 0.0)
complexity = float(quality_metrics.get('complexity_score', 0.0) or 0.0)
has_readme = 1.0 if quality_metrics.get('has_readme') else 0.0
score = (overall * 0.5) + (documentation * 0.2) + (complexity * 0.2) + (has_readme * 0.1)
return min(1.0, max(0.0, score))
```
The fallback calculation for the maintainability score appears to have a circular dependency. It uses `overall_score` from `quality_analysis` to compute a `maintainability_score`. However, the `overall_score` from the `CodeAnalyzer` is a composite score that already includes a `maintainability_score` component. This circular logic can lead to incorrect and unpredictable scoring.

The fallback should be based on more fundamental metrics. A better approach would be to use other pre-calculated scores like `complexity_score` and `documentation_score`, without using `overall_score`.
Suggested change:

```diff
-overall = float(quality_metrics.get('overall_score', 0.0) or 0.0)
-documentation = float(quality_metrics.get('documentation_score', 0.0) or 0.0)
-complexity = float(quality_metrics.get('complexity_score', 0.0) or 0.0)
-has_readme = 1.0 if quality_metrics.get('has_readme') else 0.0
-score = (overall * 0.5) + (documentation * 0.2) + (complexity * 0.2) + (has_readme * 0.1)
-return min(1.0, max(0.0, score))
+complexity_score = float(quality_metrics.get('complexity_score', 0.0) or 0.0)
+documentation_score = float(quality_metrics.get('documentation_score', 0.0) or 0.0)
+has_readme = 1.0 if quality_metrics.get('has_readme') else 0.0
+# Re-weighted score based on available metrics, avoiding circular dependency on overall_score.
+score = (complexity_score * 0.6) + (documentation_score * 0.2) + (has_readme * 0.2)
+return min(1.0, max(0.0, score))
```
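As a sanity check, the suggested fallback can be sketched as a standalone function. This is a hypothetical illustration, not the repo's actual API: the function name and the `quality_metrics` key names are assumptions taken from the snippet above.

```python
def fallback_maintainability_score(quality_metrics: dict) -> float:
    """Derive a maintainability score from fundamental metrics only,
    deliberately ignoring the composite overall_score to avoid the
    circular dependency described above. Weights are illustrative."""
    complexity_score = float(quality_metrics.get('complexity_score', 0.0) or 0.0)
    documentation_score = float(quality_metrics.get('documentation_score', 0.0) or 0.0)
    has_readme = 1.0 if quality_metrics.get('has_readme') else 0.0
    score = (complexity_score * 0.6) + (documentation_score * 0.2) + (has_readme * 0.2)
    # Clamp to [0, 1] so malformed inputs cannot push the score out of range.
    return min(1.0, max(0.0, score))
```

Note the clamp: because the inputs come from an external payload, a bad value like `complexity_score = 2.0` is capped at 1.0 rather than propagating downstream.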
Pull request overview
This PR tightens the “code analysis → workflow scoring → influence” data contract by standardizing fields like `reproducibility_score`, `updated_at`, and `last_commit_date`, and adding unit tests to enforce these expectations across the analyzer, agents, and coordinator.
Changes:
- Update `ScholarWorkflowCoordinator` to prefer `reproducibility_score` for the code-stage score and pass a richer `CodeMeta` into influence calculation.
- Expand `CodeAnalyzer` output contracts (structure/security/quality/dependencies) and improve fallback defaults for error cases.
- Refactor `QualityAgent` input handling to support both nested analysis results and flat workflow `code_analysis_result`, with new contract tests.
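The flat/nested compatibility in the last bullet can be sketched as a tiny normalizer plus the kind of contract test the PR adds. The helper name and key names here are assumptions for illustration, not the repo's actual functions:

```python
def normalize_code_analysis(payload: dict) -> dict:
    """Accept either a flat analysis payload or one nested under
    'code_analysis_result' (the workflow shape) and return the flat form."""
    return payload.get('code_analysis_result', payload)


def test_flat_and_nested_inputs_normalize_identically():
    # Both input shapes should yield the same metrics dict downstream.
    flat = {'reproducibility_score': 0.8, 'health_score': 0.6}
    nested = {'code_analysis_result': flat}
    assert normalize_code_analysis(flat) == normalize_code_analysis(nested)
```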
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_workflow_coordinator.py | Adds regression tests ensuring code-stage uses reproducibility score and influence receives aligned CodeMeta. |
| tests/unit/test_code_analyzer_contracts.py | New contract test for CodeAnalyzer outputs across structure/security/quality/dependencies. |
| tests/unit/test_code_analysis_contracts.py | New contract tests for CodeAnalysisAgent flattening/placeholder behavior and QualityAgent input compatibility. |
| tests/test_code_analysis_fallback.py | Removes outdated fallback tests (superseded by new unit contract tests). |
| src/paperbot/utils/analyzer.py | Adds stable “empty” contracts, primary language detection, richer documentation/dependency parsing, and dependency security reporting. |
| src/paperbot/core/workflow_coordinator.py | Uses reproducibility_score for code-stage scoring when available; passes updated_at/last_commit_date/reproducibility_score into CodeMeta. |
| src/paperbot/agents/quality/agent.py | Refactors process() to normalize inputs and support flat/nested code analysis contracts. |
| src/paperbot/agents/code_analysis/agent.py | Preserves placeholder results in single-repo mode and aligns flattened metadata fields (updated_at, last_commit_date). |
Comments suppressed due to low confidence (1)
src/paperbot/utils/analyzer.py:26
`radon` / `ComplexityVisitor` are imported (and assigned `None` on `ImportError`) but no longer referenced anywhere in this module. Removing these unused imports/variables will reduce confusion and avoid future lint failures if strict linting is enabled.
```python
try:
    import radon.complexity as radon
    from radon.visitors import ComplexityVisitor
except ImportError:
    radon = None
    ComplexityVisitor = None
```
```diff
 # Compute the overall quality score
 overall_score = 0.0
 recommendations = []

 # Complexity score (0-1)
 total_complexity = complexity.get('overall_complexity', 0)
-if total_complexity > 0:
-    complexity_score = max(0, 1 - (total_complexity / 100))
-else:
-    complexity_score = 1.0
+file_count = max(1, len(complexity.get('file_complexity', {})))
+average_complexity = total_complexity / file_count
+complexity_score = max(0.0, 1.0 - (average_complexity / 20.0))
```
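The revised scoring in the diff above normalizes by per-file average complexity rather than the repo-wide total, so large repos are no longer penalized simply for having many files. A minimal sketch, with the function name assumed (the repo computes this inline):

```python
def score_complexity(complexity: dict) -> float:
    """Map average per-file cyclomatic complexity to a 0-1 score.

    The /20.0 divisor follows the diff above: an average complexity of
    20 or more per file scores 0.0; an empty report scores 1.0."""
    total_complexity = complexity.get('overall_complexity', 0)
    file_count = max(1, len(complexity.get('file_complexity', {})))
    average_complexity = total_complexity / file_count
    return max(0.0, 1.0 - (average_complexity / 20.0))
```

For example, two files totalling complexity 40 average out to 20 per file and score 0.0, whereas the old formula would have scored the same repo 0.6 at 4 files and 0.0 at 1 file for identical code quality.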
Pull request overview
This PR improves the “code analysis → workflow scoring → influence” contract by standardizing flattened code-analysis fields (reproducibility score + repo metadata) and adding tests to lock the behavior down across agents and the workflow coordinator.
Changes:
- Update workflow coordinator to publish the code-stage score from `reproducibility_score` (fallback to `health_score`) and pass an aligned `CodeMeta` into influence calculations.
- Expand `CodeAnalyzer` outputs (stable empty contracts, primary language detection, dependency parsing, richer documentation metrics, and dependency security reporting).
- Add/replace unit tests covering analyzer/agent contracts and workflow integration; remove the legacy root-import test.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/test_workflow_coordinator.py | Adds regression tests ensuring code-stage score uses reproducibility and influence receives aligned CodeMeta. |
| tests/unit/test_code_analyzer_contracts.py | New contract tests for CodeAnalyzer structure/security/quality/dependency outputs and requirements parsing. |
| tests/unit/test_code_analysis_contracts.py | New contract tests for CodeAnalysisAgent flattening/placeholder behavior and QualityAgent input compatibility. |
| tests/test_code_analysis_fallback.py | Removes legacy root-import tests. |
| src/paperbot/utils/analyzer.py | Enhances analyzer output stability, dependency parsing, doc metrics, and dependency security scanning behavior. |
| src/paperbot/core/workflow_coordinator.py | Uses reproducibility_score for code-stage scoring and builds richer CodeMeta for influence stage. |
| src/paperbot/agents/quality/agent.py | Refactors input normalization; supports flat code-analysis results and computes overall quality score aggregation. |
| src/paperbot/agents/code_analysis/agent.py | Preserves placeholders in single-repo mode and aligns flattened metadata fields (updated_at, last_commit_date). |
```diff
 # Compute the overall quality score
 overall_score = 0.0
 recommendations = []

 # Complexity score (0-1)
 total_complexity = complexity.get('overall_complexity', 0)
-if total_complexity > 0:
-    complexity_score = max(0, 1 - (total_complexity / 100))
-else:
-    complexity_score = 1.0
+file_count = max(1, len(complexity.get('file_complexity', {})))
+average_complexity = total_complexity / file_count
+complexity_score = max(0.0, 1.0 - (average_complexity / 20.0))
 overall_score += complexity_score * self.quality_weights['complexity']
```
```python
report['vulnerable_dependencies'] = vulnerabilities
report['total_vulnerabilities'] = len(vulnerabilities)
report['status'] = 'issues_found' if vulnerabilities else 'clean'
return report
```
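Wrapped as a standalone helper, the reporting step in the snippet above might look like the following. Only the three reported keys come from the diff; the function name, the `report` initialization, and the `scanned` flag are assumptions for illustration:

```python
def build_security_report(vulnerabilities: list) -> dict:
    """Assemble the dependency-security section of the analysis contract.

    Always returns the same keys so downstream consumers never see a
    partially-populated dict, even when the scan found nothing."""
    report = {'scanned': True}  # assumed initialization; the real scan logic precedes this
    report['vulnerable_dependencies'] = vulnerabilities
    report['total_vulnerabilities'] = len(vulnerabilities)
    report['status'] = 'issues_found' if vulnerabilities else 'clean'
    return report
```

A caller can then branch on `report['status']` without checking for missing keys, which is exactly the "stable schema" property the PR is after.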
Summary
- Rework `CodeAnalyzer` helpers so structure, dependency, security, and documentation analysis return stable schemas instead of collapsing to empty dicts
- Align `CodeAnalysisAgent`, `QualityAgent`, and `ScholarWorkflowCoordinator` around the actual analysis payloads used in the repo
- Propagate `updated_at`/`last_commit_date`/`reproducibility_score` into `CodeMeta`, and remove a stale root-level fallback test that imported a dead module

Validation
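The first bullet's "stable schemas instead of collapsing to empty dicts" pattern can be illustrated with a hypothetical empty-contract factory; the key names are assumptions based on the quality fields discussed in the reviews above, not the repo's actual schema:

```python
def empty_quality_contract() -> dict:
    """Return a fully keyed quality-analysis result for failure paths.

    Consumers can always read every field with a sane default, rather
    than guarding against a bare {} when analysis fails."""
    return {
        'overall_score': 0.0,
        'complexity_score': 0.0,
        'documentation_score': 0.0,
        'has_readme': False,
        'recommendations': [],
    }
```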
```shell
python -m pytest -q tests/unit/test_code_analyzer_contracts.py tests/unit/test_code_analysis_contracts.py tests/unit/test_workflow_coordinator.py
```