refactor: prune orphaned utility variants #399
Caution: Review failed. Pull request was closed or merged during review.
Walkthrough
This change removes multiple redundant and backup utility modules for paper downloading, conference parsing, and keyword optimization. A new test file enforces that removed utilities are not imported elsewhere and validates naming conventions for remaining utility modules.
Estimated code review effort: 3 (Moderate) | ~35 minutes
Pull request overview
This PR cleans up src/paperbot/utils by removing unused/orphaned utility modules and adding a small contract test to prevent backup-style or invalid utility filenames from creeping back into the repo.
Changes:
- Deleted multiple orphaned/variant utility modules under `src/paperbot/utils`.
- Updated `paperbot.utils.__init__` docstring to describe the remaining public surface.
- Added a unit “cleanup contract” test to assert removed files stay removed and to flag invalid utility filenames.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 2 comments.
Summary per file:
| File | Description |
|---|---|
| tests/unit/test_utils_cleanup_contracts.py | Adds contract tests to ensure removed utility variants stay deleted and to prevent invalid utility filenames. |
| src/paperbot/utils/__init__.py | Updates module docstring describing the utils surface. |
| src/paperbot/utils/smart_downloader.py | Removes an orphaned smart download manager utility. |
| src/paperbot/utils/keyword_optimizer.py | Removes an unused keyword/query optimization utility. |
| src/paperbot/utils/experiment_runner.py | Removes an unused experiment runner helper. |
| src/paperbot/utils/experiment_metrics.py | Removes an unused lightweight metrics helper. |
| src/paperbot/utils/downloader_ccs.py | Removes a CCS-specific downloader variant. |
| src/paperbot/utils/downloader_back.py | Removes a backup-style downloader variant. |
| src/paperbot/utils/conference_parsers_new.py | Removes an unused “new” conference parser variant. |
| src/paperbot/utils/conference_parsers.py | Removes an unused conference parser module. |
| src/paperbot/utils/conference_helpers.py | Removes an unused conference helper module. |
| src/paperbot/utils/conference_downloader.py | Removes an unused conference downloader base class. |
| src/paperbot/utils/acm_extractor.py | Removes an unused ACM extractor helper. |
| src/paperbot/utils/CCS-DOWN.py | Removes an invalidly-named utility variant file. |
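
The "stay removed" half of the contract test might look roughly like the sketch below. `REMOVED_UTILS` and `resurrected_files` are hypothetical names, not taken from the PR, and only a few of the filenames from the table above are listed. The demo runs against a temporary directory rather than the real `src/paperbot/utils`.

```python
from pathlib import Path
import tempfile

# Hypothetical subset of the filenames listed in the table above.
REMOVED_UTILS = (
    "smart_downloader.py",
    "keyword_optimizer.py",
    "downloader_back.py",
    "conference_parsers_new.py",
    "CCS-DOWN.py",
)


def resurrected_files(utils_dir: Path, removed=REMOVED_UTILS) -> list[str]:
    """Return removed filenames that exist again under utils_dir."""
    return sorted(name for name in removed if (utils_dir / name).exists())


# Demo against a throwaway directory, not the real repository.
with tempfile.TemporaryDirectory() as tmp:
    utils_dir = Path(tmp)
    (utils_dir / "logger.py").write_text("")
    clean = resurrected_files(utils_dir)  # nothing removed has come back
    (utils_dir / "downloader_back.py").write_text("")
    dirty = resurrected_files(utils_dir)  # a removed variant reappeared
```

In a real test, the assertion would presumably be that the returned list is empty for the repository's utils directory.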
    name = path.name
    if " " in name or name != name.lower():
        invalid.append(name)
    if name.endswith("_back.py") or name.endswith("_new.py"):
        invalid.append(name)
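
As written, a file such as `Foo_back.py` would match both conditions and be appended to `invalid` twice. Below is a minimal sketch of a variant that records each name once. `invalid_util_names` and `BACKUP_SUFFIXES` are hypothetical names, and how the PR's test iterates over the directory is an assumption.

```python
from typing import Iterable

# Suffixes checked in the snippet above.
BACKUP_SUFFIXES = ("_back.py", "_new.py")


def invalid_util_names(names: Iterable[str]) -> list[str]:
    """Return each name that breaks the naming rules, at most once per input."""
    invalid = []
    for name in names:
        has_space_or_upper = " " in name or name != name.lower()
        # str.endswith accepts a tuple of suffixes.
        is_backup_variant = name.endswith(BACKUP_SUFFIXES)
        if has_space_or_upper or is_backup_variant:
            invalid.append(name)
    return invalid


result = invalid_util_names(
    ["logger.py", "CCS-DOWN.py", "downloader - ccs.py", "Foo_back.py", "retry_helper.py"]
)
```

Combining the two checks with `or` keeps the failure message free of duplicates without changing which files are flagged.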
Exposes the general-purpose utilities still used on the main code path:
- logger: logging configuration
- downloader: paper downloader
- retry_helper: retry mechanism
- json_parser: JSON parsing
- text_processing: text processing
Code Review
This pull request is a substantial refactoring that removes a large number of orphaned and backup utility files, which is a great improvement for maintainability. The addition of a contract test to prevent similar issues in the future is also an excellent practice. However, I've identified a critical issue where the deletion of downloader variants has broken the main downloader.py by removing a method it depends on. Please see the specific comment for details.
    async def _parse_ccs_papers(self, base_url: str, year: str) -> List[Dict[str, Any]]:
The deletion of this method (and its variants in other removed files) breaks the functionality of src/paperbot/utils/downloader.py.
The get_conference_papers method in downloader.py still contains a call to self._parse_ccs_papers for the 'ccs' conference type. This will now raise an AttributeError at runtime.
To resolve this, please consider one of the following:
- Merge the implementation of `_parse_ccs_papers` from one of the deleted files into the `PaperDownloader` class in `downloader.py`.
- Remove the logic for handling the 'ccs' conference from `get_conference_papers` if it is no longer supported.
This is a critical issue as it breaks existing functionality.
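
To illustrate the second option, here is a small sketch. `DownloaderSketch`, its `_parsers` table, and the `demo` conference are hypothetical stand-ins, not the real `PaperDownloader`. The point is only that dispatching through an explicit table turns a missing parser into a clear error instead of an `AttributeError`.

```python
import asyncio
from typing import Any, Dict, List


class DownloaderSketch:
    """Hypothetical stand-in for PaperDownloader; not the project's real class."""

    def __init__(self) -> None:
        # Only conferences with a surviving parser are registered here.
        self._parsers = {"demo": self._parse_demo_papers}

    async def _parse_demo_papers(self, base_url: str, year: str) -> List[Dict[str, Any]]:
        # Placeholder parser; a real one would fetch and parse pages.
        return [{"title": "example", "year": year, "source": base_url}]

    async def get_conference_papers(
        self, conference: str, base_url: str, year: str
    ) -> List[Dict[str, Any]]:
        parser = self._parsers.get(conference)
        if parser is None:
            # Explicit failure instead of AttributeError on a missing method.
            raise ValueError(f"unsupported conference: {conference!r}")
        return await parser(base_url, year)


async def _demo() -> tuple:
    downloader = DownloaderSketch()
    papers = await downloader.get_conference_papers("demo", "https://example.org", "2024")
    try:
        await downloader.get_conference_papers("ccs", "https://example.org", "2024")
    except ValueError as exc:
        return papers, str(exc)
    return papers, None


papers, error = asyncio.run(_demo())
```

If 'ccs' support is meant to stay, the first option applies instead: the parser would be registered in the table alongside the others.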
Summary
- Remove modules under `src/paperbot/utils` that have no surviving import chain in the repo
- Remove variant files (`CCS-DOWN.py`, `downloader - ccs.py`, `*_back.py`, `*_new.py`) and several conference / experiment helpers that were only referenced by those dead variants or not referenced at all
- Update `paperbot.utils.__init__` docs to reflect the utilities that are still part of the live surface

Validation
- `python -m pytest -q tests/unit/test_utils_cleanup_contracts.py`
- `python -m pytest -q tests/unit/test_retry_helper_async.py tests/unit/test_api_security_middleware.py`
- `rg` check confirms no remaining imports reference the removed utility modules

Notes
- `tests/test_conference_agent_stats.py` is already stale on `origin/dev` and still imports a dead top-level `agents.*` module; it was not changed in this PR.
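
The import check from the Validation step could also be approximated in Python with the standard `ast` module, as in this rough sketch. `REMOVED_MODULES` and `imports_of_removed` are hypothetical names, the module set is only a subset, and `setup_logger` in the sample is a made-up identifier. Relative imports are not handled.

```python
import ast

# Hypothetical subset of the removed module names.
REMOVED_MODULES = {"smart_downloader", "keyword_optimizer", "acm_extractor"}


def imports_of_removed(source: str, removed=REMOVED_MODULES) -> set[str]:
    """Return removed module names imported from paperbot.utils in source."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. import paperbot.utils.smart_downloader
            for alias in node.names:
                parts = alias.name.split(".")
                if parts[:2] == ["paperbot", "utils"] and len(parts) > 2 and parts[2] in removed:
                    hits.add(parts[2])
        elif isinstance(node, ast.ImportFrom) and node.module:
            parts = node.module.split(".")
            if parts[:2] == ["paperbot", "utils"]:
                if len(parts) > 2 and parts[2] in removed:
                    # e.g. from paperbot.utils.smart_downloader import X
                    hits.add(parts[2])
                elif len(parts) == 2:
                    # e.g. from paperbot.utils import smart_downloader
                    hits.update(a.name for a in node.names if a.name in removed)
    return hits


# Sample source text; setup_logger is an invented name for illustration.
sample = (
    "from paperbot.utils.logger import setup_logger\n"
    "from paperbot.utils import keyword_optimizer\n"
    "import paperbot.utils.smart_downloader as sd\n"
)
found = imports_of_removed(sample)
```

Unlike a text search, this only matches real import statements, so mentions in comments or strings are ignored.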