Mileva#14

Merged
shubhobm merged 7 commits into main from mileva on Dec 7, 2025

Conversation

@shubhobm shubhobm commented Dec 7, 2025

Ingest AI CVEs from Mileva digests.

Includes changes to datamodels

Copilot AI left a comment

Pull request overview

This PR adds functionality to ingest CVE (Common Vulnerabilities and Exposures) data from Mileva AI research digests. The implementation scrapes CVE identifiers from Mileva.ai fortnightly digest pages, fetches detailed CVE information from the MITRE CVE API, and transforms the data into AVID Report objects.

Key changes:

  • New scripts/mileva.py script that scrapes Mileva.ai digests and creates AVID Reports from CVE data
  • Extended datamodels with CVSSScores and CWETaxonomy classes to support CVE vulnerability scoring
  • Updated Impact class to include CVSS and CWE taxonomy fields
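
As a rough illustration of the scraping step summarized above, the following sketch pulls unique CVE identifiers out of page text with a regular expression. The helper name `extract_cve_ids` and the sample string are hypothetical and are not taken from scripts/mileva.py, which may parse the digest HTML differently.

```python
import re

# CVE IDs have the form "CVE-<4-digit year>-<4 or more digits>".
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def extract_cve_ids(page_text: str) -> list[str]:
    """Return unique CVE IDs in order of first appearance, upper-cased."""
    seen: dict[str, None] = {}  # dict keeps insertion order, so it doubles as an ordered set
    for match in CVE_PATTERN.finditer(page_text):
        seen.setdefault(match.group(0).upper(), None)
    return list(seen)


# Made-up sample text, not real digest content.
sample = "Fixes for CVE-2024-12345 and cve-2025-0001; see CVE-2024-12345 again."
cve_ids = extract_cve_ids(sample)
print(cve_ids)
```

Deduplicating at extraction time keeps the later API step from requesting the same record twice.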

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 23 comments.

| File | Description |
| --- | --- |
| scripts/mileva.py | New script that scrapes CVE IDs from Mileva.ai digests, fetches CVE details from MITRE API, and converts them to AVID Report objects with async processing |
| avidtools/datamodels/components.py | Adds CVSSScores and CWETaxonomy classes for CVE metadata, extends Impact class to include CVSS and CWE fields |
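
The "async processing" mentioned for scripts/mileva.py could look roughly like the sketch below, which caps the number of concurrent lookups with a semaphore. This is only an assumption about the general pattern: `fetch_cve` is a hypothetical stand-in that performs no network I/O, and `max_concurrent` is an illustrative parameter, not one from the PR.

```python
import asyncio


async def fetch_cve(cve_id: str) -> dict:
    """Hypothetical stand-in for an HTTP request; returns a fake record."""
    await asyncio.sleep(0)  # yield to the event loop, as a real request would
    return {"cveId": cve_id}


async def fetch_all(cve_ids: list[str], max_concurrent: int = 5) -> list[dict]:
    """Fetch all records, running at most max_concurrent lookups at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(cve_id: str) -> dict:
        async with semaphore:
            return await fetch_cve(cve_id)

    # gather() returns results in the same order as the input IDs.
    return await asyncio.gather(*(bounded(cve_id) for cve_id in cve_ids))


records = asyncio.run(fetch_all(["CVE-2024-0001", "CVE-2024-0002"]))
print(records)
```

Bounding concurrency is a common courtesy toward public APIs; whether the merged script does so is not stated in this conversation.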

Comment on lines 2 to 15
Script to scrape CVE information from Milev.ai and NVD.

This script structures CVE data into AVID Vulnerability objects.

This script:
1. Scrapes unique CVE IDs from Milev.ai research digest pages
2. Fetches detailed CVE information from NVD
3. Structures the data into AVID Vulnerability objects
4. Saves all vulnerabilities to a JSONL file

Dependencies:
- beautifulsoup4: For HTML parsing
- requests: For HTTP requests
- nvdlib: For fetching CVE data from NVD (already in dependencies)

Copilot AI Dec 7, 2025

The documentation says "Fetches detailed CVE information from NVD" but the script actually fetches from the MITRE CVE API (cveawg.mitre.org), not NVD. This should be corrected to say "MITRE CVE API" or "CVE.org" instead of "NVD".

Suggested change
- Script to scrape CVE information from Milev.ai and NVD.
- This script structures CVE data into AVID Vulnerability objects.
- This script:
- 1. Scrapes unique CVE IDs from Milev.ai research digest pages
- 2. Fetches detailed CVE information from NVD
- 3. Structures the data into AVID Vulnerability objects
- 4. Saves all vulnerabilities to a JSONL file
- Dependencies:
- - beautifulsoup4: For HTML parsing
- - requests: For HTTP requests
- - nvdlib: For fetching CVE data from NVD (already in dependencies)
+ Script to scrape CVE information from Milev.ai and the MITRE CVE API.
+ This script structures CVE data into AVID Vulnerability objects.
+ This script:
+ 1. Scrapes unique CVE IDs from Milev.ai research digest pages
+ 2. Fetches detailed CVE information from the MITRE CVE API (cveawg.mitre.org)
+ 3. Structures the data into AVID Vulnerability objects
+ 4. Saves all vulnerabilities to a JSONL file
+ Dependencies:
+ - beautifulsoup4: For HTML parsing
+ - requests: For HTTP requests
+ - aiohttp: For async HTTP requests to the MITRE CVE API
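
To make the MITRE CVE API step more concrete, here is a minimal sketch that builds a record URL and picks a few fields out of a CVE JSON 5.x style record. The URL path, the helper names, and the sample record are assumptions for illustration (the record is made up, not real CVE data), and the actual script may read different fields.

```python
from typing import Any


def cve_record_url(cve_id: str) -> str:
    """Assumed endpoint layout on cveawg.mitre.org; verify against the API docs."""
    return f"https://cveawg.mitre.org/api/cve/{cve_id}"


def summarize_cve_record(record: dict[str, Any]) -> dict[str, Any]:
    """Extract ID, English description, first CVSS base score, and CWE IDs."""
    cna = record.get("containers", {}).get("cna", {})
    description = next(
        (d.get("value") for d in cna.get("descriptions", [])
         if d.get("lang", "").startswith("en")),
        None,
    )
    base_score = None
    for metric in cna.get("metrics", []):
        # Each metric entry holds one CVSS object keyed by its version.
        for key in ("cvssV4_0", "cvssV3_1", "cvssV3_0"):
            if key in metric:
                base_score = metric[key].get("baseScore")
                break
        if base_score is not None:
            break
    cwe_ids = [
        d["cweId"]
        for problem in cna.get("problemTypes", [])
        for d in problem.get("descriptions", [])
        if "cweId" in d
    ]
    return {
        "cve_id": record.get("cveMetadata", {}).get("cveId"),
        "description": description,
        "base_score": base_score,
        "cwe_ids": cwe_ids,
    }


# Made-up record shaped like CVE JSON 5.x, for illustration only.
sample_record = {
    "cveMetadata": {"cveId": "CVE-2099-0001"},
    "containers": {"cna": {
        "descriptions": [{"lang": "en", "value": "Example flaw."}],
        "metrics": [{"cvssV3_1": {"baseScore": 7.5, "baseSeverity": "HIGH"}}],
        "problemTypes": [{"descriptions": [{"type": "CWE", "cweId": "CWE-79"}]}],
    }},
}
summary = summarize_cve_record(sample_record)
print(cve_record_url(summary["cve_id"]), summary)
```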

Copilot uses AI. Check for mistakes.
scope: Optional[str] = None
confidentialityImpact: Optional[str] = None
integrityImpact: Optional[str] = None
availabilityImpact: Optional[str] = None

Copilot AI Dec 7, 2025

The field name availabilityImpact uses camelCase, which is inconsistent with Python naming conventions. All other fields in this codebase use snake_case. Consider using availability_impact and mapping it to the correct JSON field name using Pydantic's Field(alias="availabilityImpact") if needed.
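
A minimal sketch of the suggested aliasing, assuming Pydantic v2 (`ConfigDict`, `model_validate`, `model_dump`). Only the fields quoted in this comment are shown; the real CVSSScores class has more, and the `populate_by_name` setting is an illustrative choice rather than something the PR specifies.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class CVSSScores(BaseModel):
    # Accept both the snake_case attribute name and the camelCase alias as input.
    model_config = ConfigDict(populate_by_name=True)

    scope: Optional[str] = None
    confidentiality_impact: Optional[str] = Field(default=None, alias="confidentialityImpact")
    integrity_impact: Optional[str] = Field(default=None, alias="integrityImpact")
    availability_impact: Optional[str] = Field(default=None, alias="availabilityImpact")


# Input uses the camelCase keys found in CVE JSON.
scores = CVSSScores.model_validate({"scope": "UNCHANGED", "availabilityImpact": "HIGH"})
print(scores.availability_impact)

# by_alias=True restores the camelCase keys on output.
dumped = scores.model_dump(by_alias=True)
print(dumped)
```

With this layout the Python attributes follow snake_case while the serialized JSON keeps the camelCase keys used upstream.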

class CWETaxonomy(BaseModel):
    """CWE (Common Weakness Enumeration) taxonomy mapping."""

    cweId: str

Copilot AI Dec 7, 2025

The field name cweId uses camelCase, which is inconsistent with Python naming conventions. All other fields in this codebase use snake_case. Consider using cwe_id and mapping it to the correct JSON field name using Pydantic's Field(alias="cweId") if needed.

userInteraction: Optional[str] = None
scope: Optional[str] = None
confidentialityImpact: Optional[str] = None
integrityImpact: Optional[str] = None

Copilot AI Dec 7, 2025

The field name integrityImpact uses camelCase, which is inconsistent with Python naming conventions. All other fields in this codebase use snake_case. Consider using integrity_impact and mapping it to the correct JSON field name using Pydantic's Field(alias="integrityImpact") if needed.
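
Since the same camelCase remark applies to several fields, one alternative to per-field aliases is a shared alias generator. The sketch below assumes Pydantic v2 and its `pydantic.alias_generators.to_camel` helper; the `CamelModel` base class is hypothetical, and only fields quoted in this review are included.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class CamelModel(BaseModel):
    """Hypothetical base class: snake_case attributes, camelCase aliases."""

    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)


class CWETaxonomy(CamelModel):
    cwe_id: str  # alias generated as "cweId"


class CVSSScores(CamelModel):
    user_interaction: Optional[str] = None
    scope: Optional[str] = None
    confidentiality_impact: Optional[str] = None
    integrity_impact: Optional[str] = None


cwe = CWETaxonomy.model_validate({"cweId": "CWE-79"})
print(cwe.cwe_id, cwe.model_dump(by_alias=True))

cvss = CVSSScores(integrity_impact="HIGH")
print(cvss.model_dump(by_alias=True, exclude_none=True))
```

This avoids repeating `Field(alias=...)` on every attribute, at the cost of making the JSON key names implicit.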

1. Scrapes unique CVE IDs from Milev.ai research digest pages
2. Fetches detailed CVE information from NVD
3. Structures the data into AVID Vulnerability objects
4. Saves all vulnerabilities to a JSONL file

Copilot AI Dec 7, 2025

This documentation line should say "Saves all reports to a JSONL file" instead of "vulnerabilities" to match the actual implementation.

Suggested change
- 4. Saves all vulnerabilities to a JSONL file
+ 4. Saves all reports to a JSONL file
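
For reference, saving reports to a JSONL file generally means writing one JSON object per line. The sketch below uses plain dictionaries and a hypothetical `write_jsonl` helper; the actual script presumably serializes AVID Report objects, which this conversation does not show.

```python
import json
from pathlib import Path
from typing import Iterable


def write_jsonl(reports: Iterable[dict], path: str) -> int:
    """Write one JSON object per line and return the number of lines written."""
    count = 0
    with Path(path).open("w", encoding="utf-8") as handle:
        for report in reports:
            handle.write(json.dumps(report, ensure_ascii=False) + "\n")
            count += 1
    return count


def read_jsonl(path: str) -> list[dict]:
    """Read the file back, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

A usage example would be `write_jsonl([{"cve_id": "CVE-2099-0001"}], "reports.jsonl")`, where both the record and the file name are placeholders.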

shubhobm and others added 2 commits December 7, 2025 12:24
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@shubhobm shubhobm merged commit a9c0958 into main Dec 7, 2025
1 check passed
@shubhobm shubhobm deleted the mileva branch December 7, 2025 06:55