Mileva by shubhobm · Pull Request #14 · avidml/avidtools

shubhobm · 2025-12-07T06:42:00Z

Ingest AI CVEs from Mileva digests.

Includes changes to datamodels.

Copilot

Pull request overview

This PR adds functionality to ingest CVE (Common Vulnerabilities and Exposures) data from Mileva AI research digests. The implementation scrapes CVE identifiers from Mileva.ai fortnightly digest pages, fetches detailed CVE information from the MITRE CVE API, and transforms the data into AVID Report objects.

Key changes:

- New scripts/mileva.py script that scrapes Mileva.ai digests and creates AVID Reports from CVE data
- Extended datamodels with CVSSScores and CWETaxonomy classes to support CVE vulnerability scoring
- Updated Impact class to include CVSS and CWE taxonomy fields
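
The scraping step described in the overview can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/mileva.py: the function name `extract_cve_ids` and the sample HTML are hypothetical, and it assumes CVE IDs appear in the page text in the standard `CVE-YYYY-NNNN` form.

```python
import re

# CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)


def extract_cve_ids(html: str) -> list[str]:
    """Return the unique CVE IDs found in the text, upper-cased, in first-seen order."""
    seen: dict[str, None] = {}  # a dict preserves insertion order and drops duplicates
    for match in CVE_PATTERN.finditer(html):
        seen.setdefault(match.group(0).upper(), None)
    return list(seen)


# Hypothetical digest fragment, used only for illustration.
sample = "<p>CVE-2025-12345 and cve-2025-12345 again, plus CVE-2024-0001.</p>"
print(extract_cve_ids(sample))
```

Running the regex over raw page text avoids depending on the digest's HTML layout, at the cost of also matching IDs that appear in unrelated parts of the page.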

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 23 comments.

File	Description
scripts/mileva.py	New script that scrapes CVE IDs from Mileva.ai digests, fetches CVE details from MITRE API, and converts them to AVID Report objects with async processing
avidtools/datamodels/components.py	Adds CVSSScores and CWETaxonomy classes for CVE metadata, extends Impact class to include CVSS and CWE fields

Copilot · 2025-12-07T06:50:39Z

scripts/mileva.py

+Script to scrape CVE information from Milev.ai and NVD.
+
+This script structures CVE data into AVID Vulnerability objects.
+
+This script:
+1. Scrapes unique CVE IDs from Milev.ai research digest pages
+2. Fetches detailed CVE information from NVD
+3. Structures the data into AVID Vulnerability objects
+4. Saves all vulnerabilities to a JSONL file
+
+Dependencies:
+    - beautifulsoup4: For HTML parsing
+    - requests: For HTTP requests
+    - nvdlib: For fetching CVE data from NVD (already in dependencies)


The documentation says "Fetches detailed CVE information from NVD" but the script actually fetches from the MITRE CVE API (cveawg.mitre.org), not NVD. This should be corrected to say "MITRE CVE API" or "CVE.org" instead of "NVD".

Suggested change

-Script to scrape CVE information from Milev.ai and NVD.
-
-This script structures CVE data into AVID Vulnerability objects.
-
-This script:
-1. Scrapes unique CVE IDs from Milev.ai research digest pages
-2. Fetches detailed CVE information from NVD
-3. Structures the data into AVID Vulnerability objects
-4. Saves all vulnerabilities to a JSONL file
-
-Dependencies:
-    - beautifulsoup4: For HTML parsing
-    - requests: For HTTP requests
-    - nvdlib: For fetching CVE data from NVD (already in dependencies)
+Script to scrape CVE information from Milev.ai and the MITRE CVE API.
+
+This script structures CVE data into AVID Vulnerability objects.
+
+This script:
+1. Scrapes unique CVE IDs from Milev.ai research digest pages
+2. Fetches detailed CVE information from the MITRE CVE API (cveawg.mitre.org)
+3. Structures the data into AVID Vulnerability objects
+4. Saves all vulnerabilities to a JSONL file
+
+Dependencies:
+    - beautifulsoup4: For HTML parsing
+    - requests: For HTTP requests
+    - aiohttp: For async HTTP requests to the MITRE CVE API
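
To make the NVD versus MITRE distinction concrete, here is a small sketch of how a record URL for the MITRE CVE API might be built. The host comes from the review comment; the `/api/cve/<id>` path is an assumption based on the public CVE Services API, and `cve_record_url` is a hypothetical helper, not a function from the PR. No network request is made.

```python
import re

# Assumed base URL: the host named in the review comment plus the /api/cve path.
# Check it against scripts/mileva.py before relying on it.
CVE_API_BASE = "https://cveawg.mitre.org/api/cve"

_CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")


def cve_record_url(cve_id: str) -> str:
    """Build the record URL for one CVE ID, rejecting malformed IDs."""
    normalized = cve_id.strip().upper()
    if not _CVE_ID.match(normalized):
        raise ValueError(f"not a CVE ID: {cve_id!r}")
    return f"{CVE_API_BASE}/{normalized}"


print(cve_record_url("cve-2025-12345"))
```

Validating the ID before building the URL keeps scraped noise from turning into requests against the API.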

Copilot · 2025-12-07T06:50:40Z

avidtools/datamodels/components.py

+    scope: Optional[str] = None
+    confidentialityImpact: Optional[str] = None
+    integrityImpact: Optional[str] = None
+    availabilityImpact: Optional[str] = None


The field name availabilityImpact uses camelCase, which is inconsistent with Python naming conventions. All other fields in this codebase use snake_case. Consider using availability_impact and mapping it to the correct JSON field name using Pydantic's Field(alias="availabilityImpact") if needed.
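
The alias approach suggested here could look roughly like the sketch below. It assumes Pydantic v2 (`ConfigDict`, `model_dump`); the class name `CVSSScoresSketch` is hypothetical and only two of the fields from the diff are shown.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field


class CVSSScoresSketch(BaseModel):
    """Illustrative stand-in for CVSSScores with snake_case attribute names."""

    # Allow construction by either the Python name or the JSON alias.
    model_config = ConfigDict(populate_by_name=True)

    scope: Optional[str] = None
    availability_impact: Optional[str] = Field(
        default=None, alias="availabilityImpact"
    )


# Input that uses the camelCase JSON key is accepted through the alias...
scores = CVSSScoresSketch(availabilityImpact="HIGH")
print(scores.availability_impact)

# ...and serialization can emit the camelCase key again.
print(scores.model_dump(by_alias=True))
```

With this pattern the Python attribute follows snake_case while the serialized JSON keeps the camelCase key used by CVSS data, provided `by_alias=True` is passed when dumping.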

Copilot · 2025-12-07T06:50:40Z

avidtools/datamodels/components.py

+class CWETaxonomy(BaseModel):
+    """CWE (Common Weakness Enumeration) taxonomy mapping."""
+
+    cweId: str


The field name cweId uses camelCase, which is inconsistent with Python naming conventions. All other fields in this codebase use snake_case. Consider using cwe_id and mapping it to the correct JSON field name using Pydantic's Field(alias="cweId") if needed.

Copilot · 2025-12-07T06:50:44Z

avidtools/datamodels/components.py

+    userInteraction: Optional[str] = None
+    scope: Optional[str] = None
+    confidentialityImpact: Optional[str] = None
+    integrityImpact: Optional[str] = None


The field name integrityImpact uses camelCase, which is inconsistent with Python naming conventions. All other fields in this codebase use snake_case. Consider using integrity_impact and mapping it to the correct JSON field name using Pydantic's Field(alias="integrityImpact") if needed.

Copilot · 2025-12-07T06:50:45Z

scripts/mileva.py

+1. Scrapes unique CVE IDs from Milev.ai research digest pages
+2. Fetches detailed CVE information from NVD
+3. Structures the data into AVID Vulnerability objects
+4. Saves all vulnerabilities to a JSONL file


This documentation line should say "Saves all reports to a JSONL file" instead of "vulnerabilities" to match the actual implementation.

Suggested change

-4. Saves all vulnerabilities to a JSONL file
+4. Saves all reports to a JSONL file
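
For reference, the JSONL format mentioned in this step means one JSON object per line. A minimal sketch, assuming each report has already been converted to a plain dict (the `write_jsonl` helper and the sample records are hypothetical, not taken from scripts/mileva.py):

```python
import json
import tempfile
from pathlib import Path


def write_jsonl(records: list[dict], path: Path) -> int:
    """Write one JSON object per line and return the number of lines written."""
    with path.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)


# Plain dicts stand in for serialized AVID Report objects.
reports = [{"id": "CVE-2025-12345"}, {"id": "CVE-2024-0001"}]

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "reports.jsonl"
    written = write_jsonl(reports, out)
    lines = out.read_text(encoding="utf-8").splitlines()
    print(written, len(lines))
```

Because each line is an independent JSON document, a consumer can stream the file line by line without loading it whole.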

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

shubhobm added 5 commits December 2, 2025 14:31

wip

4576ec4

async

9fd22a3

//

87da8eb

only reports

7e3ae51

//

ecab70f

shubhobm requested a review from Copilot December 7, 2025 06:44

Copilot started reviewing on behalf of shubhobm December 7, 2025 06:44

Copilot AI reviewed Dec 7, 2025

shubhobm and others added 2 commits December 7, 2025 12:24

Update scripts/mileva.py

8b631c7

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update scripts/mileva.py

b80b2a7

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

shubhobm merged commit a9c0958 into main Dec 7, 2025
1 check passed

shubhobm deleted the mileva branch December 7, 2025 06:55

shubhobm merged 7 commits into main from mileva
