Refactor data collection into lifetime and sprint workflows#133
hcaballero2 merged 4 commits into main
Conversation
…in permissions Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Great work overall @dhyana6466! I really love how you handled the different workflows! Would you be able to update your collectData.py script so that it uses the other tools in the dataCollection folder? You did a great job of making the functions to collect the necessary data, but we have scripts that already do that. If edits need to be made to these files, please feel free!
Thanks! That makes sense. I've updated collectData.py so it now uses the existing scripts inside the dataCollection folder instead of handling the aggregation directly. I moved the repo-level collection into a separate module (repositoryCollector.py), and now collectData.py just handles the workflow mode and writing to the JSON files. Let me know if you'd like me to adjust anything else!
Great job!
Description
This PR refactors the existing data collection setup into two separate GitHub Action workflows: one for lifetime data and one for sprint-specific data.
The lifetime workflow runs twice a month and collects the full repository history to generate long-term health metrics. The sprint workflow runs on a schedule and only collects data within the active sprint window.
To make sprint handling easier for future updates, I added a sprint_schedule.json file where sprint start and end dates are hardcoded. This keeps sprint configuration separate from the main logic and makes it easier to update each semester.
The script collectData.py now accepts a --mode argument so that each workflow calls the correct logic (lifetime or sprint).
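The mode dispatch could be sketched roughly as follows; note this is an illustrative sketch, not the actual implementation, and the function names (`parse_mode`, `run`) are assumptions:

```python
import argparse

def parse_mode(argv=None):
    """Return the requested workflow mode ('lifetime' or 'sprint').

    Hypothetical sketch of the --mode handling in collectData.py.
    """
    parser = argparse.ArgumentParser(description="Collect repository data")
    parser.add_argument("--mode", choices=["lifetime", "sprint"], required=True)
    return parser.parse_args(argv).mode

def run(mode):
    # Dispatch to the matching collector and return the output file name
    # (file names taken from the PR description; collector logic omitted).
    if mode == "lifetime":
        return "lifetime_data.json"
    return "sprint_data.json"
```

Using `choices=["lifetime", "sprint"]` lets argparse reject any other mode with a usage error, so each workflow can only invoke one of the two supported paths.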
Both workflows are configured to push updates to the Data_Updates branch instead of main.
Fixes #131
Type of change
How Has This Been Tested?
I tested both modes locally.
Lifetime mode:
python -m Backend.dataCollection.collectData --mode lifetime
This iterated through all issues in the repositories and successfully generated lifetime_data.json in the data/ folder.
Sprint mode:
python -m Backend.dataCollection.collectData --mode sprint
This reads from sprint_schedule.json, checks if today falls within a sprint window, and generates sprint_data.json when appropriate.
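A minimal sketch of that sprint-window check, assuming a schedule schema with a `"sprints"` list of `"start"`/`"end"` ISO dates (the real keys in sprint_schedule.json may differ):

```python
import json
from datetime import date

def load_schedule(path):
    """Load the sprint schedule JSON file (path is caller-supplied)."""
    with open(path) as f:
        return json.load(f)

def active_sprint(schedule, today=None):
    """Return the sprint entry whose window contains today, or None."""
    today = today or date.today()
    for sprint in schedule["sprints"]:
        start = date.fromisoformat(sprint["start"])
        end = date.fromisoformat(sprint["end"])
        if start <= today <= end:
            return sprint
    return None
```

With this shape, the sprint workflow can exit early (and skip writing sprint_data.json) whenever `active_sprint` returns `None`, which matches the "generates sprint_data.json when appropriate" behavior described above.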
Both modes completed without errors and produced the expected JSON structure.
Test Configuration:
Checklist:
Screenshot of Output