
Ap 2566 Smart Scheduling and Rollback Feature #163

Open
naomiwise wants to merge 21 commits into main from AP-2566-smart-scheduler

Conversation

@naomiwise

@naomiwise naomiwise commented Apr 20, 2026

Context

SmartScheduling Feature: Genetic Algorithm-Based Job Load Distribution

Summary

Implements SmartScheduling, a genetic algorithm-based optimization module that automatically distributes scheduled jobs across a 24-hour period to minimize resource conflicts and peak CPU load. This is the core feature for Cicada's ability to optimize job scheduling across distributed nodes.

The branch includes:

  • Complete GA optimization pipeline with PyGAD integration
  • Database schema and backup/rollback mechanism for optimization snapshots
  • CLI commands for running and managing smart scheduling
  • Documentation and Testing

Problem

On a distributed job scheduler, the jobs on a server tend to cluster at the same start times (e.g., on the hour or at :30), causing resource spikes. Cicada needed a way to automatically shift job start times so that load is spread evenly across the day.

Solution

SmartScheduling uses a Genetic Algorithm to find near-optimal shift values for each job:

  • Represents each job as a "tap" with frequency, runtime, and CPU requirements
  • Evolves shift assignments over generations to minimize peak CPU load
  • Supports job frequency ranges: 1-60 minutes, hourly, and daily
  • Automatically excludes blacklisted, irregular, and system schedules from optimization
  • Creates checkpoints for safe rollback if optimization degrades performance
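
To make the evolution step concrete, here is a minimal sketch of a genetic algorithm that searches for per-job shifts minimizing peak CPU load. It uses only the standard library rather than PyGAD, and the job tuples, `peak_cpu`, `evolve`, and all hyperparameter values are illustrative assumptions, not code or settings from this branch.

```python
import random

MINUTES_PER_DAY = 24 * 60

# Hypothetical jobs as (frequency_min, runtime_min, cpu) tuples.
JOBS = [(30, 5, 2.0), (30, 5, 3.0), (60, 10, 4.0), (15, 2, 1.0)]


def peak_cpu(shifts):
    """Peak total CPU over one day when job i starts shifts[i] minutes late."""
    load = [0.0] * MINUTES_PER_DAY
    for (freq, runtime, cpu), shift in zip(JOBS, shifts):
        for start in range(shift, MINUTES_PER_DAY, freq):
            # Runs that would cross midnight are clipped (a simplification).
            for minute in range(start, min(start + runtime, MINUTES_PER_DAY)):
                load[minute] += cpu
    return max(load)


def random_shifts(rng):
    # A shift only matters modulo the job's frequency.
    return [rng.randrange(freq) for freq, _, _ in JOBS]


def evolve(generations=40, pop_size=20, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    # Include the unshifted schedule so the result can never be worse than it.
    population = [[0] * len(JOBS)] + [random_shifts(rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=peak_cpu)          # lower peak = fitter
        parents = population[: pop_size // 2]  # truncation selection, kept as elites
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(JOBS))  # single-point crossover
            child = a[:cut] + b[cut:]
            for i, (freq, _, _) in enumerate(JOBS):
                if rng.random() < mutation_rate:  # per-gene mutation
                    child[i] = rng.randrange(freq)
            children.append(child)
        population = parents + children
    return min(population, key=peak_cpu)


baseline = [0] * len(JOBS)
best = evolve()
print("unshifted peak:", peak_cpu(baseline))
print("evolved shifts:", best, "peak:", peak_cpu(best))
```

Because the fitter half of each generation is carried over unchanged, the best peak found never gets worse from one generation to the next.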

Key Features

Architecture (cicada/lib/SmartScheduling/)

  • domain.py: Tap dataclass representing a schedulable job with cron parsing
  • config.py: GAConfig hyperparameters for the genetic algorithm
  • pygad.py: PyGAD wrapper with fitness function (evaluates resource contention)
  • evaluation.py: CPU load calculation and peak detection
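
The load calculation can be pictured with a difference array: add a job's CPU at each start minute, subtract it at each end minute, then take a running sum. The sketch below assumes a simplified `Tap` with three fields; the field names, the function names, and the clipping at midnight are assumptions for illustration and may differ from `domain.py` and `evaluation.py`.

```python
from dataclasses import dataclass
from itertools import accumulate

MINUTES_PER_DAY = 24 * 60


@dataclass(frozen=True)
class Tap:
    """Hypothetical stand-in for the PR's Tap: one periodically scheduled job."""
    frequency_min: int  # minutes between consecutive starts
    runtime_min: int    # minutes each run occupies the CPU
    cpu: float          # CPU units consumed while running


def cpu_load_per_minute(taps, shifts):
    """Return the total CPU load for every minute of a 24-hour day."""
    diff = [0.0] * (MINUTES_PER_DAY + 1)
    for tap, shift in zip(taps, shifts):
        first = shift % tap.frequency_min
        for start in range(first, MINUTES_PER_DAY, tap.frequency_min):
            end = min(start + tap.runtime_min, MINUTES_PER_DAY)  # clip at midnight
            diff[start] += tap.cpu  # load rises when a run starts
            diff[end] -= tap.cpu    # and falls when it ends
    # The running sum of the differences is the load at each minute.
    return list(accumulate(diff))[:MINUTES_PER_DAY]


def peak_load(taps, shifts):
    return max(cpu_load_per_minute(taps, shifts))


taps = [Tap(30, 5, 2.0), Tap(30, 5, 3.0)]
print(peak_load(taps, [0, 0]))   # both jobs start at :00 and :30 -> 5.0
print(peak_load(taps, [0, 10]))  # second job shifted by 10 minutes -> 3.0
```

Each run costs two array updates regardless of its runtime, which keeps the fitness evaluation cheap when it is called once per candidate solution.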

Commands

  • smart_schedule.py: Main orchestrator; runs GA optimization and updates DB
  • rollback.py: Reverts optimization using checkpoint history
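
As a rough illustration of what applying a shift to a schedule could involve, the sketch below rewrites the minute field of a `*/N` cron expression using the standard range-with-step syntax. `shift_cron_minutes` is a hypothetical helper, and how `smart_schedule.py` actually encodes shifts is not shown in this description.

```python
def shift_cron_minutes(cron: str, shift: int) -> str:
    """Shift a '*/N * * * *' style expression by `shift` minutes.

    Only a minute field of the form '*/N', with N dividing 60, is handled;
    any other expression is returned unchanged.
    """
    fields = cron.split()
    if len(fields) != 5 or not fields[0].startswith("*/"):
        return cron
    step_text = fields[0][2:]
    if not step_text.isdigit():
        return cron
    step = int(step_text)
    if step == 0 or 60 % step != 0:
        return cron
    offset = shift % step  # shifting by a full period changes nothing
    if offset == 0:
        return cron
    fields[0] = f"{offset}-59/{step}"  # e.g. 3-59/15 fires at :03, :18, :33, :48
    return " ".join(fields)


print(shift_cron_minutes("*/15 * * * *", 3))   # 3-59/15 * * * *
print(shift_cron_minutes("*/15 * * * *", 15))  # */15 * * * *
print(shift_cron_minutes("0 2 * * *", 5))      # 0 2 * * * (not handled here)
```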

Database Changes

  • schedule_backups: Stores optimization snapshots with checkpoints for rollback
  • Tracks original, previous, and current cron expressions per schedule
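
The three tracked expressions suggest a simple state model, sketched below as an in-memory class. The field and method names are assumptions chosen for illustration; the real column names and rollback logic live in `setup/schema.sql` and `cicada/lib/scheduler.py`.

```python
from dataclasses import dataclass


@dataclass
class ScheduleBackup:
    """Hypothetical in-memory analogue of one schedule_backups row."""
    schedule_id: str
    original_cron: str  # cron before any optimization ran
    previous_cron: str  # cron before the most recent optimization
    current_cron: str   # cron currently in effect

    def checkpoint(self, new_cron: str) -> None:
        """Record an optimization result, keeping the old value for rollback."""
        self.previous_cron = self.current_cron
        self.current_cron = new_cron

    def rollback(self) -> None:
        """Undo the most recent optimization."""
        self.current_cron = self.previous_cron

    def restore_original(self) -> None:
        """Discard every optimization and return to the baseline."""
        self.previous_cron = self.original_cron
        self.current_cron = self.original_cron


backup = ScheduleBackup("job-1", "0 * * * *", "0 * * * *", "0 * * * *")
backup.checkpoint("7 * * * *")
backup.checkpoint("12 * * * *")
backup.rollback()
print(backup.current_cron)  # 7 * * * *
backup.restore_original()
print(backup.current_cron)  # 0 * * * *
```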

Documentation

  • docs/Smart Scheduler Technical Overview.md - Algorithm details, hyperparameters, tuning guide
  • CLAUDE.md updated with SmartScheduling development guidance

Checklist

Copilot AI review requested due to automatic review settings April 20, 2026 14:55
@naomiwise naomiwise requested a review from a team as a code owner April 20, 2026 14:55
@platon-github-app-production

Comment /request-review to automatically request reviews from the following teams:

You can also request review from a specific team by commenting /request-review team-name, or you can add a description with --notes "<message>"

💡 If you see something that doesn't look right, check the configuration guide.

@naomiwise naomiwise changed the title from "Ap 2566 Smart Scheduling" to "Ap 2566 Smart Scheduling and Rollback Feature" Apr 20, 2026

Copilot AI left a comment


Pull request overview

Changes: New feature (1), Database change (1), Documentation update (1), Maintenance (1)

This PR introduces a “SmartScheduling” capability to Cicada, using a Genetic Algorithm to shift cron start times to reduce load spikes, along with database-backed checkpointing/rollback support and local-dev seed data.

Changes:

  • Add GA-based smart scheduling + rollback CLI commands and SmartScheduling module implementation.
  • Extend DB schema with schedule_backups + schedule_blacklist (and snapshot trigger) to support checkpointing/rollback.
  • Add docs (technical overview + diagrams) and local-dev SQL seeding; add new Python deps (numpy, pygad).

Reviewed changes

Copilot reviewed 15 out of 18 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| setup/schema.sql | Adds snapshot trigger/function and new tables/indexes for backups + blacklist. |
| setup/create_test_tap_setup.sql | Provides local/dev seed data for servers/schedules + blacklist examples. |
| setup.py | Adds dependencies needed by the GA implementation. |
| local-dev/entrypoint.sh | Loads the new seed SQL into local dev DB on startup. |
| docs/offspring-ga.png | Diagram for GA offspring/crossover. |
| docs/genetic-algorithm-process-cycle.png | Diagram for GA process cycle. |
| docs/Smart Scheduler Technical Overview.md | Technical design/architecture documentation for SmartScheduling. |
| cicada/lib/scheduler.py | Adds DB helpers for backups/blacklist + rollback/restore operations. |
| cicada/lib/SmartScheduling/config.py | Defines GAConfig defaults. |
| cicada/lib/SmartScheduling/domain.py | Introduces Tap domain object + cron-derived attributes. |
| cicada/lib/SmartScheduling/evaluation.py | Adds fitness evaluation (CPU usage diff array + peak). |
| cicada/lib/SmartScheduling/pygad.py | Implements GA solve loop using PyGAD. |
| cicada/commands/upsert_schedule.py | Resets schedule backup baseline on upsert. |
| cicada/commands/smart_schedule.py | New command to optimize schedules and write results/checkpoints. |
| cicada/commands/rollback.py | New command to rollback schedules using schedule_backups. |
| cicada/commands/delete_schedule.py | Deletes a schedule’s backup record as well. |
| cicada/cli.py | Wires new commands into the CLI (partially). |
| CLAUDE.md | Adds repository/architecture guidance including SmartScheduling overview. |
Comments suppressed due to low confidence (2)

cicada/cli.py:44

  • test_functional_cli_entrypoint.py asserts the top-level cicada -h output, including the command list. Adding new commands here will change that output and will cause those CLI tests to fail unless they’re updated to include smart_schedule/rollback. Consider updating the test expectations (and adding coverage for the new subcommands’ -h output).
    def __init__(self):
        command_list = [
            "register_server",
            "list_server_schedules",
            "exec_server_schedules",
            "smart_schedule",
            "show_schedule",
            "upsert_schedule",
            "exec_schedule",
            "spread_schedules",
            "archive_schedule_log",
            "ping_slack",
            "list_schedule_ids",
            "delete_schedule",
            "version",
        ]

cicada/cli.py:44

  • rollback is implemented and imported, but it isn't listed in command_list. As a result cicada rollback ... will be rejected as an unrecognized command. Add rollback to command_list so it is discoverable and dispatchable like the other subcommands.
        command_list = [
            "register_server",
            "list_server_schedules",
            "exec_server_schedules",
            "smart_schedule",
            "show_schedule",
            "upsert_schedule",
            "exec_schedule",
            "spread_schedules",
            "archive_schedule_log",
            "ping_slack",
            "list_schedule_ids",
            "delete_schedule",
            "version",
        ]
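
The rejection described in this comment can be reproduced with a small sketch. It assumes the CLI validates the subcommand against `command_list` through something like argparse `choices`; that mechanism, the `parse_command` helper, and the position of `"rollback"` in the list are assumptions, not code from `cicada/cli.py`.

```python
import argparse

# Command names copied from the snippet above, plus the missing "rollback".
COMMAND_LIST = [
    "register_server",
    "list_server_schedules",
    "exec_server_schedules",
    "smart_schedule",
    "rollback",
    "show_schedule",
    "upsert_schedule",
    "exec_schedule",
    "spread_schedules",
    "archive_schedule_log",
    "ping_slack",
    "list_schedule_ids",
    "delete_schedule",
    "version",
]


def parse_command(argv, commands):
    """Return the subcommand name, or None if argparse rejects it."""
    parser = argparse.ArgumentParser(prog="cicada")
    parser.add_argument("command", choices=commands)
    try:
        return parser.parse_args(argv[:1]).command
    except SystemExit:  # argparse reports the error on stderr, then exits
        return None


without_rollback = [c for c in COMMAND_LIST if c != "rollback"]
print(parse_command(["rollback"], without_rollback))  # None
print(parse_command(["rollback"], COMMAND_LIST))      # rollback
```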


Comment thread cicada/lib/SmartScheduling/domain.py Outdated
Comment thread cicada/lib/scheduler.py Outdated
Comment thread setup/schema.sql Outdated
Comment thread cicada/lib/SmartScheduling/pygad.py Outdated
Comment thread cicada/commands/smart_schedule.py Outdated
Comment thread setup/create_test_tap_setup.sql Outdated
Comment thread docs/Smart Scheduler Technical Overview.md Outdated
Comment thread cicada/lib/scheduler.py Outdated
Comment thread cicada/lib/scheduler.py
Comment thread cicada/commands/smart_schedule.py
db_conn = postgres.db_cicada(dbname)
db_cur = db_conn.cursor()
scheduler.delete_schedule(db_cur, str(schedule_id))
scheduler.delete_schedule_backup(db_cur, str(schedule_id))

How do we recover if the schedule backup is always being deleted?

Author


delete_schedule is only called when a tap is removed. At that point we also want to remove its row from schedule_backups: the tap no longer exists, so it can't be included in smart scheduling. The naming is confusing here: schedule_backups only stores the previous cron expressions so that we can roll back to them; it isn't meant for recreating a schedule after it has been deleted.


3 participants