-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat(orch): enforce pending todo completion before task finish #2841
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
amitksingh1490
wants to merge
33
commits into
main
Choose a base branch
from
todo-enforcer
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
33 commits
Select commit
Hold shift + click to select a range
95f328c
feat(orch): enforce pending todo completion before task finish
amitksingh1490 770dbeb
[autofix.ci] apply automated fixes
autofix-ci[bot] e07b304
Merge branch 'main' into todo-enforcer
tusharmath c846ddf
refactor(todos): hook PendingTodosHandler to inject reminders for inc…
amitksingh1490 4c4c617
Merge branch 'main' into todo-enforcer
amitksingh1490 d3c77e7
[autofix.ci] apply automated fixes
autofix-ci[bot] 1f9fc2b
Merge branch 'main' into todo-enforcer
tusharmath 98927b2
refactor(todos): enhance reminder system with structured todo items a…
amitksingh1490 da5a70b
[autofix.ci] apply automated fixes
autofix-ci[bot] 6882d55
Merge branch 'main' into todo-enforcer
amitksingh1490 349ae59
Merge branch 'main' into todo-enforcer
amitksingh1490 069fd50
refactor(todos): move PendingTodosHandler from on_response to on_end …
amitksingh1490 964f002
Merge remote-tracking branch 'origin/main' into todo-enforcer
amitksingh1490 09b830e
[autofix.ci] apply automated fixes
autofix-ci[bot] fd92041
[autofix.ci] apply automated fixes (attempt 2/3)
autofix-ci[bot] 4706b20
feat(config): add pending_todos_hook option to enable conditional tod…
amitksingh1490 afa7fec
[autofix.ci] apply automated fixes
autofix-ci[bot] 9f0f953
style(schema): reorder pending_todos_hook field alphabetically
amitksingh1490 f303933
Merge remote-tracking branch 'origin/main' into todo-enforcer
amitksingh1490 1e1c070
Merge branch 'main' into todo-enforcer
amitksingh1490 6a56f5a
refactor(config): rename pending_todos_hook to verify_todos and remov…
amitksingh1490 0a4cca1
Merge branch 'main' into todo-enforcer
amitksingh1490 8de592b
[autofix.ci] apply automated fixes
autofix-ci[bot] 15c2096
refactor(config): rename pending_todos_hook to verify_todos and updat…
amitksingh1490 b47d989
[autofix.ci] apply automated fixes
autofix-ci[bot] e86f054
[autofix.ci] apply automated fixes (attempt 2/3)
autofix-ci[bot] 4c85185
Merge remote-tracking branch 'origin/main' into todo-enforcer
amitksingh1490 f556471
feat(orch): fire response lifecycle hook in agent flow
tusharmath 2ab0014
refactor(orch): simplify End hook completion logic
tusharmath 1d9b2a1
fix(orch): always update conversation after End hook execution
amitksingh1490 49aadf8
feat(config): add verify_todos setting to forge config
tusharmath c80f7c2
fix(pending_todos): use text for system_reminder context message
tusharmath 0775d5a
Merge branch 'main' into todo-enforcer
tusharmath File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,9 +1,11 @@ | ||
| mod compaction; | ||
| mod doom_loop; | ||
| mod pending_todos; | ||
| mod title_generation; | ||
| mod tracing; | ||
|
|
||
| pub use compaction::CompactionHandler; | ||
| pub use doom_loop::DoomLoopDetector; | ||
| pub use pending_todos::PendingTodosHandler; | ||
| pub use title_generation::TitleGenerationHandler; | ||
| pub use tracing::TracingHandler; |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,272 @@ | ||
| use std::collections::HashSet; | ||
|
|
||
| use async_trait::async_trait; | ||
| use forge_domain::{ | ||
| ContextMessage, Conversation, EndPayload, EventData, EventHandle, Template, TodoStatus, | ||
| }; | ||
| use forge_template::Element; | ||
| use serde::Serialize; | ||
|
|
||
| use crate::TemplateEngine; | ||
|
|
||
| /// A single todo item prepared for template rendering. | ||
| #[derive(Serialize)] | ||
| struct TodoReminderItem { | ||
| status: &'static str, | ||
| content: String, | ||
| } | ||
|
|
||
| /// Template context for the pending-todos reminder. | ||
| #[derive(Serialize)] | ||
| struct PendingTodosContext { | ||
| todos: Vec<TodoReminderItem>, | ||
| } | ||
|
|
||
| /// Detects when the LLM signals task completion while there are still | ||
| /// pending or in-progress todo items. | ||
| /// | ||
| /// When triggered, it injects a formatted reminder listing all | ||
| /// outstanding todos into the conversation context, preventing the | ||
| /// orchestrator from yielding prematurely. | ||
| #[derive(Debug, Clone, Default)] | ||
| pub struct PendingTodosHandler; | ||
|
|
||
| impl PendingTodosHandler { | ||
| /// Creates a new pending-todos handler | ||
| pub fn new() -> Self { | ||
| Self | ||
| } | ||
| } | ||
|
|
||
| #[async_trait] | ||
| impl EventHandle<EventData<EndPayload>> for PendingTodosHandler { | ||
| async fn handle( | ||
| &self, | ||
| _event: &EventData<EndPayload>, | ||
| conversation: &mut Conversation, | ||
| ) -> anyhow::Result<()> { | ||
| let pending_todos = conversation.metrics.get_active_todos(); | ||
| if pending_todos.is_empty() { | ||
| return Ok(()); | ||
| } | ||
|
|
||
| // Build a set of current pending todo contents for comparison | ||
| let current_todo_set: HashSet<String> = pending_todos | ||
| .iter() | ||
| .map(|todo| todo.content.clone()) | ||
| .collect(); | ||
|
|
||
| // Check if we already have a reminder with the exact same set of todos | ||
| // This prevents duplicate reminders while still allowing new reminders | ||
| // when todos change (e.g., some completed but others still pending) | ||
| let should_add_reminder = if let Some(context) = &conversation.context { | ||
| // Find the most recent reminder message by looking for the template content | ||
| // pattern | ||
| let last_reminder_todos: Option<HashSet<String>> = context | ||
| .messages | ||
| .iter() | ||
| .rev() | ||
| .filter_map(|entry| { | ||
| let content = entry.message.content()?; | ||
| // Check if this is a pending todos reminder | ||
| if content.contains("You have pending todo items") { | ||
| // Extract todo items from the reminder message | ||
| // Format: "- [STATUS] Content" | ||
| let todos: HashSet<String> = content | ||
| .lines() | ||
| .filter(|line| line.starts_with("- [")) | ||
| .map(|line| { | ||
| // Extract content after "- [STATUS] " | ||
| line.split_once("] ") | ||
| .map(|x| x.1) | ||
| .map(|s| s.to_string()) | ||
| .unwrap_or_default() | ||
| }) | ||
| .collect(); | ||
| Some(todos) | ||
| } else { | ||
| None | ||
| } | ||
| }) | ||
| .next(); | ||
|
|
||
| match last_reminder_todos { | ||
| Some(last_todos) => last_todos != current_todo_set, | ||
| None => true, // No previous reminder found, should add | ||
| } | ||
| } else { | ||
| true // No context, should add reminder | ||
| }; | ||
|
|
||
| if !should_add_reminder { | ||
| return Ok(()); | ||
| } | ||
|
|
||
| let todo_items: Vec<TodoReminderItem> = pending_todos | ||
| .iter() | ||
| .filter_map(|todo| { | ||
| let status = match todo.status { | ||
| TodoStatus::Pending => "PENDING", | ||
| TodoStatus::InProgress => "IN_PROGRESS", | ||
| _ => return None, | ||
| }; | ||
| Some(TodoReminderItem { status, content: todo.content.clone() }) | ||
| }) | ||
| .collect(); | ||
|
|
||
| let ctx = PendingTodosContext { todos: todo_items }; | ||
| let reminder = TemplateEngine::default().render( | ||
| Template::<PendingTodosContext>::new("forge-pending-todos-reminder.md"), | ||
| &ctx, | ||
| )?; | ||
|
|
||
| if let Some(context) = conversation.context.as_mut() { | ||
| let content = Element::new("system_reminder").text(reminder); | ||
| context | ||
| .messages | ||
| .push(ContextMessage::user(content, None).into()); | ||
| } | ||
|
|
||
| Ok(()) | ||
| } | ||
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { | ||
| use forge_domain::{ | ||
| Agent, Context, Conversation, EndPayload, EventData, EventHandle, Metrics, ModelId, Todo, | ||
| TodoStatus, | ||
| }; | ||
| use pretty_assertions::assert_eq; | ||
|
|
||
| use super::*; | ||
|
|
||
| fn fixture_agent() -> Agent { | ||
| Agent::new( | ||
| "test-agent", | ||
| "test-provider".to_string().into(), | ||
| ModelId::new("test-model"), | ||
| ) | ||
| } | ||
|
|
||
| fn fixture_conversation(todos: Vec<Todo>) -> Conversation { | ||
| let mut conversation = Conversation::generate(); | ||
| conversation.context = Some(Context::default()); | ||
| conversation.metrics = Metrics::default().todos(todos); | ||
| conversation | ||
| } | ||
|
|
||
| fn fixture_event() -> EventData<EndPayload> { | ||
| EventData::new(fixture_agent(), ModelId::new("test-model"), EndPayload) | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_no_pending_todos_does_nothing() { | ||
| let handler = PendingTodosHandler::new(); | ||
| let event = fixture_event(); | ||
| let mut conversation = fixture_conversation(vec![]); | ||
|
|
||
| let initial_msg_count = conversation.context.as_ref().unwrap().messages.len(); | ||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
|
|
||
| let actual = conversation.context.as_ref().unwrap().messages.len(); | ||
| let expected = initial_msg_count; | ||
| assert_eq!(actual, expected); | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_pending_todos_injects_reminder() { | ||
| let handler = PendingTodosHandler::new(); | ||
| let event = fixture_event(); | ||
| let mut conversation = fixture_conversation(vec![ | ||
| Todo::new("Fix the build").status(TodoStatus::Pending), | ||
| Todo::new("Write tests").status(TodoStatus::InProgress), | ||
| ]); | ||
|
|
||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
|
|
||
| let actual = conversation.context.as_ref().unwrap().messages.len(); | ||
| let expected = 1; | ||
| assert_eq!(actual, expected); | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_reminder_contains_formatted_list() { | ||
| let handler = PendingTodosHandler::new(); | ||
| let event = fixture_event(); | ||
| let mut conversation = fixture_conversation(vec![ | ||
| Todo::new("Fix the build").status(TodoStatus::Pending), | ||
| Todo::new("Write tests").status(TodoStatus::InProgress), | ||
| ]); | ||
|
|
||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
|
|
||
| let entry = &conversation.context.as_ref().unwrap().messages[0]; | ||
| let actual = entry.message.content().unwrap(); | ||
| assert!(actual.contains("- [PENDING] Fix the build")); | ||
| assert!(actual.contains("- [IN_PROGRESS] Write tests")); | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_completed_todos_not_included() { | ||
| let handler = PendingTodosHandler::new(); | ||
| let event = fixture_event(); | ||
| let mut conversation = fixture_conversation(vec![ | ||
| Todo::new("Completed task").status(TodoStatus::Completed), | ||
| Todo::new("Cancelled task").status(TodoStatus::Cancelled), | ||
| ]); | ||
|
|
||
| let initial_msg_count = conversation.context.as_ref().unwrap().messages.len(); | ||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
|
|
||
| let actual = conversation.context.as_ref().unwrap().messages.len(); | ||
| let expected = initial_msg_count; | ||
| assert_eq!(actual, expected); | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_reminder_not_duplicated_for_same_todos() { | ||
| let handler = PendingTodosHandler::new(); | ||
| let event = fixture_event(); | ||
| let mut conversation = | ||
| fixture_conversation(vec![Todo::new("Fix the build").status(TodoStatus::Pending)]); | ||
|
|
||
| // First call should inject a reminder | ||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
| let after_first = conversation.context.as_ref().unwrap().messages.len(); | ||
| assert_eq!(after_first, 1); | ||
|
|
||
| // Second call with the same pending todos should NOT add another reminder | ||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
| let after_second = conversation.context.as_ref().unwrap().messages.len(); | ||
| assert_eq!(after_second, 1); // Still 1, no duplicate | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_reminder_added_when_todos_change() { | ||
| let handler = PendingTodosHandler::new(); | ||
| let event = fixture_event(); | ||
| let mut conversation = fixture_conversation(vec![ | ||
| Todo::new("Fix the build").status(TodoStatus::Pending), | ||
| Todo::new("Write tests").status(TodoStatus::InProgress), | ||
| ]); | ||
|
|
||
| // First call should inject a reminder | ||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
| let after_first = conversation.context.as_ref().unwrap().messages.len(); | ||
| assert_eq!(after_first, 1); | ||
|
|
||
| // Simulate LLM completing one todo but leaving another pending | ||
| // Update the conversation metrics with different todos | ||
| conversation.metrics = conversation.metrics.clone().todos(vec![ | ||
| Todo::new("Fix the build").status(TodoStatus::Completed), | ||
| Todo::new("Write tests").status(TodoStatus::InProgress), | ||
| Todo::new("Add documentation").status(TodoStatus::Pending), | ||
| ]); | ||
|
|
||
| // Second call with different pending todos should add a new reminder | ||
| handler.handle(&event, &mut conversation).await.unwrap(); | ||
| let after_second = conversation.context.as_ref().unwrap().messages.len(); | ||
| assert_eq!(after_second, 2); // New reminder added because todos changed | ||
| } | ||
| } | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move the prompt from string literals to templates, and use the template engine to render.