[Bug][Pipeline][Gitextractor] Cancel pipeline returns 200 OK but pipeline keeps running #8787
Description
Search before asking
- I had searched in the issues and found no similar issues.
What happened
Clicking the cancel button in the UI (which fires DELETE /api/pipelines/:id) always returns
"Operation successfully completed" (HTTP 200), but the pipeline continues running. It can be
observed running for 30+ minutes after the cancel request before it stops on its own.
This is a regression of #5585 and a go-git counterpart to #4188 (which only fixed libgit2).
Both were closed but the underlying causes were not fully addressed.
Three independent bugs combine to produce this behaviour:
Bug 1: CancelPipeline silently discards errors from CancelTask
server/services/pipeline.go:

    for _, pendingTask := range pendingTasks {
        _ = CancelTask(pendingTask.ID) // error thrown away
    }

CancelTask calls runningTasks.Remove(taskId). If the task is not in the in-memory map
(race between pipeline stages, task not yet registered, or pod restart), Remove returns
errors.NotFound. The error is discarded, cancel() is never called, the goroutine keeps
running, and the API still returns 200 OK.
A second consequence: tasks in future pipeline stages (TASK_CREATED) are never in
runningTasks, so CancelTask silently fails for all of them. They remain TASK_CREATED
in the database and the pipeline status stays TASK_RUNNING until the goroutine naturally
finishes.
Bug 2: storeRepoSnapshot in gitextractor ignores context cancellation (go-git path)
plugins/gitextractor/parser/repo_gogit.go:

    func (r *GogitRepoCollector) storeRepoSnapshot(subtaskCtx plugin.SubTaskContext, commitList []*object.Commit) error {
        ctx := subtaskCtx.GetContext()
        for _, commit := range commitList { // ← no ctx.Done() check between commits
            // ...
            for _, p := range patch.Stats() {
                blameResults, err := gogit.Blame(commit, fileName) // ← no context parameter
                // ...

gogit.Blame() has no context parameter: it performs a full in-process blame computation
and cannot be interrupted. For large repositories with thousands of commits, each touching
many files, this loop runs for 30+ minutes and is completely unresponsive to context
cancellation. This is the primary cause of the long delay observed after pressing cancel.
Issue #4188 fixed the same problem for the libgit2 implementation (repo_libgit2.go) but
the go-git implementation was never addressed.
Bug 3: Cancelled tasks are marked TASK_FAILED instead of TASK_CANCELLED
core/runner/run_task.go: the deferred status update always writes TASK_FAILED when
err != nil, with no special case for context cancellation:

    dbe := db.UpdateColumns(task, []dal.DalSet{
        {ColumnName: "status", Value: models.TASK_FAILED}, // wrong for cancellations
        ...
    })

The final pipeline status also becomes TASK_FAILED or TASK_PARTIAL rather than
TASK_CANCELLED, making it impossible to distinguish a failed run from a cancelled one in
the UI or dashboards.
What do you expect to happen
- Pressing cancel on a running pipeline stops it promptly (within seconds for HTTP-based plugins)
- The pipeline and all its tasks (running and not-yet-started) are immediately marked TASK_CANCELLED in the database
- The API returns a non-200 or a meaningful error if cancellation could not be applied
- A cancelled run is distinguishable from a failed run in the UI
How to reproduce
For the 30+ minute hang (Bug 2):
- Configure a blueprint with a large git repository (thousands of commits)
- Trigger the pipeline and wait for the collectCommits / blame subtask to begin
- Click cancel
- Observe: "Operation successfully completed" in the UI, but pipeline status stays RUNNING for 30+ minutes
For the silent cancel failure (Bug 1):
- Run a multi-stage pipeline
- Click cancel immediately after one stage completes and before the next stage's tasks appear in progress
- Observe: cancel returns 200 OK, next stage starts and runs to completion
Anything else
Affected files:
| File | Issue |
|---|---|
| server/services/pipeline.go:464 | _ = CancelTask(...) silently discards errors; unstarted tasks never marked cancelled |
| plugins/gitextractor/parser/repo_gogit.go:526 | No ctx.Done() check in commit loop; gogit.Blame() has no context |
| core/runner/run_task.go:91 | Context-cancelled tasks written as TASK_FAILED instead of TASK_CANCELLED |
Suggested fixes:
- Log or return errors from CancelTask instead of discarding them
- In CancelPipeline for a running pipeline: immediately set all TASK_CREATED tasks and the pipeline itself to TASK_CANCELLED in the DB
- In storeRepoSnapshot: add a ctx.Done() check at the top of the commit loop; investigate whether go-git exposes a context-aware blame API
- In RunTask: use TASK_CANCELLED when errors.Is(err, context.Canceled) is true
Version
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct