[Bug][GitExtractor] Extraction fails for a non-existent PR #8817

@michelengelen

Search before asking

  • I had searched in the issues and found no similar issues.

What happened

The gitextractor recently started failing for one of our repos.
The cause seems to be that it tries to collect data for a PR that does not exist (anymore). After 3 retries it exits with this failure message:

time="2026-03-29 12:33:22" level=error msg=" [pipeline service] [pipeline #828] [task #16537] attached stack trace\n\t  -- stack trace:\n\t  | github.com/apache/incubator-devlake/core/runner.RunPluginSubTasks\n\t  | \t/app/core/runner/run_task.go:333\n\t  | [...repeated from below...]\n\tWraps: (2) subtask Collect Pull Requests ended unexpectedly\n\tWraps: (3) attached stack trace\n\t  -- stack trace:\n\t  | github.com/apache/incubator-devlake/helpers/pluginhelper/api.(*WorkerScheduler).WaitAsync\n\t  | \t/app/helpers/pluginhelper/api/worker_scheduler.go:173\n\t  | github.com/apache/incubator-devlake/helpers/pluginhelper/api.(*ApiCollector).Execute\n\t  | \t/app/helpers/pluginhelper/api/api_collector.go:206\n\t  | github.com/apache/incubator-devlake/helpers/pluginhelper/api.(*StatefulApiCollector).Execute\n\t  | \t/app/helpers/pluginhelper/api/api_collector_stateful.go:97\n\t  | github.com/apache/incubator-devlake/plugins/github/tasks.CollectApiPullRequests\n\t  | \t/app/plugins/github/tasks/pr_collector.go:140\n\t  | github.com/apache/incubator-devlake/core/runner.runSubtask\n\t  | \t/app/core/runner/run_task.go:425\n\t  | github.com/apache/incubator-devlake/core/runner.RunPluginSubTasks\n\t  | \t/app/core/runner/run_task.go:330\n\t  | github.com/apache/incubator-devlake/core/runner.RunPluginTask\n\t  | \t/app/core/runner/run_task.go:165\n\t  | github.com/apache/incubator-devlake/core/runner.RunTask\n\t  | \t/app/core/runner/run_task.go:139\n\t  | github.com/apache/incubator-devlake/server/services.runTaskStandalone\n\t  | \t/app/server/services/task_runner.go:114\n\t  | github.com/apache/incubator-devlake/server/services.RunTasksStandalone.func1\n\t  | \t/app/server/services/task.go:187\n\t  | runtime.goexit\n\t  | \t/usr/local/go/src/runtime/asm_amd64.s:1598\n\tWraps: (4) Retry exceeded 3 times calling repos/mui/material-ui/pulls/47788. The last error was: Http DoAsync error calling [method:GET path:repos/mui/material-ui/pulls/47788 query:map[]]. 
Response: {\"message\":\"Not Found\",\"documentation_url\":\"https://docs.github.com/rest/pulls/pulls#get-a-pull-request\",\"status\":\"404\"} (404)\n\tError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.leafError"

I don't know where it gets the number for this PR from, since the list call I tried does not return it.

gh pr list --limit 1000 --json number,title,state,author,createdAt does not include the PR in question.

gh pr view 47788 --json number,title,state,author,body,createdAt,mergedAt,files returns: GraphQL: Could not resolve to a PullRequest with the number of 47788. (repository.pullRequest)

What do you expect to happen

The extractor should skip the 404 response (maybe logging a warning) and finish, rather than failing the whole collection.

How to reproduce


Anything else

This happens every time.

Version

v1.0.3-beta10

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Assignees

No one assigned

    Labels

    component/plugins (This issue or PR relates to plugins)
    priority/high (This issue is very important)
    severity/p1 (This bug affects functionality or significantly affect ux)
    type/bug (This issue is a bug)
