Fix context cancellation errors when fetching feature annotations #535

@cybersiddhu

Issue Description

The GraphQL server is reporting context cancellation errors when attempting to fetch feature annotations for PubMed IDs related to gene DDB_G0267444.

Error Logs

{"level":"error","msg":"error fetching feature annotations for pubmed ID 21109012 (related to gene DDB_G0267444): rpc error: code = Canceled desc = context canceled","time":"02/May/2025:14:54:21"}
{"level":"error","msg":"failed fetching details for pub 21109012: error fetching feature annotations for pubmed ID 21109012 (related to gene DDB_G0267444): rpc error: code = Canceled desc = context canceled","time":"02/May/2025:14:54:21"}
{"level":"error","msg":"error fetching feature annotations for pubmed ID 20820770 (related to gene DDB_G0267444): rpc error: code = Canceled desc = context canceled","time":"02/May/2025:14:54:21"}
{"level":"error","msg":"failed fetching details for pub 20820770: error fetching feature annotations for pubmed ID 20820770 (related to gene DDB_G0267444): rpc error: code = Canceled desc = context canceled","time":"02/May/2025:14:54:21"}

Root Cause

The error occurs in the fetchPublicationDetails function in internal/graphql/resolver/publication.go when calling ListFeatureAnnotationsByPubmedId. The context is canceled before the feature annotation request completes. Two causes are plausible:

  1. A timeout in the feature annotation gRPC service. Note that gRPC normally reports an expired client-side deadline as DeadlineExceeded rather than Canceled, so this cause fits the logs above less well.
  2. Error propagation in the concurrent processing model: when one goroutine fails, it cancels the context shared by all other requests.

Potential Solutions

  1. Add more robust error handling and prevent context cancellation from propagating across unrelated requests:

    • Modify fetchPublicationAsync to continue processing other publications when one fails
    • Consider using separate contexts for each publication request
  2. Implement retry logic with backoff for transient failures:

    • Add a retry mechanism for gRPC calls that return the Canceled status code, provided the caller's context is still active
    • Use an exponential backoff strategy to avoid overwhelming the service
  3. Increase timeout values if needed:

    • Review and possibly extend timeout settings for the feature annotation client
  4. Add detailed logging and metrics:

    • Track response times for feature annotation requests
    • Log timing information to identify potential bottlenecks

Affected Code

The issue is in internal/graphql/resolver/publication.go, specifically in the fetchPublicationDetails and fetchPublicationAsync functions.
