
User search returns incorrect results because GraphQL search does not support sort:followers for users #112

@unhappychoice

Summary

The GraphQL user search query embeds sort:followers-desc in the search query string, but GitHub does not recognize a sort:followers qualifier for user search. sort: qualifiers do exist for other search types (e.g., sort:comments and sort:created for issues, sort:author-date for commits), and the REST API Search Users endpoint accepts sort=followers as a query parameter, but neither makes sort:followers valid inside a user search query string. As a result, the search returns results in an undefined "best match" order rather than by follower count, which leads to missing users and non-deterministic rankings.

Details

1. sort:followers qualifier is not supported in user search query strings

The current implementation in github.go constructs a GraphQL search query like:

search(type: USER, query: "type:user location:japan sort:followers-desc", first: 5, after: "...")

However:

  • The Sorting search results documentation lists sort: qualifiers for issues/commits/repositories (e.g., sort:comments, sort:created, sort:updated, sort:reactions). There is no sort:followers qualifier listed.
  • The Searching users documentation lists the supported qualifiers for user search (type, in, repos, location, language, created, followers, is:sponsorable). The sort:followers qualifier is not among them.
  • The REST API Search Users endpoint, by contrast, does support sorting via a query parameter: sort can be followers, repositories, or joined. This is separate from the query string and is properly supported.

This means the current GraphQL-based search silently ignores the sort:followers-desc in the query string and returns results in an undefined "best match" order.
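To make the distinction concrete, here is a minimal sketch of a REST search URL where the sort order travels as its own query parameters instead of inside the q string. The function name buildUserSearchURL and its perPage/page arguments are illustrative assumptions, not code from github.go.

```go
package main

import (
	"net/url"
	"strconv"
)

// buildUserSearchURL returns a REST "search users" URL in which the sort
// order is passed as separate query parameters, not inside the q string.
// The function name and arguments are hypothetical, for illustration only.
func buildUserSearchURL(location string, perPage, page int) string {
	params := url.Values{}
	params.Set("q", "type:user location:"+location) // search qualifiers only
	params.Set("sort", "followers")                 // sort field is its own parameter
	params.Set("order", "desc")
	params.Set("per_page", strconv.Itoa(perPage))
	params.Set("page", strconv.Itoa(page))

	u := url.URL{
		Scheme:   "https",
		Host:     "api.github.com",
		Path:     "/search/users",
		RawQuery: params.Encode(), // Encode sorts keys and escapes values
	}
	return u.String()
}
```

Calling buildUserSearchURL("japan", 100, 1) yields a URL whose q value contains only type:user and location:japan, with sort=followers and order=desc alongside it.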

2. GitHub Search API is "best effort"

The REST API documentation explicitly states in Timeouts and incomplete results:

For queries that exceed the time limit, the API returns the matches that were already found prior to the timeout, and the response has the incomplete_results property set to true. Reaching a timeout does not necessarily mean that search results are incomplete.

The same caveat matters for GraphQL cursor pagination: when results shift between requests, a later page can repeat users already seen (duplicates) or skip users entirely (gaps).
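One defensive measure against duplicates is to merge pages by login and to record whether any page was flagged incomplete. The sketch below assumes a hypothetical page type that holds only the logins and the incomplete_results flag; it does not detect gaps, only repeated users.

```go
package main

// page is a hypothetical, trimmed-down view of one search response.
type page struct {
	Incomplete bool     // value of incomplete_results for this page
	Logins     []string // user logins in the order returned
}

// mergePages concatenates pages, dropping logins that were already seen,
// and reports whether any page was flagged as incomplete.
func mergePages(pages []page) (logins []string, incomplete bool) {
	seen := make(map[string]struct{})
	for _, p := range pages {
		incomplete = incomplete || p.Incomplete
		for _, login := range p.Logins {
			if _, dup := seen[login]; dup {
				continue // same user returned on an earlier page
			}
			seen[login] = struct{}{}
			logins = append(logins, login)
		}
	}
	return logins, incomplete
}
```

A caller could treat incomplete == true as a signal to retry or to log that the ranking may be missing users.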

3. minFollowerCount uses last user instead of minimum

In the pagination logic, minFollowerCount is set to the follower count of the last user processed in each batch:

minFollowerCount = int(followerCount)

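This should compute min() across all users in the batch: because the results are not actually sorted by followers, the last user is not necessarily the one with the fewest followers. Combined with the sort not working, the follower-range fallback logic (followers:<N) skips entire ranges of users.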
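A minimal sketch of the intended computation, assuming the follower counts of one batch have been collected into a slice. The helper name minFollowers is hypothetical and does not appear in github.go.

```go
package main

// minFollowers returns the smallest follower count in a batch.
// The second return value is false when the batch is empty, so callers
// do not mistake a zero value for a real minimum.
func minFollowers(counts []int) (int, bool) {
	if len(counts) == 0 {
		return 0, false
	}
	lowest := counts[0]
	for _, c := range counts[1:] {
		if c < lowest {
			lowest = c
		}
	}
	return lowest, true
}
```

For a batch with counts 120, 45, 300, this returns 45, whereas taking the last user's count would give 300.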

Impact

These bugs cause the rankings to miss users who should appear. For example, users with sufficient followers and contributions may not appear in location-based rankings despite meeting the criteria.

Proposed Fix

Replace the GraphQL search with a two-step approach:

  1. REST API GET /search/users?sort=followers&order=desc: the REST API officially supports sort=followers as a query parameter and returns correctly sorted, deterministic results.
  2. GraphQL user(login:) batch queries: fetch the detailed user data (contributions, organizations) that the REST search endpoint does not provide.
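Step 2 could be sketched as a single GraphQL document that looks up several logins at once using aliases. The helper name buildBatchUserQuery and the selected fields are placeholders chosen for illustration; the actual fields (contributions, organizations) and the implementation in #113 may differ, and a production version would more likely pass logins as GraphQL variables.

```go
package main

import (
	"fmt"
	"strings"
)

// buildBatchUserQuery builds one GraphQL document that fetches several users
// by login. Each lookup gets an alias (u0, u1, ...) so that repeated user()
// fields do not collide in the response. The selected fields are placeholders.
func buildBatchUserQuery(logins []string) string {
	var b strings.Builder
	b.WriteString("query {\n")
	for i, login := range logins {
		// %q wraps the login in double quotes; GitHub logins consist of
		// alphanumerics and hyphens, so no further escaping is involved.
		fmt.Fprintf(&b, "  u%d: user(login: %q) { login followers { totalCount } }\n", i, login)
	}
	b.WriteString("}\n")
	return b.String()
}
```

The logins returned by the REST search in step 1 would be passed in chunks, keeping each document small enough to stay within GraphQL query limits.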

A PR implementing this fix is at #113.
