Distinct values from a cube #24

@longhotsummer

We have a PostgreSQL table with about 28 million facts and a financial_year column. Users can use the babbage API to query the distinct financial_year values, of which there are only about 10.

PostgreSQL seems to handle SELECT DISTINCT financial_year FROM table naively: it runs a full table scan even though financial_year is indexed, and the query takes 60+ seconds. This seems to be a known limitation of PostgreSQL.

How have others solved this problem? Do we split out the financial_year data (and all the other dimensions of a fact) into a separate table?
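
For reference, one workaround often suggested for this limitation is to emulate a loose index scan with a recursive CTE, which asks the index for the next larger value once per distinct value instead of reading every row. The sketch below is only an illustration under assumptions: the table name `facts`, the index name, and the sample data are hypothetical, and SQLite (via Python's standard library) stands in for PostgreSQL so the example runs on its own. The query text should carry over to PostgreSQL, but any speed-up on the real 28 million row table is not verified here.

```python
import sqlite3

# Hypothetical schema: `facts` and its index name are placeholders,
# and an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts (id INTEGER PRIMARY KEY, financial_year INTEGER, amount REAL)"
)
conn.execute("CREATE INDEX facts_financial_year_idx ON facts (financial_year)")

# Made-up sample data: 10,000 facts spread over 10 financial years (2005..2014).
rows = [(2005 + (i % 10), float(i)) for i in range(10000)]
conn.executemany("INSERT INTO facts (financial_year, amount) VALUES (?, ?)", rows)

# Emulated loose index scan: start at the smallest value, then repeatedly
# look up the smallest value greater than the previous one. Each step is a
# single index probe, so the work grows with the number of distinct values
# rather than with the number of rows.
LOOSE_INDEX_SCAN = """
WITH RECURSIVE years(financial_year) AS (
    SELECT MIN(financial_year) FROM facts
    UNION ALL
    SELECT (SELECT MIN(f.financial_year)
            FROM facts f
            WHERE f.financial_year > years.financial_year)
    FROM years
    WHERE years.financial_year IS NOT NULL
)
SELECT financial_year
FROM years
WHERE financial_year IS NOT NULL
ORDER BY financial_year
"""

distinct_years = [row[0] for row in conn.execute(LOOSE_INDEX_SCAN)]
print(distinct_years)
```

The recursion stops when the inner lookup finds no larger value and yields NULL, which the outer query filters out. Splitting the dimension values into a separate table, as asked above, remains an alternative design; this sketch only addresses the query shape.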
