Create a full production ready multiscales pipeline #428

Description

@sharkinsspatial

As part of their blog post, Earthmover put together a nice reference implementation of multiscale generation: https://github.com/earth-mover/icechunk-multiscales-demo. It is a great reference, but it has a few limitations that make it unsuitable for use with operational NASA datasets.

  • Coiled. Though we can use Coiled for quick cluster scaling experimentation, it is unlikely to be adopted as a long-term cluster solution for NASA, so we'll need to adapt this to use our own managed Batch clusters, or potentially a Lambda approach, depending on compute and memory requirements.
  • The demo approach is static. Operational datasets will require hooks into a new chunk notification system that broadcasts chunk additions or updates to the native resolution arrays (or, more likely, virtual arrays) so that the multiscales pipeline can update the affected regions accordingly.
  • We should include the ability to incrementally populate multiscale arrays on demand, based on user requests, for less frequently used regions / variables. See Warped chunk caching #427 for more details on this approach.
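
To make the region-update idea in the second bullet concrete, here is a rough sketch of how a changed native-resolution window could be mapped onto each coarser level and only those cells recomputed. This is not how the Earthmover demo works; the function names (`affected_window`, `update_levels`), the in-memory NumPy arrays, the power-of-two factors, and the block-mean resampling are all assumptions for illustration. A real pipeline would read and write Zarr / Icechunk regions instead.

```python
import numpy as np


def affected_window(start, stop, factor):
    """Map a half-open native-resolution index range onto a coarser level."""
    # Floor the start and ceil the stop so partially covered cells are included.
    return start // factor, -(-stop // factor)


def update_levels(native, levels, rows, cols):
    """Recompute only the overview cells touched by native[rows, cols].

    `levels` maps a downsampling factor (2, 4, ...) to its overview array.
    Assumes the native dimensions are divisible by every factor.
    """
    for factor, overview in levels.items():
        r0, r1 = affected_window(rows.start, rows.stop, factor)
        c0, c1 = affected_window(cols.start, cols.stop, factor)
        # Read the full native footprint of the affected overview cells.
        block = native[r0 * factor:r1 * factor, c0 * factor:c1 * factor]
        # Block-mean coarsening: group pixels into factor x factor tiles.
        coarse = block.reshape(r1 - r0, factor, c1 - c0, factor).mean(axis=(1, 3))
        overview[r0:r1, c0:c1] = coarse


# Hypothetical usage: a notification reports that native[2:4, 4:6] changed.
native = np.arange(64, dtype=float).reshape(8, 8)
levels = {2: np.zeros((4, 4)), 4: np.zeros((2, 2))}
update_levels(native, levels, slice(2, 4), slice(4, 6))
```

The same window arithmetic would apply to the on-demand case in the third bullet, with the trigger being a user request rather than a chunk notification.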

This is critical for visualization performance and wider adoption of virtualization as a basis for a more modern data system.
