
Enable search and data file access by passing Dataset ID for PangaeaDataset #32

Merged: khider merged 1 commit into main from pangaea-search on Mar 12, 2026
doswal (Collaborator) commented Mar 12, 2026

This PR adds support for direct loading of PANGAEA datasets using Study IDs (numeric IDs or DOI strings), bringing the PANGAEA provider closer to the behavior of the NOAA provider.

Previously, PangaeaDataset required users to:

  • Run search_studies() with a query
  • Inspect the returned summary
  • Select a StudyID
  • Call get_data() to retrieve the dataset

Unlike NOAA, PANGAEA could not directly load a dataset from its StudyID without a prior search.

What’s New

This PR introduces:

search_studies(study_ids=...)

Users can now load datasets directly using:

ds.search_studies(study_ids=830586)
ds.search_studies(study_ids="10.1594/PANGAEA.830586")
ds.search_studies(study_ids=[830586, "830587"])

  • Accepts a single ID or a list of IDs
  • Accepts an int, a numeric string, or a DOI string
  • Normalizes IDs automatically
  • Registers studies directly without invoking PanQuery
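
The ID normalization described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the helper name `normalize_study_ids`, the DOI pattern, and the choice to normalize to `int` are all assumptions made for illustration.

```python
import re
from typing import List, Sequence, Union

StudyId = Union[int, str]

# Assumed DOI shape: anything ending in "PANGAEA.<digits>",
# e.g. "10.1594/PANGAEA.830586".
_DOI_SUFFIX = re.compile(r"PANGAEA\.(\d+)\s*$", re.IGNORECASE)


def normalize_study_ids(study_ids: Union[StudyId, Sequence[StudyId]]) -> List[int]:
    """Hypothetical helper: turn an int, a string, or a list of either into ints."""
    # Wrap a single value so the loop below always sees an iterable of IDs.
    if isinstance(study_ids, (int, str)):
        study_ids = [study_ids]

    normalized: List[int] = []
    for raw in study_ids:
        if isinstance(raw, bool):
            # bool is a subclass of int; reject it explicitly.
            raise TypeError(f"Unsupported study ID: {raw!r}")
        if isinstance(raw, int):
            normalized.append(raw)
            continue
        text = str(raw).strip()
        if text.isdigit():
            # Numeric string such as "830587".
            normalized.append(int(text))
            continue
        match = _DOI_SUFFIX.search(text)
        if match is None:
            raise ValueError(f"Cannot parse study ID: {raw!r}")
        normalized.append(int(match.group(1)))
    return normalized


ids = normalize_study_ids([830586, "830587", "10.1594/PANGAEA.830586"])
```

Raising on unparseable input is one possible design; the actual implementation may log and skip instead.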

get_data(study_id) Auto-Loading

get_data() now supports:

ds.get_data(830586)
ds.get_data([830586, 830587])

  • If a study is not registered, it is loaded automatically
  • If a study is a collection, a warning is logged and the study is skipped

get_data() always returns a list of pandas DataFrames, consistent with NOAA behavior.
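
The auto-loading and collection-skipping behavior can be illustrated with a toy model. This is a sketch under stated assumptions, not the PangaeaDataset implementation: `ToyDataset`, its `fetch` callable, the `studies` registry, and the IDs 1 and 2 are hypothetical, and plain Python lists stand in for the pandas DataFrames so the example has no dependencies.

```python
import logging
from typing import Callable, Dict, List, Union

logger = logging.getLogger("pangaea_sketch")


class ToyDataset:
    """Toy stand-in for PangaeaDataset (hypothetical, for illustration only)."""

    def __init__(self, fetch: Callable[[int], dict]):
        # fetch(study_id) is assumed to return
        # {"is_collection": bool, "table": <tabular data>}.
        self._fetch = fetch
        self.studies: Dict[int, dict] = {}

    def get_data(self, study_ids: Union[int, List[int]]) -> list:
        # Accept a single ID or a list, but always return a list.
        if isinstance(study_ids, int):
            study_ids = [study_ids]

        tables = []
        for sid in study_ids:
            # Auto-load: register the study on first use, no prior search needed.
            if sid not in self.studies:
                self.studies[sid] = self._fetch(sid)
            study = self.studies[sid]
            if study["is_collection"]:
                # Collections hold no data table of their own: warn and skip.
                logger.warning("Study %s is a collection; skipping.", sid)
                continue
            tables.append(study["table"])
        return tables


def fake_fetch(sid: int) -> dict:
    # Hypothetical backend: pretend study 2 is a collection.
    return {"is_collection": sid == 2, "table": [{"study": sid}]}


ds = ToyDataset(fake_fetch)
single = ds.get_data(1)
mixed = ds.get_data([1, 2])
```

In this toy, a skipped collection is still registered, so a later call does not fetch it again; whether the real class does the same is not stated in this PR.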

The get_data-related tests have also been updated to cover this workflow.

khider merged commit b79dd73 into main on Mar 12, 2026
1 check passed
khider deleted the pangaea-search branch on March 12, 2026 at 21:27