Add sameAs field to the Dandiset model #364

Merged
yarikoptic merged 4 commits into master from add-sameas on Feb 6, 2026

Conversation

@candleindark
Member

This PR closes #358 and replaces #361.

This PR defines the sameAs field in the Dandiset model. It is a list of DANDI URLs of the Dandiset at other DANDI instances. It implements the solution proposed by #358 (comment).
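To make the shape of a sameAs entry concrete, here is a minimal sketch built around the pattern quoted later in the review: `dandi://<instance>/<6-digit id>`, an optional `@draft` or `@<version>`, and an optional path. `INSTANCE_ID_PATTERN` and the `VERSION_NUM_PATTERN` value below are assumed stand-ins; the real `UNVENDORED_ID_PATTERN` and `VERSION_NUM_PATTERN` are defined in dandischema and may differ. `is_dandi_url` is a hypothetical helper name.

```python
import re

# Assumed stand-ins: the real UNVENDORED_ID_PATTERN and VERSION_NUM_PATTERN
# live in dandischema and are not shown on this page.
INSTANCE_ID_PATTERN = r"[A-Za-z][-A-Za-z0-9]*"
VERSION_NUM_PATTERN = r"\d+\.\d+\.\d+"

# Same overall shape as the pattern under review in this PR.
SAME_AS_RE = re.compile(
    rf"^dandi://{INSTANCE_ID_PATTERN}/\d{{6}}"
    rf"(@(draft|{VERSION_NUM_PATTERN}))?(/\S+)?$"
)


def is_dandi_url(value: str) -> bool:
    """Return True if value looks like a DANDI URL of the sketched shape."""
    return SAME_AS_RE.match(value) is not None
```

Under these assumptions, `dandi://dandi/001700` and `dandi://dandi/001700@draft` are accepted, while an `https://` URL or a dandiset ID that is not six digits is rejected.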

@codecov

codecov bot commented Jan 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.91%. Comparing base (be7d361) to head (419313b).
⚠️ Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #364      +/-   ##
==========================================
+ Coverage   97.89%   97.91%   +0.01%     
==========================================
  Files          18       18              
  Lines        2379     2401      +22     
==========================================
+ Hits         2329     2351      +22     
  Misses         50       50              
Flag Coverage Δ
unittests 97.91% <100.00%> (+0.01%) ⬆️

Contributor

Copilot AI left a comment

Pull request overview

Defines a new sameAs field on the Dandiset model to represent DANDI URIs pointing to the same dandiset on other DANDI instances (per #358 / replaces #361).

Changes:

  • Added sameAs (optional list) to the Dandiset model with JSON schema metadata.
  • Refactored existing publishing-metadata test to reuse a shared base_dandiset_metadata fixture.
  • Added tests covering sameAs behavior for omitted/empty/valid/invalid inputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
dandischema/models.py Adds the new sameAs field to the Dandiset schema.
dandischema/tests/test_models.py Introduces a shared metadata fixture and adds coverage for sameAs validation.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.



@candleindark candleindark force-pushed the add-sameas branch 2 times, most recently from 96d486c to 419313b Compare January 22, 2026 23:54
@candleindark candleindark requested a review from Copilot January 23, 2026 06:00
@candleindark candleindark marked this pull request as ready for review January 23, 2026 06:07
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


Member

@yarikoptic yarikoptic left a comment

Looks good to me.

@satra -- wdyt about this sameAs approach, where for now we reserve just the ability to point to other dandi:// instances etc., but in principle could later open it up to other URLs?

json_schema_extra={"readOnly": True, "nskey": "schema"},
)

sameAs: Annotated[
Member

note: I thought we had merged

already... apparently not, so later we would need to add handling of sameAs as well there

@yarikoptic
Member

and actually we can identify instances based on our detection in dandi-cli as well

❯ python -c 'from dandi.utils import get_instance; print(get_instance("https://dandiarchive.org/dandiset/001700)"))'
DandiInstance(name='api.dandiarchive.org', gui='https://dandiarchive.org', api='https://api.dandiarchive.org/api')
❯ python -c 'from dandi.utils import get_instance; print(get_instance("dandi"))'
DandiInstance(name='dandi', gui='https://dandiarchive.org', api='https://api.dandiarchive.org/api')

(not sure why first one did not return 'dandi' name BTW)

so it doesn't have to be a dandi:// url per se

@yarikoptic
Member

actually more specifically -- that function:

❯ python -c 'from dandi.dandiarchive import parse_dandi_url; print(parse_dandi_url("dandi://dandi/001700"))'
DandisetURL(instance=DandiInstance(name='dandi', gui='https://dandiarchive.org', api='https://api.dandiarchive.org/api'), dandiset_id='001700', version_id=None)
❯ python -c 'from dandi.dandiarchive import parse_dandi_url; print(parse_dandi_url("https://dandiarchive.org/dandiset/001700"))'
DandisetURL(instance=DandiInstance(name='dandi', gui='https://dandiarchive.org', api='https://api.dandiarchive.org/api'), dandiset_id='001700', version_id=None)

so we could point to any url dandi-cli understands and associate a
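The idea in this comment, accepting any URL form that resolves to a known instance rather than only `dandi://`, can be sketched without dandi-cli. The helper below is hypothetical and handles only the two URL shapes shown above; the host-to-instance table is an assumption containing just the one instance visible in this thread. In real code, `dandi.dandiarchive.parse_dandi_url` would do this work.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

# Hypothetical table: GUI host -> instance name (only the entry seen above).
GUI_HOST_TO_INSTANCE = {"dandiarchive.org": "dandi"}


class DandisetRef(NamedTuple):
    instance: str
    dandiset_id: str


def normalize_ref(url: str) -> Optional[DandisetRef]:
    """Map dandi://<instance>/<id> or https://<gui>/dandiset/<id> to one form."""
    parts = urlsplit(url)
    if parts.scheme == "dandi":
        m = re.fullmatch(r"/(\d{6})", parts.path)
        return DandisetRef(parts.netloc, m.group(1)) if m else None
    if parts.scheme == "https" and parts.netloc in GUI_HOST_TO_INSTANCE:
        m = re.fullmatch(r"/dandiset/(\d{6})", parts.path)
        if m:
            return DandisetRef(GUI_HOST_TO_INSTANCE[parts.netloc], m.group(1))
    # Unknown host or unsupported URL shape.
    return None
```

With this sketch, both spellings of dandiset 001700 normalize to the same `(instance, dandiset_id)` pair, so two sameAs entries could be compared regardless of which form was written.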

Comment on lines +1689 to +1690
rf"^dandi://{UNVENDORED_ID_PATTERN}/\d{{6}}"
rf"(@(draft|{VERSION_NUM_PATTERN}))?(/\S+)?$"
Member

the only comment i have is whether this should be a DANDI specific sameAs or anywhere. say someone puts the same dataset on zenodo, does that get to be added here? or in related resources?

Member

I would keep it constrained to just point to our instances, while coding tools defensively (could be anything), and indeed referring people to use related for extra resources. Although, we could potentially use sameAs to point to DataLad dandisets here... WDYT?
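"Coding tools defensively" could mean that a consumer never assumes every sameAs entry is a DANDI URL, even while the schema currently enforces it. A minimal sketch, with a hypothetical helper name and no dependency on dandischema:

```python
def split_same_as(entries):
    """Partition sameAs entries into dandi:// URLs and everything else."""
    dandi_urls, other = [], []
    for entry in entries or []:  # tolerate a missing (None) field
        is_dandi = isinstance(entry, str) and entry.startswith("dandi://")
        (dandi_urls if is_dandi else other).append(entry)
    return dandi_urls, other
```

A tool written this way keeps working if sameAs is later opened to other URLs: unknown entries land in the second list instead of raising an error.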

@yarikoptic
Member

@satra let's proceed?

@yarikoptic yarikoptic merged commit 07aa22c into master Feb 6, 2026
107 checks passed
@yarikoptic yarikoptic deleted the add-sameas branch February 6, 2026 21:30
yarikoptic added a commit that referenced this pull request Mar 4, 2026
…trings

- SIMPLE_DOWNGRADES: "0.6.11" -> "0.7.0" (no 0.6.11 was ever released)
- Add sameAs to downgrade fields (added on master via PR #364)
- Include str in empty-value check so releaseNotes="" is treated as empty
- Rewrite test to use DANDI_SCHEMA_VERSION and minimal metadata dict
  (0.6.11 was not in ALLOWED_INPUT_SCHEMAS, basic_publishmeta needed
  instance_name positional arg)
- Add test coverage for sameAs downgrade (empty list, non-empty list)

Co-Authored-By: Claude Code 2.1.63 / Claude Opus 4.6 <noreply@anthropic.com>
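The downgrade bullets in this commit message could be pictured as follows. This is a hypothetical sketch, not dandischema's code: it assumes a downgrade drops a newer field when its value is empty (None, an empty list or dict, or, per the third bullet, an empty string) and refuses when the field carries data. The field list and function name are illustrative.

```python
# Hypothetical: fields that an older schema version does not know about.
NEWER_FIELDS = ["sameAs", "releaseNotes"]


def drop_newer_fields(metadata: dict) -> dict:
    """Return a copy of metadata without empty newer fields; fail if any is set."""
    result = dict(metadata)
    for field in NEWER_FIELDS:
        if field not in result:
            continue
        value = result[field]
        # Including str here makes releaseNotes="" count as empty.
        is_empty = value is None or (
            isinstance(value, (list, dict, str)) and not value
        )
        if not is_empty:
            raise ValueError(f"Cannot downgrade: {field!r} is set")
        del result[field]
    return result
```

Under these assumptions, `sameAs=[]` is silently dropped, while a non-empty `sameAs` list makes the downgrade fail rather than lose information.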

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

add otherIdentifiers to Dandiset model

4 participants