@acwhite211
Member

@acwhite211 acwhite211 commented Jan 12, 2026

Fixes #7551
Fixes #7617
Fixes #7626

This PR adds Django model definitions, constraints, and migrations for several legacy join tables and related entities that were present in existing Specify databases but missing from freshly initialized Sp7 databases.

These changes are based on a systematic comparison between an existing production database schema and a newly created schema, with the goal of ensuring that new databases accurately reflect the constraints and relationships relied on by legacy data and workflows.

Here is an example diff between the schema dumps of an Sp6-created database and an Sp7-created database: https://www.diffchecker.com/qdMfXJCj/
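For anyone who wants to reproduce this kind of comparison locally, here is a minimal sketch, not the tooling actually used for this PR. It assumes both inputs are `--no-data` dumps in the usual `CREATE TABLE` layout with backtick-quoted identifiers. The function names (`extract_constraints`, `missing_constraints`) are made up for illustration, and the regexes ignore edge cases such as index prefix lengths.

```python
import re

# Assumes mariadb-dump/mysqldump style output, where each table body ends
# with a line that starts with ")".
CREATE_TABLE = re.compile(r"CREATE TABLE `(\w+)` \((.*?)\n\)", re.DOTALL)
PRIMARY_KEY = re.compile(r"PRIMARY KEY \(([^)]*)\)")
FOREIGN_KEY = re.compile(
    r"FOREIGN KEY \(([^)]*)\) REFERENCES `(\w+)` \(([^)]*)\)"
)


def _cols(raw):
    """Turn "`A`,`B`" into ("a", "b") so that column case does not matter."""
    return tuple(col.strip(" `").lower() for col in raw.split(","))


def extract_constraints(dump_sql):
    """Collect primary and foreign key constraints from a schema dump.

    Constraint names are ignored on purpose, since they are generated
    and can differ between two otherwise identical databases.
    """
    found = set()
    for table, body in CREATE_TABLE.findall(dump_sql):
        table = table.lower()
        for cols in PRIMARY_KEY.findall(body):
            found.add(("PK", table, _cols(cols)))
        for cols, ref_table, ref_cols in FOREIGN_KEY.findall(body):
            found.add(
                ("FK", table, _cols(cols), ref_table.lower(), _cols(ref_cols))
            )
    return found


def missing_constraints(reference_dump, new_dump):
    """Constraints present in the reference dump but absent from the new one."""
    return sorted(
        extract_constraints(reference_dump) - extract_constraints(new_dump)
    )
```

Calling `missing_constraints(sp6_sql, sp7_sql)` with the text of the two dump files returns a sorted list of tuples, one per constraint found only in the Sp6 dump.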

After comparing many schema dumps from databases created in Sp6 and in Sp7, the following differences turned up consistently:

16 Missing Foreign Key Constraints:

  • agent: FK (institutiontcid) -> institutionnetwork (institutionnetworkid)
  • collection: FK (institutionnetworkid) -> institutionnetwork (institutionnetworkid)
  • deaccessionpreparation: FK (createdbyagentid) -> agent (agentid)
  • deaccessionpreparation: FK (modifiedbyagentid) -> agent (agentid)
  • deaccessionpreparation: FK (deaccessionid) -> deaccession (deaccessionid)
  • deaccessionpreparation: FK (preparationid) -> preparation (preparationid)
  • project_colobj: FK (collectionobjectid) -> collectionobject (collectionobjectid)
  • project_colobj: FK (projectid) -> project (projectid)
  • sgrbatchmatchresultitem: FK (batchmatchresultsetid) -> sgrbatchmatchresultset (id) ON DELETE CASCADE
  • sgrbatchmatchresultset: FK (matchconfigurationid) -> sgrmatchconfiguration (id)
  • sp_schema_mapping: FK (spexportschemaid) -> spexportschema (spexportschemaid)
  • sp_schema_mapping: FK (spexportschemamappingid) -> spexportschemamapping (spexportschemamappingid)
  • specifyuser_spprincipal: FK (specifyuserid) -> specifyuser (specifyuserid)
  • specifyuser_spprincipal: FK (spprincipalid) -> spprincipal (spprincipalid)
  • spprincipal_sppermission: FK (sppermissionid) -> sppermission (sppermissionid)
  • spprincipal_sppermission: FK (spprincipalid) -> spprincipal (spprincipalid)

No missing unique constraints were found. In fact, Sp7-created databases had a few extra unique constraints compared to Sp6-created databases.

35 Missing or Changed Primary Key Constraints (mostly on tables not used in Sp7):

  • autonumsch_coll: PRIMARY KEY (collectionid, autonumberingschemeid)
  • autonumsch_div: PRIMARY KEY (divisionid, autonumberingschemeid)
  • autonumsch_dsp: PRIMARY KEY (disciplineid, autonumberingschemeid)
  • countryinfo: PRIMARY KEY (name)
  • deaccessionpreparation: PRIMARY KEY (deaccessionpreparationid)
  • dwcfish: PRIMARY KEY (dwcfishid)
  • dwcfishtissue: PRIMARY KEY (dwcfishtissueid)
  • dwckui: PRIMARY KEY (dwckuiid)
  • dwckuit: PRIMARY KEY (dwckuitid)
  • fishportalmapping: PRIMARY KEY (fishportalmappingid)
  • fwriportalmapping: PRIMARY KEY (fwriportalmappingid)
  • geoname: PRIMARY KEY (geonameid)
  • ios_colobjagents: PRIMARY KEY (oldid)
  • ios_colobjbio: PRIMARY KEY (oldid)
  • ios_colobjchron: PRIMARY KEY (oldid)
  • ios_colobjcnts: PRIMARY KEY (oldid)
  • ios_colobjgeo: PRIMARY KEY (oldid)
  • ios_colobjlitho: PRIMARY KEY (oldid)
  • ios_geogeo_cnt: PRIMARY KEY (oldid)
  • ios_geogeo_cty: PRIMARY KEY (oldid)
  • ios_geoloc: PRIMARY KEY (oldid)
  • ios_geoloc_cnt: PRIMARY KEY (oldid)
  • ios_geoloc_cty: PRIMARY KEY (oldid)
  • ios_taxon_pid: PRIMARY KEY (oldid)
  • project_colobj: PRIMARY KEY (projectid, collectionobjectid)
  • sgrbatchmatchresultitem: PRIMARY KEY (id)
  • sgrbatchmatchresultset: PRIMARY KEY (id)
  • sgrmatchconfiguration: PRIMARY KEY (id)
  • sp_schema_mapping: PRIMARY KEY (spexportschemamappingid, spexportschemaid)
  • specifyuser_spprincipal: PRIMARY KEY (specifyuserid, spprincipalid)
  • spprincipal_sppermission: PRIMARY KEY (sppermissionid, spprincipalid)
  • spstynthy: PRIMARY KEY (spstynthyid)
  • taxa2id: PRIMARY KEY (idtaxa2id)
  • tissue_web_search: PRIMARY KEY (tissue_web_searchid)
  • voucher_web_search: PRIMARY KEY (voucher_web_searchid)

The following tables were identified in the schema diff but were not added in this PR because they are legacy, client-specific, or no longer used by current Specify workflows:

  • countryinfo
  • geoname
  • dwcfish
  • dwcfishtissue
  • dwckui
  • dwckuit
  • fishportalmapping
  • fwriportalmapping
  • spstynthy
  • tissue_web_search
  • voucher_web_search
  • ios_colobj*
  • ios_geogeo*
  • ios_geoloc*
  • ios_taxon_pid

These tables do not appear to be required for new database creation or normal application operation, but let me know if any of them should be added.

Checklist

  • Self-review the PR after opening it to make sure the changes look good and
    self-explanatory (or properly documented)
  • Add relevant issue to release milestone
  • Add PR to documentation list
  • Add automated tests
  • Add a reverse migration if a migration is present in the PR

Testing instructions

  • Follow the process to create a new database in Specify 7 and confirm that it completes without errors.
  • Check the schema dump of the newly created database to confirm that it contains all the new schema created in this PR:
    mariadb-dump -uroot -proot --no-data db_name > dbname_schema.sql
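The second check can be partly automated. The snippet below is a rough sketch rather than part of this PR: `tables_missing_from_dump` is a hypothetical helper, and the table names in `EXPECTED_TABLES` are examples copied from the lists above, not verified against the PR's migrations, so adjust them to match what was actually added.

```python
import re

# Example names taken from the PR description; this list is an assumption
# and should be edited to match the tables the migrations really create.
EXPECTED_TABLES = [
    "deaccessionpreparation",
    "project_colobj",
    "sp_schema_mapping",
    "specifyuser_spprincipal",
    "spprincipal_sppermission",
]


def tables_missing_from_dump(dump_sql, expected=EXPECTED_TABLES):
    """Return the expected table names with no CREATE TABLE in the dump."""
    present = {
        name.lower()
        for name in re.findall(r"CREATE TABLE `(\w+)`", dump_sql)
    }
    return [table for table in expected if table not in present]
```

Reading `dbname_schema.sql` into a string and passing it to `tables_missing_from_dump` should return an empty list when every expected table is present. It only checks that the tables exist, so the constraints themselves still need a look.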

@acwhite211
Member Author

I've added fixes for all of the unit tests that were failing in main. Currently, some unit tests are still failing specifically because the new tables added from Specify 6 have multi-field primary keys. This is causing issues with our Django, datamodel, and SQLAlchemy code. I'm working through different options for workarounds.

@acwhite211 acwhite211 marked this pull request as ready for review January 27, 2026 14:38
@acwhite211
Member Author

Some of the new tables that need to be added from Sp6 have multi-field primary keys, which is causing a lot of issues for our datamodel-to-SQLAlchemy model generation. I created a solution where these tables are present in the Django models but are skipped when the SQLAlchemy models are generated, so we can go ahead and finish this issue. We can worry about the SQLAlchemy version of these models another time.
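To illustrate the idea only: the sketch below is not the code in this PR, and `TableDef` and `split_for_sqlalchemy` are hypothetical stand-ins for whatever the datamodel actually exposes. It assumes each table definition knows its primary key columns, and it sets aside any table whose key spans more than one column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableDef:
    """Hypothetical, simplified stand-in for a datamodel table entry."""
    name: str
    primary_key: tuple  # column names that make up the primary key


def split_for_sqlalchemy(tables):
    """Separate tables to generate from tables to skip.

    Tables with a multi-field primary key go into the skipped list, so
    the caller can log them instead of failing model generation.
    """
    generate, skipped = [], []
    for table in tables:
        if len(table.primary_key) > 1:
            skipped.append(table)
        else:
            generate.append(table)
    return generate, skipped
```

Returning the skipped tables, rather than dropping them silently, leaves a record of which models still need SQLAlchemy support later.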
