Tool for manipulating ro-crate file #1

Open
Alex-1919 wants to merge 41 commits into main from Alex
@Alex-1919 Alex-1919 commented Aug 4, 2025

This is the preliminary version of the tool, which provides the basic functionality for manipulating RO-Crate metadata files with validation. Still to be patched in later: part of the MUST validations, the generator for preview.html, notes/guides on using the tool to uphold the RO-Crate standard, and a more complete README with the PHPDoc. To run the unit tests, run the command docker-compose run test. To run the index.php code, which creates the example ro-crate-metadata.json and ro-crate-preview.html in the resources directory based on example dataset 102736, run the command docker-compose up -d --build and refresh the page http://localhost:8080/.

Summary by CodeRabbit

  • New Features

    • Full RO‑Crate toolkit: create/load/validate/save crates, JSON read/write, flatten/unflatten utilities, in-memory JSON container, static HTML preview generator, CLI/demo scripts, and GigaDB example workflows.
  • Documentation

    • Major README overhaul, new Guide, CHANGELOG, and usage/install instructions including HTML rendering and examples.
  • Tests

    • Comprehensive PHPUnit suite covering entities, JSON utilities, preview generation, and validators.
  • Chores

    • Composer, Docker and CI/test configs, coding-standard setup, assets, resources, and .gitignore updates.

Wai Lok Tsang added 16 commits June 16, 2025 15:27
…tests, enable adding and removal of a single pair with chaining, removed Person class from main code instead use generic class (Person class only remains for convenience during testing)
…e, and enable the chaining of adding and removal of entities of a crate in addition the previously enabled chaining of adding and removal of properties of an entity.
…to further abstract away the details of the formatting. Fixed the ISO DateTime validation
@pli888 pli888 moved this to Ready For Review in Backlog: Pull Requests Aug 11, 2025
@rija rija requested review from kencho51, pli888 and rija August 11, 2025 16:05
rija commented Aug 12, 2025

@coderabbitai summary

coderabbitai bot commented Aug 12, 2025

✅ Actions performed

Summary regeneration triggered.

coderabbitai bot commented Aug 12, 2025

Walkthrough

Adds project scaffolding (Composer, Docker, PHPCS, PHPUnit), a new ROCrate PHP library (entity model, crate logic, preview generator), JSON utilities and exceptions, many RO‑Crate resource/asset files, an example entry script, and extensive PHPUnit tests. No breaking changes to existing public APIs.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Repository config & docs | composer.json, phpunit.xml, phpcs.xml, README.md, Guide.md, CHANGELOG.md, .gitignore | Add Composer package metadata, runtime and dev dependencies, PSR‑4 autoloading, scripts; add PHPUnit and PHPCS configs; add CHANGELOG and Guide; expand README; update .gitignore with .vscode, vendor, .phpunit.result.cache. |
| Docker | docker-compose.yml, docker/Dockerfile, docker/php.ini | Add docker-compose services (php-app, test), Dockerfile for PHP 8.2 + Apache + Composer + Xdebug, and docker/php.ini Xdebug settings. |
| Core ROCrate library | src/rocrate/* | Introduce core classes: abstract Entity, ContextualEntity, DataEntity, Descriptor, concrete entities (Dataset, File, Person, Organization, Place, ContactPoint, Publication), JsonData, main ROCrate class (load/save/validate/validators) and ROCratePreviewGenerator. |
| JSON utilities & exceptions | src/Json/*, src/exceptions/* | Add Json\FileHandler (read/write with JsonFileException), Flattener, Unflattener, and exceptions JsonFileException and ROCrateException. |
| Example & entry script | src/index.php | Add an example/entry script demonstrating JSON read/flatten/unflatten, preview generation, and a comprehensive GigaDB crate construction example. |
| Resources & assets | resources/*, assets/* | Add multiple RO‑Crate JSON-LD manifests (v1.1 & v1.2), generated outputs, static preview HTML artifacts, and sample metadata files (e.g., resources/ro-crate-metadata*.json, resources/output.json, resources/ro-crate-preview*.html, assets/ro-crate-*.json, assets/ro-crate-preview.html). |
| Tests | tests/* | Add PHPUnit test suites covering JsonData, FileHandler, Flattener/Unflattener, entity types (File, Dataset, Person, Organization, Place, ContactPoint, Publication), Descriptor/DataEntity/ContextualEntity, ROCrate behavior and validators, and the preview generator. |
| Static assets | assets/ro-crate-metadata.json, assets/ro-crate-preview.html | Add alternative asset copies of RO‑Crate manifests and a static preview HTML page for dataset 102736. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Developer
  participant FS as Filesystem
  participant RC as ROCrate
  participant JH as Json\FileHandler
  participant PG as ROCratePreviewGenerator

  Developer->>RC: __construct(directory, loadExisting?, attached?, preview?)
  alt load existing
    RC->>FS: read `ro-crate-metadata.json`
    RC->>RC: loadMetadata() → build descriptor, root, entities
  else new crate
    RC->>RC: initializeNewCrate()
  end
  Developer->>RC: addEntity() / addPropertyPair(...)
  Developer->>JH: readJsonFile(path)
  JH-->>Developer: parsed array or throw JsonFileException
  Developer->>RC: validate()
  RC-->>Developer: errors[]
  Developer->>RC: save(path?, prefix?)
  RC->>FS: write `ro-crate-metadata-out.json` (and optional preview HTML)
  Developer->>PG: generatePreview(directory)
  PG->>FS: read `ro-crate-metadata.json`
  PG->>PG: indexEntities(), findRootEntity(), generateHTML()
  PG->>FS: write `ro-crate-preview.html`
  PG-->>Developer: completion / status

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I twitched my nose at JSON keys,
I hopped through crates and nested trees.
I flattened dots and stitched them back,
Built previews, tests, docs in a stack.
A happy rabbit applauds the pack. 🐇

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title "Tool for manipulating ro-crate file" accurately reflects the primary purpose of the changeset (adding a PHP tool/library and utilities to create, modify, and preview RO‑Crate metadata) and is concise and relevant for a reviewer scanning history. It therefore meets the criteria for a passing title check. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

@rija rija left a comment

Hi @Alex-1919,

Below is an initial pass at reviewing this PR focusing on repo/coding related observations.
We will make more passes at reviewing later focusing on other aspects.

Our review comments will be adhering to the reviewing standard from: https://conventionalcomments.org/

praise: Code organisation looks good and good use of PHP namespaces
praise: Function signatures are type-hinted and have PHPDocs

issue: .vscode should not be in the repository as it is specific to your development environment
suggestion: Delete the directory from git (git rm -f ...) and add .vscode to .gitignore

issue: vendor directory should not be in the repository as those dependencies are not part of the project and can be reinstalled by developers
suggestion: Delete the directory from git (git rm -f ...) and add vendor to .gitignore (not /vendor/ like you wrote)
(you will need to run composer install to create the vendor directory for your specific checkout)

note: composer.lock and composer.json should stay part of the project, as they are what allow the rest of the team to run this project with the same dependencies, at the same versions, after they run composer install.

issue: .phpunit.result.cache should not be in the repository
suggestion: Delete the file from git (git rm -f ...) and add .phpunit.result.cache to .gitignore (not /.phpunit.result.cache like you wrote)

issue: The code does not conform to the PSR-12 coding standard we use in the team: https://www.php-fig.org/psr/psr-12/
suggestion: Install PHPCS: https://github.com/PHPCSStandards/PHP_CodeSniffer and run it on the src and tests directories of this PR

$ phpcs --standard=PSR12 src
$ phpcs --standard=PSR12 tests

Fix the errors and warnings listed. The phpcbf command, which is part of PHP_CodeSniffer, can auto-fix some of them.

issue: in composer.json the package name and licence are not correct
suggestion: use gigascience/ro-crate-php for name: and Apache-2.0 for license:

suggestion: you should add a “test” container service to docker-compose.yml, so that we can run the tests easily with:

$ docker-compose run --rm test

praise: Good test coverage

@github-project-automation github-project-automation bot moved this from Ready For Review to Changes Required in Backlog: Pull Requests Aug 12, 2025
@kencho51
Hi @Alex-1919

It is also worth looking into https://www.conventionalcommits.org/en/v1.0.0/; we use it for our commit messages.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
assets/ro-crate-preview.html (4)

204-218: Normalize section ids that begin with “#” and update inbound links.

HTML id attributes should not include a leading “#”; keep “#” only in href. Current pattern forces brittle href="##...".

Apply this diff to normalize three sections here; similar normalization is needed for all other sections whose id currently starts with “#” (see Lines 213-215, 225-227, 246-251, 252-254, 270-311, 294-306):

-            <section id="#oneExtra">
+            <section id="oneExtra">
@@
-                    <ul><li><span class="property"> @id </span>: #oneExtra</li><li><span class="property"> @type </span>: PropertyValue</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Extra Information</li><li><span class="property">value <a href=http://schema.org/value> [?] </a> </span>: Data Type: Readme, File Attributes: MD5 checksum: 450ef019cf8ba58beb644ef18d1411d0</li></ul>                </section>
+                    <ul><li><span class="property"> @id </span>: #oneExtra</li><li><span class="property"> @type </span>: PropertyValue</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Extra Information</li><li><span class="property">value <a href=http://schema.org/value> [?] </a> </span>: Data Type: Readme, File Attributes: MD5 checksum: 450ef019cf8ba58beb644ef18d1411d0</li></ul>                </section>
-            <section id="#twoExtra">
+            <section id="twoExtra">
@@
-                    <ul><li><span class="property"> @id </span>: #twoExtra</li><li><span class="property"> @type </span>: PropertyValue</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Extra Information</li><li><span class="property">value <a href=http://schema.org/value> [?] </a> </span>: Data Type: Tabular data, File Attributes: MD5 checksum: 97ee210d263c783e4ddfe20352831d60 Figure in MS: 3</li></ul>                </section>
+                    <ul><li><span class="property"> @id </span>: #twoExtra</li><li><span class="property"> @type </span>: PropertyValue</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Extra Information</li><li><span class="property">value <a href=http://schema.org/value> [?] </a> </span>: Data Type: Tabular data, File Attributes: MD5 checksum: 97ee210d263c783e4ddfe20352831d60 Figure in MS: 3</li></ul>                </section>
-            <section id="#other-files">
+            <section id="other-files">
@@
-                    <ul><li><span class="property"> @id </span>: #other-files</li><li><span class="property"> @type </span>: Dataset</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: other files</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: This dataset contains too many files that are not individually described</li></ul>                </section>
+                    <ul><li><span class="property"> @id </span>: #other-files</li><li><span class="property"> @type </span>: Dataset</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: other files</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: This dataset contains too many files that are not individually described</li></ul>                </section>

Follow-up: Update all inbound links to these sections to use href="#oneExtra", href="#twoExtra", href="#other-files" (see Lines 194, 197, and within the Root section).


194-197: Update exifData links to normalized fragment ids.

After normalizing section ids (Line 204+), fix these hrefs accordingly and drop “##”.

Apply this diff:

-                    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt </a></li><li><span class="property"> @type </span>: File</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: readme_102736.txt</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 9.30 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/txt</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href=##oneExtra> Extra Information </a></li></ul>                </section>
+                    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt </a></li><li><span class="property"> @type </span>: File</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: readme_102736.txt</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 9.30 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/txt</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href="#oneExtra"> Extra Information </a></li></ul>                </section>
@@
-                    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv </a></li><li><span class="property"> @type </span>: File</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: boostnano_no_dorado_R1_tails.csv</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 317.24 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/csv</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: PolyA tail lengths as found by Boostnano for R1 sequins which were filtered out by Dorado but kept by Boostnano; underlying data for figure 3</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href=##twoExtra> Extra Information </a></li></ul>                </section>
+                    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv </a></li><li><span class="property"> @type </span>: File</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: boostnano_no_dorado_R1_tails.csv</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 317.24 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/csv</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: PolyA tail lengths as found by Boostnano for R1 sequins which were filtered out by Dorado but kept by Boostnano; underlying data for figure 3</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href="#twoExtra"> Extra Information </a></li></ul>                </section>

204-218: Apply the “strip leading # from id” rule across all sections.

Beyond Lines 204–218, several other sections still use id="#...". Normalize consistently and update inbound hrefs accordingly.

If helpful, I can provide a patch to the generator (src/rocrate/ROCratePreviewGenerator.php) to:

  • Strip a leading “#” when rendering HTML id,
  • Preserve the original @id value inside the content body,
  • Emit href="#" followed by the normalized id, consistently.
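
The three points above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names htmlIdFor, fragmentHrefFor and renderSection are hypothetical and are not taken from src/rocrate/ROCratePreviewGenerator.php, whose actual structure is not shown in this thread.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: derive an HTML id from an RO-Crate @id by
 * dropping one leading "#". Other values (e.g. full URLs) pass through.
 */
function htmlIdFor(string $entityId): string
{
    return str_starts_with($entityId, '#') ? substr($entityId, 1) : $entityId;
}

/**
 * Hypothetical helper: build the in-page link target for an entity,
 * always exactly one "#" followed by the normalized id.
 */
function fragmentHrefFor(string $entityId): string
{
    return '#' . htmlIdFor($entityId);
}

/**
 * Hypothetical renderer: the id attribute is normalized, while the
 * @id shown in the content body keeps its original value.
 */
function renderSection(string $entityId, string $itemsHtml): string
{
    $id = htmlspecialchars(htmlIdFor($entityId), ENT_QUOTES);
    $shown = htmlspecialchars($entityId, ENT_QUOTES);

    return '<section id="' . $id . '"><ul>'
        . '<li><span class="property"> @id </span>: ' . $shown . '</li>'
        . $itemsHtml
        . '</ul></section>';
}
```

With these helpers, an entity whose @id is #oneExtra would get id="oneExtra" on its section and href="#oneExtra" on inbound links, while URL-style ids are left unchanged.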

Confirm after changes:

  • No remaining href="##..." in the output,
  • All href fragments resolve to an existing section id.
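
Both confirmations could be automated with a small check along these lines. It is a sketch only: findBrokenFragments is a hypothetical name, and it assumes PHP 8 with the DOM extension enabled.

```php
<?php

declare(strict_types=1);

/**
 * Sketch of a post-generation check (hypothetical function name).
 * Reports "##" links and fragment links whose target id does not
 * exist in the page.
 *
 * @return string[] One message per problem found.
 */
function findBrokenFragments(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings (e.g. about HTML5 tags) instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Collect every id declared in the page.
    $ids = [];
    foreach ($xpath->query('//*[@id]') as $node) {
        $ids[$node->getAttribute('id')] = true;
    }

    $problems = [];
    foreach ($xpath->query('//a[@href]') as $link) {
        $href = $link->getAttribute('href');
        if (!str_starts_with($href, '#')) {
            continue; // external link, not an in-page fragment
        }
        if (str_starts_with($href, '##')) {
            $problems[] = "double-hash link: $href";
            continue;
        }
        if (!isset($ids[substr($href, 1)])) {
            $problems[] = "unresolved fragment: $href";
        }
    }

    return $problems;
}
```

An empty result from findBrokenFragments(file_get_contents('assets/ro-crate-preview.html')) would indicate that both conditions hold for that file.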

Also applies to: 213-215, 225-227, 246-251, 252-254, 270-311, 294-306


185-187: Fix malformed list markup and broken “##” anchors in Root entity.

Current markup mixes stray commas and closing tags outside list items and uses href=##... anchors. This breaks HTML semantics and navigation.

Replace the whole Root Data Entity block with well-formed markup (anchors normalized to fragments that actually exist in the page):

-            <section id="https://gigadb.org/dataset/102736">
-                <h2>Root Data Entity</h2>
-                <ul><li><span class="property"> @id </span>: <a href=https://gigadb.org/dataset/102736> https://gigadb.org/dataset/102736 </a></li><li><span class="property"> @type </span>: Dataset</li><li><span class="property">identifier <a href=http://schema.org/identifier> [?] </a> </span>: <a href=#https://doi.org/10.5524/102736> https://doi.org/10.5524/102736 </a></li><li><span class="property">cite-as <a href=http://www.iana.org/assignments/relation/cite-as> [?] </a> </span>: <a href=#https://doi.org/10.5524/102736> https://doi.org/10.5524/102736 </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Supporting data for &quot;Using synthetic RNA to benchmark poly(A) length inference from direct RNA sequencing.&quot;</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Polyadenylation is a dynamic process which is important in cellular physiology. Oxford Nanopore Technologies direct RNA-sequencing provides a strategy for sequencing the full-length RNA molecule and analysis of the transcriptome and epi-transcriptome. There are currently several tools available for poly(A) tail-length estimation, including well-established tools such as tailfindr and nanopolish, as well as two more recent deep learning models: Dorado and BoostNano. However, there has been limited benchmarking of the accuracy of these tools against gold-standard datasets. In this paper we evaluate four poly(A) estimation tools using synthetic RNA standards (Sequins), which have known poly(A) tail-lengths and provide a valuable approach to measuring the accuracy of poly(A) tail-length estimation. All four tools generate mean tail-length estimates which lie within 12% of the correct value. 
Overall, Dorado is recommended as the preferred approach due to its relatively fast run times, low coefficient of variation and ease of use with integration with base-calling.</li><li><span class="property">datePublished <a href=http://schema.org/datePublished> [?] </a> </span>: 2025-07-29</li><li><span class="property">sdDatePublished <a href=http://schema.org/sdDatePublished> [?] </a> </span>: 2025-07-29</li><li><span class="property">publisher <a href=http://schema.org/publisher> [?] </a> </span>: <a href=#https://gigadb.org/> GigaScience DataBase </a></li><li><span class="property">sdPublisher <a href=http://schema.org/sdPublisher> [?] </a> </span>: <a href=#https://gigadb.org/> GigaScience DataBase </a></li><li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href=#https://creativecommons.org/publicdomain/zero/1.0/> Creative Commons Zero v1.0 Universal </a></li><li><span class="property">thumbnail <a href=http://schema.org/thumbnail> [?] </a> </span>: <a href=https://assets.gigadb-cdn.net/live/images/datasets/32d9369e-500d-5347-8842-9fe46cdc3693/102736.png> https://assets.gigadb-cdn.net/live/images/datasets/32d9369e-500d-5347-8842-9fe46cdc3693/102736.png </a></li><li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href=#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> readme_102736.txt </a></li>,  <a href=#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv> boostnano_no_dorado_R1_tails.csv </a></li>,  <a href=#https://gigadb.org/dataset/view/id/102736/Files_page/4> BoostNano-master </a></li>,  <a href=##other-files> other files </a></li><li><span class="property">author <a href=http://schema.org/author> [?] 
</a> </span>: <a href=#https://orcid.org/0000-0001-9083-6757> Chang JJ </a></li>,  <a href=##Xuan_Yang> Yang X </a></li>,  <a href=#https://orcid.org/0000-0003-0337-8722> Teng H </a></li>,  <a href=##Benjamin_Reames> Reames B </a></li>,  <a href=#https://orcid.org/0000-0003-1155-0959> Corbin V </a></li>,  <a href=#https://orcid.org/0000-0002-4300-455X> Coin LJM </a></li><li><span class="property">citation <a href=http://schema.org/citation> [?] </a> </span>: <a href=#https://doi.org/10.5524/100425> https://doi.org/10.5524/100425 </a></li><li><span class="property">funder <a href=http://schema.org/funder> [?] </a> </span>: <a href=#https://ror.org/011kf5r70> National Health and Medical Research Council </a></li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href=##datasetTypes> Dataset type </a></li>,  <a href=##additionalInfo1> Additional information </a></li>,  <a href=##additionalInfo2> Additional information </a></li>,  <a href=##additionalInfo3> Additional information </a></li>,  <a href=##additionalInfo4> Additional information </a></li>,  <a href=##additionalInfo5> Additional information </a></li>,  <a href=##additionalInfo6> Additional information </a></li>,  <a href=##additionalInfo7> Additional information </a></li>,  <a href=##additionalInfo8> Additional information </a></li>,  <a href=##githubLink1> Github links </a></li>,  <a href=##githubLink2> Github links </a></li>,  <a href=##githubLink3> Github links </a></li>,  <a href=##githubLink4> Github links </a></li>,  <a href=##accessions> Accessions (data not in GigaDB) </a></li>,  <a href=##history> History </a></li><li><span class="property">keywords <a href=http://schema.org/keywords> [?] </a> </span>: oxford nanopore technologies, poly(a) tail, estimation, segmentation, direct rna sequencing</li><li><span class="property">about <a href=http://schema.org/about> [?] 
</a> </span>: <a href=#https://nanoporetech.com/> oxford nanopore technologies </a></li></ul>            </section>
+            <section id="https://gigadb.org/dataset/102736">
+                <h2>Root Data Entity</h2>
+                <ul>
+                  <li><span class="property"> @id </span>: <a href=https://gigadb.org/dataset/102736> https://gigadb.org/dataset/102736 </a></li>
+                  <li><span class="property"> @type </span>: Dataset</li>
+                  <li><span class="property">identifier <a href=http://schema.org/identifier> [?] </a> </span>: <a href=#https://doi.org/10.5524/102736> https://doi.org/10.5524/102736 </a></li>
+                  <li><span class="property">cite-as <a href=http://www.iana.org/assignments/relation/cite-as> [?] </a> </span>: <a href=#https://doi.org/10.5524/102736> https://doi.org/10.5524/102736 </a></li>
+                  <li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Supporting data for &quot;Using synthetic RNA to benchmark poly(A) length inference from direct RNA sequencing.&quot;</li>
+                  <li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Polyadenylation is a dynamic process which is important in cellular physiology. Oxford Nanopore Technologies direct RNA-sequencing provides a strategy for sequencing the full-length RNA molecule and analysis of the transcriptome and epi-transcriptome. There are currently several tools available for poly(A) tail-length estimation, including well-established tools such as tailfindr and nanopolish, as well as two more recent deep learning models: Dorado and BoostNano. However, there has been limited benchmarking of the accuracy of these tools against gold-standard datasets. In this paper we evaluate four poly(A) estimation tools using synthetic RNA standards (Sequins), which have known poly(A) tail-lengths and provide a valuable approach to measuring the accuracy of poly(A) tail-length estimation. All four tools generate mean tail-length estimates which lie within 12% of the correct value. Overall, Dorado is recommended as the preferred approach due to its relatively fast run times, low coefficient of variation and ease of use with integration with base-calling.</li>
+                  <li><span class="property">datePublished <a href=http://schema.org/datePublished> [?] </a> </span>: 2025-07-29</li>
+                  <li><span class="property">sdDatePublished <a href=http://schema.org/sdDatePublished> [?] </a> </span>: 2025-07-29</li>
+                  <li><span class="property">publisher <a href=http://schema.org/publisher> [?] </a> </span>: <a href=#https://gigadb.org/> GigaScience DataBase </a></li>
+                  <li><span class="property">sdPublisher <a href=http://schema.org/sdPublisher> [?] </a> </span>: <a href=#https://gigadb.org/> GigaScience DataBase </a></li>
+                  <li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href=#https://creativecommons.org/publicdomain/zero/1.0/> Creative Commons Zero v1.0 Universal </a></li>
+                  <li><span class="property">thumbnail <a href=http://schema.org/thumbnail> [?] </a> </span>: <a href=https://assets.gigadb-cdn.net/live/images/datasets/32d9369e-500d-5347-8842-9fe46cdc3693/102736.png> https://assets.gigadb-cdn.net/live/images/datasets/32d9369e-500d-5347-8842-9fe46cdc3693/102736.png </a></li>
+                  <li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>:
+                    <a href=#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> readme_102736.txt </a>,
+                    <a href=#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv> boostnano_no_dorado_R1_tails.csv </a>,
+                    <a href=#https://gigadb.org/dataset/view/id/102736/Files_page/4> BoostNano-master </a>,
+                    <a href="#other-files"> other files </a>
+                  </li>
+                  <li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>:
+                    <a href=#https://orcid.org/0000-0001-9083-6757> Chang JJ </a>,
+                    <a href="#Xuan_Yang"> Yang X </a>,
+                    <a href=#https://orcid.org/0000-0003-0337-8722> Teng H </a>,
+                    <a href="#Benjamin_Reames"> Reames B </a>,
+                    <a href=#https://orcid.org/0000-0003-1155-0959> Corbin V </a>,
+                    <a href=#https://orcid.org/0000-0002-4300-455X> Coin LJM </a>
+                  </li>
+                  <li><span class="property">citation <a href=http://schema.org/citation> [?] </a> </span>: <a href=#https://doi.org/10.5524/100425> https://doi.org/10.5524/100425 </a></li>
+                  <li><span class="property">funder <a href=http://schema.org/funder> [?] </a> </span>: <a href=#https://ror.org/011kf5r70> National Health and Medical Research Council </a></li>
+                  <li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>:
+                    <a href="#datasetTypes"> Dataset type </a>,
+                    <a href="#additionalInfo1"> Additional information </a>,
+                    <a href="#additionalInfo2"> Additional information </a>,
+                    <a href="#additionalInfo3"> Additional information </a>,
+                    <a href="#additionalInfo4"> Additional information </a>,
+                    <a href="#additionalInfo5"> Additional information </a>,
+                    <a href="#additionalInfo6"> Additional information </a>,
+                    <a href="#additionalInfo7"> Additional information </a>,
+                    <a href="#additionalInfo8"> Additional information </a>,
+                    <a href="#githubLink1"> Github links </a>,
+                    <a href="#githubLink2"> Github links </a>,
+                    <a href="#githubLink3"> Github links </a>,
+                    <a href="#githubLink4"> Github links </a>,
+                    <a href="#accessions"> Accessions (data not in GigaDB) </a>,
+                    <a href="#history"> History </a>
+                  </li>
+                  <li><span class="property">keywords <a href=http://schema.org/keywords> [?] </a> </span>: oxford nanopore technologies, poly(a) tail, estimation, segmentation, direct rna sequencing</li>
+                  <li><span class="property">about <a href=http://schema.org/about> [?] </a> </span>: <a href=#https://nanoporetech.com/> oxford nanopore technologies </a></li>
+                </ul>
+            </section>
🧹 Nitpick comments (4)
assets/ro-crate-preview.html (4)

210-212: Tidy encodingFormat + PRONOM links (remove dangling tags).

The two encodingFormat values are split by an extra closing </li> and a stray comma, which leaves the PRONOM link outside its list item. Keep both values within a single list item.

Apply this diff:

-                    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip </a></li><li><span class="property"> @type </span>: DataDownload</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li>,  <a href=https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263> https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263 </a></li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 2.44 MB</li></ul>                </section>
+                    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip </a></li><li><span class="property"> @type </span>: DataDownload</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip, <a href=https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263> https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263 </a></li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 2.44 MB</li></ul>                </section>
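
If the preview is generated rather than hand-edited, the same fix can be made at the source by joining all values of a property before the closing tag is written. The following is only a sketch under that assumption: `PropertyValue`, `escapeHtml` and `renderPropertyItem` are hypothetical names for illustration, not functions from this PR.

```typescript
// Hypothetical value type: plain text, or a link with a target and a label.
type PropertyValue = string | { href: string; label: string };

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Render one property as exactly one <li>: the values are joined with ", "
// first and the closing tag is appended once, so no value can end up
// outside the list item.
function renderPropertyItem(name: string, values: PropertyValue[]): string {
  const rendered = values.map((value) =>
    typeof value === "string"
      ? escapeHtml(value)
      : `<a href="${escapeHtml(value.href)}">${escapeHtml(value.label)}</a>`
  );
  return `<li><span class="property">${escapeHtml(name)}</span>: ${rendered.join(", ")}</li>`;
}
```

Quoting the `href` attribute, as this sketch does, also avoids the unquoted attribute values seen elsewhere in the generated markup.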

148-167: Add missing “Back to top” control element (CSS exists, no element).

You define .back-to-top but don’t render the element or a #top target.

Apply this diff to render it and provide a target:

@@
     <body>
+        <a id="top"></a>
@@
-        <header>
+        <header>
@@
-        <footer>
+        <a class="back-to-top" href="#top" aria-label="Back to top" title="Back to top">↑</a>
+        <footer>

Optional: add a small script to show/hide on scroll; for a static page, always-visible is acceptable. Verify keyboard focusability and contrast.

Also applies to: 178-181, 312-317
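
The optional show/hide behaviour could look roughly like the sketch below. It assumes the `.back-to-top` element from the diff above exists; the 300 px threshold and the function name `shouldShowBackToTop` are illustrative choices, not part of this PR.

```typescript
// Hypothetical threshold: show the control after 300 px of vertical scroll.
const SHOW_AFTER_PX = 300;

// Pure decision function, kept separate from the DOM so it can be tested.
function shouldShowBackToTop(scrollY: number, threshold: number = SHOW_AFTER_PX): boolean {
  return scrollY > threshold;
}

// Browser-only wiring; skipped when no document exists (e.g. under Node).
const root = globalThis as any;
if (root.document) {
  const control = root.document.querySelector(".back-to-top");
  if (control) {
    const update = (): void => {
      // visibility: hidden also takes the link out of the keyboard tab order.
      control.style.visibility = shouldShowBackToTop(root.scrollY) ? "visible" : "hidden";
    };
    root.addEventListener("scroll", update, { passive: true });
    update(); // set the initial state before the first scroll event
  }
}
```

An inline `visibility` style is used here instead of the `hidden` attribute because a `display` rule on `.back-to-top` would override the attribute's default styling.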


186-186: Prefer HTTPS for vocabulary/help links (schema.org, purl.org, IANA).

These “[?]” anchors consistently use http:. Where the target host supports it, switch to https: so that following a link does not begin with a plaintext request.

Also applies to: 190-190, 193-193, 196-196, 199-199, 202-202, 205-205, 208-208, 211-211, 214-214, 217-217, 220-220, 223-223, 226-226, 229-229, 232-232, 235-235, 238-238, 241-241, 244-244, 247-247, 250-250, 253-253, 256-256, 259-259, 262-262, 265-265, 268-268, 271-271, 274-274, 277-277, 280-280, 283-283, 286-286, 289-289, 292-292, 295-295, 298-298, 301-301, 304-304, 307-307, 310-310
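
One way to apply this in a generator is to rewrite the scheme only for hosts that are known to serve the same content over HTTPS. This is a sketch: the host list is an assumption based on the vocabularies named in this comment and should be checked before use, and `preferHttps` is a hypothetical helper name.

```typescript
// Hosts assumed to serve the same content over HTTPS (verify before use).
const HTTPS_CAPABLE_HOSTS: readonly string[] = ["schema.org", "purl.org", "www.iana.org"];

// Upgrade http:// to https:// for allowlisted hosts; return anything else unchanged.
function preferHttps(url: string): string {
  // Capture the host (up to the first "/", "?", "#" or ":") and the remainder.
  const match = /^http:\/\/([^\/?#:]+)(.*)$/i.exec(url);
  if (match === null) {
    return url; // already https, a fragment, mailto:, or not a URL at all
  }
  const host = match[1].toLowerCase();
  const allowed = HTTPS_CAPABLE_HOSTS.some((candidate) => candidate === host);
  return allowed ? `https://${match[1]}${match[2]}` : url;
}
```

Restricting the rewrite to an allowlist keeps links to hosts without HTTPS support working.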


267-269: Consider normalizing id="mailto:..." to a simple fragment (nit).

HTML5 permits this, but colons and at-signs in id can complicate CSS/JS selectors. A safer fragment like id="contact-publisher" with the mailto in content improves robustness.
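
A generator could derive such fragments mechanically from each entity's @id. The sketch below assumes a simple slug scheme; `toFragmentId`, the `id-` prefix and the example address are illustrative only.

```typescript
// Turn an arbitrary @id (e.g. "mailto:someone@example.org") into a fragment
// that needs no escaping in CSS selectors or querySelector calls.
function toFragmentId(entityId: string): string {
  const slug = entityId
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse each run of other characters into one hyphen
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  // An unescaped CSS identifier cannot start with a digit, so always prefix.
  return "id-" + slug;
}
```

Two different @id values can map to the same slug under this scheme, so a real implementation would need a uniqueness check, and every in-page `href="#..."` must be produced by the same function so links and targets stay in sync.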

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59253db and 21093ed.

📒 Files selected for processing (3)
  • assets/ro-crate-metadata.json (1 hunks)
  • assets/ro-crate-preview.html (1 hunks)
  • src/index.php (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/index.php
  • assets/ro-crate-metadata.json
🧰 Additional context used
🪛 ast-grep (0.38.6)
assets/ro-crate-preview.html

[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 186-186: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 190-190: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 190-190: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 193-193: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 193-193: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 193-193: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 193-193: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 196-196: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 196-196: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 196-196: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 196-196: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 196-196: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 199-199: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 199-199: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 199-199: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 199-199: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 199-199: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 202-202: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 205-205: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 205-205: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 208-208: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 208-208: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 211-211: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 211-211: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 214-214: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 214-214: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 217-217: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 217-217: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 220-220: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 220-220: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 223-223: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 226-226: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 226-226: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 229-229: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 229-229: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 232-232: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 235-235: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 235-235: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 238-238: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 238-238: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 241-241: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 241-241: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 244-244: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 244-244: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 244-244: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 244-244: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 247-247: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 247-247: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 250-250: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 250-250: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 253-253: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 253-253: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 256-256: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 259-259: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 259-259: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 259-259: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 262-262: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 262-262: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 265-265: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 265-265: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 265-265: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 268-268: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 268-268: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 268-268: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 271-271: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 271-271: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 274-274: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 274-274: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 277-277: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 277-277: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 280-280: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 280-280: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 283-283: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 283-283: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 286-286: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 286-286: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 289-289: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 289-289: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 292-292: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 292-292: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 295-295: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 295-295: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 298-298: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 298-298: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 301-301: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 304-304: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 307-307: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


[warning] 310-310: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
Context: [?]
Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
- https://cwe.mitre.org/data/definitions/319.html

(plaintext-http-link-html)


<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>RO-Crate Preview: Supporting data for &quot;Using synthetic RNA to benchmark poly(A) length inference from direct RNA sequencing.&quot;</title>
<link rel="stylesheet" href="">

⚠️ Potential issue

Remove empty stylesheet href (avoids 404 and render-blocking request).

In some browsers, an empty href resolves to the page's own URL and triggers a redundant HTTP request that can delay first paint.

Apply this diff:

-        <link rel="stylesheet" href="">
+        <!-- External stylesheet hook (intentionally left blank); remove or replace with a real file when available -->
🤖 Prompt for AI Agents
In assets/ro-crate-preview.html around line 7 there is a link tag with an empty
href which causes a useless 404/render-blocking request; remove the empty link
tag or replace it with a valid stylesheet href (or conditionally render the tag
only when a non-empty href is available) so no link element with href="" is
emitted.
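
A minimal sketch of the "conditionally render the tag" option, written in Python for brevity (the generator under review is PHP, so `stylesheet_link` is a hypothetical helper name, not part of the tool's API). It emits nothing when the href is missing or blank:

```python
from html import escape


def stylesheet_link(href):
    """Return a <link> tag for href, or an empty string when href is blank."""
    if href is None or not href.strip():
        # Emit nothing rather than a tag with href="".
        return ""
    return '<link rel="stylesheet" href="{}">'.format(
        escape(href.strip(), quote=True)
    )


print(repr(stylesheet_link("")))
print(stylesheet_link("css/preview.css"))
```

The path `css/preview.css` is only an illustrative value.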

…abbitai. Also made the type a hyperlink that resolves to the corresponding definition in the 1.2 context, usually schema.org, and updated the preview of the gigadb example accordingly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

♻️ Duplicate comments (7)
assets/ro-crate-preview.html (4)

7-7: Remove empty stylesheet href to avoid a pointless 404/render-blocking request.

Drop the empty link tag or replace it with a real stylesheet URL.

-        <link rel="stylesheet" href="">
+        <!-- External stylesheet hook removed; add a real stylesheet when available -->

204-218: Normalize section ids (drop leading “#”) and fix all corresponding hrefs (avoid “##”).

HTML id attributes must not include “#”. Keep “#” only in href.

- <section id="#oneExtra">
+ <section id="oneExtra">
...
    <ul><li><span class="property"> @id </span>: #oneExtra</li> ...
</section>

- <section id="#twoExtra">
+ <section id="twoExtra">
...
    <ul><li><span class="property"> @id </span>: #twoExtra</li> ...
</section>

- <section id="#zipExtra">
+ <section id="zipExtra">
...
    <ul><li><span class="property"> @id </span>: #zipExtra</li> ...
</section>

- <section id="#other-files">
+ <section id="other-files">
...
    <ul><li><span class="property"> @id </span>: #other-files</li> ...
</section>

And fix incoming links in earlier sections (examples already covered in the Root Data Entity diff) replacing:

  • href=##oneExtra → href="#oneExtra"
  • href=##twoExtra → href="#twoExtra"
  • href=##zipExtra → href="#zipExtra"
  • href=##other-files → href="#other-files"
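
One way to keep ids and hrefs in sync is to derive both from the entity's `@id` through a single pair of helpers. The sketch below is in Python for brevity and the function names are hypothetical; the same logic would need porting to the PHP generator:

```python
def section_id(entity_id):
    """Strip leading '#' characters so the value can be used as an HTML id."""
    return entity_id.lstrip("#")


def fragment_href(entity_id):
    """Build the href that targets section_id(entity_id), with exactly one '#'."""
    return "#" + section_id(entity_id)


print(section_id("#oneExtra"))
print(fragment_href("#oneExtra"))
```

The displayed `@id` text can stay as `#oneExtra`; only the `id` attribute and the `href` go through these helpers.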

186-186: Fix malformed list markup and broken anchors in Root Data Entity block.

The hasPart and author lines contain stray commas, extra </li> closing tags, and “##” fragments. This breaks HTML semantics and navigation.

Apply the following minimal repair to normalize anchors and list markup (example for the visible items in this block):

- ... <li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href=#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> readme_102736.txt </a></li>,  <a href=#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv> boostnano_no_dorado_R1_tails.csv </a>,  <a href=#https://gigadb.org/dataset/view/id/102736/Files_page/4> BoostNano-master </a>,  <a href=##other-files> other files </a><li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>: <a href=#https://orcid.org/0000-0001-9083-6757> Chang JJ </a></li>,  <a href=##Xuan_Yang> Yang X </a>,  <a href=#https://orcid.org/0000-0003-0337-8722> Teng H </a>,  <a href=##Benjamin_Reames> Reames B </a>,  <a href=#https://orcid.org/0000-0003-1155-0959> Corbin V </a>,  <a href=#https://orcid.org/0000-0002-4300-455X> Coin LJM </a> ...
+ ... <li><span class="property">hasPart <a href=https://schema.org/hasPart> [?] </a> </span>:
+   <a href="#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt">readme_102736.txt</a>,
+   <a href="#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/boostnano_no_dorado_R1_tails.csv">boostnano_no_dorado_R1_tails.csv</a>,
+   <a href="#https://gigadb.org/dataset/view/id/102736/Files_page/4">BoostNano-master</a>,
+   <a href="#other-files">other files</a>
+ </li>
+ <li><span class="property">author <a href=https://schema.org/author> [?] </a> </span>:
+   <a href="#https://orcid.org/0000-0001-9083-6757">Chang JJ</a>,
+   <a href="#Xuan_Yang">Yang X</a>,
+   <a href="#https://orcid.org/0000-0003-0337-8722">Teng H</a>,
+   <a href="#Benjamin_Reames">Reames B</a>,
+   <a href="#https://orcid.org/0000-0003-1155-0959">Corbin V</a>,
+   <a href="#https://orcid.org/0000-0002-4300-455X">Coin LJM</a>
+ </li> ...

Follow-up:

  • Replace remaining href=##... with href="#...".
  • Ensure every hasPart/author href points to an existing section id.
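
The malformed markup suggests each value is being closed or wrapped separately. A sketch of rendering a multi-valued property into a single list item follows, in Python for brevity; `render_property` and its `(href, label)` input shape are assumptions, not the generator's actual interface:

```python
from html import escape


def render_property(name, links):
    """Render one <li> holding every (href, label) pair, comma-separated."""
    anchors = ", ".join(
        '<a href="{}">{}</a>'.format(escape(href, quote=True), escape(label))
        for href, label in links
    )
    return '<li><span class="property">{}</span>: {}</li>'.format(
        escape(name), anchors
    )


print(render_property("author", [("#Xuan_Yang", "Yang X"),
                                 ("#Benjamin_Reames", "Reames B")]))
```

Joining the anchors before wrapping them guarantees exactly one opening and one closing `<li>` per property, and quoting every href avoids the unquoted-attribute issues seen above.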

203-203: Merge stray @type “Dataset” into a single list item.

Currently rendered as a dangling anchor and closing tag. Keep both types in one <li>.

    - <ul><li><span class="property"> @id </span>: <a href=https://doi.org/10.5524/100425> https://doi.org/10.5524/100425 </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/CreativeWork> CreativeWork </a></li>,  <a href=http://schema.org/Dataset> Dataset </a><li><span class="property">conformsTo <a href=http://purl.org/dc/terms/conformsTo> [?] </a> </span>: <a href=https://w3id.org/ro/crate> https://w3id.org/ro/crate </a></li></ul>
    + <ul><li><span class="property"> @id </span>: <a href=https://doi.org/10.5524/100425> https://doi.org/10.5524/100425 </a></li><li><span class="property"> @type </span>: <a href=https://schema.org/CreativeWork> CreativeWork</a>, <a href=https://schema.org/Dataset>Dataset</a></li><li><span class="property">conformsTo <a href=https://purl.org/dc/terms/conformsTo> [?] </a> </span>: <a href=https://w3id.org/ro/crate> https://w3id.org/ro/crate </a></li></ul>
resources/ro-crate-preview-out.html (3)

    7-7: Remove empty stylesheet href.

    Avoids a useless request and render-blocking.

    -        <link rel="stylesheet" href="">
    +        <!-- Stylesheet reference removed; add when available -->

    227-233: Empty headings and names — add fallback to @id basename in generator.

    Sections render with an empty <h2> and an empty “name”. Update generator to use a non-empty fallback.

    Proposed generator behavior (already outlined previously):

    • If name is missing/blank, use basename(@id) for headings and displayed name, and optionally badge “(name missing)”.

    Do you want me to send a patch to src/rocrate/ROCratePreviewGenerator.php implementing displayName() and swapping the usages?
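
For reference, the fallback could look roughly like this. It is a Python sketch of the proposed behavior, not the PHP patch itself, and `display_name` is the hypothetical helper name mentioned above:

```python
import posixpath
from urllib.parse import urlsplit


def display_name(entity):
    """Return the entity's name, falling back to the basename of its @id."""
    name = entity.get("name")
    if isinstance(name, str) and name.strip():
        return name.strip()
    entity_id = str(entity.get("@id", ""))
    # Use only the path component so URLs and relative paths behave alike.
    path = urlsplit(entity_id).path.rstrip("/")
    base = posixpath.basename(path)
    # Fragment-only ids such as "#oneExtra" have no path; show the id itself.
    return base or entity_id


print(display_name({"@id": "results/a/tRNA.RF00005.fasta.gz", "name": ""}))
```

The "(name missing)" badge is left out here; it would be a small addition at the call site.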


    305-337: Fix incorrect encodingFormat values for TSV/TXT/HTML/BIOM.

    Use standards-compliant MIME types:

    • .tsv → text/tab-separated-values
    • .txt → text/plain
    • .html → text/html
    • .biom (HDF5) → application/x-hdf5 (or application/octet-stream)
    • .biom (JSON) → application/biom+json
    - <li><span class="property">encodingFormat ... </span>: application/json-ld</li>  <!-- SSU/LSU *.tsv -->
    + <li><span class="property">encodingFormat ... </span>: text/tab-separated-values</li>
    
    - <li><span class="property">encodingFormat ... </span>: application/json-ld</li>  <!-- SSU/LSU *.txt -->
    + <li><span class="property">encodingFormat ... </span>: text/plain</li>
    
    - <li><span class="property">encodingFormat ... </span>: application/html</li>      <!-- krona.html -->
    + <li><span class="property">encodingFormat ... </span>: text/html</li>
    
    - <li><span class="property">encodingFormat ... </span>: application/json-ld</li>  <!-- *_hdf5.biom -->
    + <li><span class="property">encodingFormat ... </span>: application/x-hdf5</li>
    
    - <li><span class="property">encodingFormat ... </span>: application/json-ld</li>  <!-- *_json.biom -->
    + <li><span class="property">encodingFormat ... </span>: application/biom+json</li>
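
If the values are generated rather than hand-written, a small lookup keyed on the file name would keep them consistent. The sketch below is in Python for brevity and encodes only the mapping listed above; the `_hdf5.biom` / `_json.biom` suffix convention is taken from the file names in this preview and is an assumption about how the two BIOM flavours can be told apart:

```python
import os

# Compound suffixes are checked first because ".biom" alone is ambiguous.
SUFFIX_TYPES = {
    "_hdf5.biom": "application/x-hdf5",
    "_json.biom": "application/biom+json",
}
EXT_TYPES = {
    ".tsv": "text/tab-separated-values",
    ".txt": "text/plain",
    ".html": "text/html",
}


def guess_encoding_format(filename, default="application/octet-stream"):
    """Map a file name to an encodingFormat value, with a generic fallback."""
    lower = filename.lower()
    for suffix, mime in SUFFIX_TYPES.items():
        if lower.endswith(suffix):
            return mime
    ext = os.path.splitext(lower)[1]
    return EXT_TYPES.get(ext, default)


print(guess_encoding_format("krona.html"))
```

An explicit table is used instead of a platform MIME database so the output does not vary between environments.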

    🧹 Nitpick comments (5)
    assets/ro-crate-preview.html (2)

    193-197: Use correct MIME for readme: text/plain (not text/txt).

    Align with standard MIME types for better interoperability.

    - <li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/txt</li>
    + <li><span class="property">encodingFormat <a href=https://schema.org/encodingFormat> [?] </a> </span>: text/plain</li>

    185-218: Prefer HTTPS for vocab links (schema.org, purl, iana) in doc tooltips.

    Most of these endpoints support HTTPS; switching avoids mixed-content warnings.

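    Example replacements:

    • http://schema.org/encodingFormat → https://schema.org/encodingFormat
    • http://purl.org/dc/terms/conformsTo → https://purl.org/dc/terms/conformsTo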

    resources/ro-crate-metadata-out.json (1)

    344-349: Invalid type “URL” — use a concrete Web type (e.g., WebSite) instead.

    In schema.org, URL is a data type, not a Thing. For a homepage, WebSite or WebPage fits better.

    -            "@type": [
    -                "URL"
    -            ],
    +            "@type": [
    +                "WebSite"
    +            ],

    If this object represents something else (organization, dataset landing page), pick the appropriate class accordingly.

    resources/ro-crate-preview-out.html (2)

    372-379: Fix typos: “Geno Ontology” → “Gene Ontology”; “statistcs” → “statistics”.

    User-facing copy; polish for clarity.

    - <h2>Geno Ontology summary statistics</h2>
    + <h2>Gene Ontology summary statistics</h2>
    ...
    - Pfam summary statistcs
    + Pfam summary statistics

    185-397: Prefer HTTPS for schema/purl links in tooltips.

    As with the other preview, consider updating http://schema.org and http://purl.org URLs to HTTPS to avoid mixed content warnings.
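
A sketch of how the tooltip links could be upgraded at render time, in Python for brevity. The host allow-list and the name `prefer_https` are assumptions; only hosts known to serve HTTPS should be listed:

```python
from urllib.parse import urlsplit, urlunsplit

# Hosts assumed to serve the same content over HTTPS.
HTTPS_HOSTS = {"schema.org", "purl.org"}


def prefer_https(url, hosts=HTTPS_HOSTS):
    """Rewrite http:// to https:// for allow-listed hosts; leave others alone."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in hosts:
        return urlunsplit(parts._replace(scheme="https"))
    return url


print(prefer_https("http://schema.org/encodingFormat"))
```

This is meant for the `href` of the `[?]` links in the preview only. Term identifiers inside the metadata file should probably be left as they are, since changing them would change the JSON-LD meaning.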

    Comment on lines +140 to +144
    "encodingFormat": "text/txt",
    "exifData": {
    "@id": "#oneExtra"
    }
    },

    🛠️ Refactor suggestion

    Fix encodingFormat for readme_102736.txt to text/plain.

    "text/txt" is non-standard.

    -            "encodingFormat": "text/txt",
    +            "encodingFormat": "text/plain",
    🤖 Prompt for AI Agents
    In resources/ro-crate-metadata-out.json around lines 140 to 144, the
    encodingFormat for readme_102736.txt is set to the non-standard "text/txt";
    change its value to the correct MIME type "text/plain" (update the JSON value
    only), and run a quick JSON lint or MIME validation to ensure the metadata
    remains valid.
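
A quick check along those lines could be scripted as below. This is a Python sketch under stated assumptions: the table of known-bad values contains only the two mistakes flagged in this review, and the sample entity is made up for illustration rather than copied from the crate:

```python
import json

# Non-standard values seen in this review, mapped to their standard form.
KNOWN_FIXES = {
    "text/txt": "text/plain",
    "application/html": "text/html",
}


def find_suspect_formats(metadata):
    """Return (@id, current, suggested) for each entity with a known-bad type."""
    findings = []
    for entity in metadata.get("@graph", []):
        fmt = entity.get("encodingFormat")
        if isinstance(fmt, str) and fmt in KNOWN_FIXES:
            findings.append((entity.get("@id"), fmt, KNOWN_FIXES[fmt]))
    return findings


# json.loads doubles as the lint step: it raises ValueError on malformed JSON.
sample = json.loads(
    '{"@graph": [{"@id": "readme.txt", "encodingFormat": "text/txt"}]}'
)
print(find_suspect_formats(sample))
```

Running the same function over the parsed `ro-crate-metadata-out.json` would list every entity that still needs the fix.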
    

    Comment on lines +253 to +294
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/5_8S.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/LSU_rRNA_archaea.RF02540.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/LSU_rRNA_archaea.RF02540.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/LSU_rRNA_bacteria.RF02541.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/LSU_rRNA_bacteria.RF02541.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/LSU_rRNA_eukarya.RF02543.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/LSU_rRNA_eukarya.RF02543.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/SSU_rRNA_archaea.RF01959.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/SSU_rRNA_archaea.RF01959.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/SSU_rRNA_bacteria.RF00177.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/SSU_rRNA_bacteria.RF00177.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/SSU_rRNA_eukarya.RF01960.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/SSU_rRNA_eukarya.RF01960.fa.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/Bacteria_large_SRP.RF01854.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/Bacteria_large_SRP.RF01854.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/Bacteria_small_SRP.RF00169.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/Bacteria_small_SRP.RF00169.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/RNaseP_bact_a.RF00010.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/RNaseP_bact_a.RF00010.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/alpha_tmRNA.RF01849.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/alpha_tmRNA.RF01849.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/cyano_tmRNA.RF01851.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/cyano_tmRNA.RF01851.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/tRNA-Sec.RF01852.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/tRNA-Sec.RF01852.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/tRNA.RF00005.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/tRNA.RF00005.fasta.gz</li><li><span class="property"> @type </span>: File</li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/tmRNA.RF00023.fasta.gz">
    <h2>Sequence categorisation</h2>

    ⚠️ Potential issue

    Correct MIME types: .gz files are gzip, not zip; fix typos in labels.

    • Change encodingFormat for .fa.gz/.tsv.gz to application/gzip.
    • Fix “indentification” → “identification”.
    • SSU file labeled “LSU” — correct the wording for the SSU entry.
    - <li><span class="property">encodingFormat ... </span>: application/zip</li>
    + <li><span class="property">encodingFormat ... </span>: application/gzip</li>
    
    - <h2>LSU sequences used for indentification</h2>
    + <h2>SSU sequences used for identification</h2>
    - <li><span class="property">name ... </span>: LSU sequences used for indentification</li>
    + <li><span class="property">name ... </span>: SSU sequences used for identification</li>
    
    - <h2>LSU sequences used for indentification</h2>
    + <h2>LSU sequences used for identification</h2>
    - <li><span class="property">name ... </span>: LSU sequences used for indentification</li>
    + <li><span class="property">name ... </span>: LSU sequences used for identification</li>

    Optional: expose both “encodingFormat: text/fasta” and a separate “contentEncoding: gzip” in metadata sources if you control generation.

    Also applies to: 356-361, 303-304, 321-322

    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 255-255: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 258-258: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 261-261: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 264-264: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 267-267: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 270-270: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 273-273: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 276-276: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 279-279: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 282-282: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 285-285: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 288-288: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 291-291: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 294-294: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    🤖 Prompt for AI Agents
    In resources/ro-crate-preview-out.html around lines 253-294 (also apply same
    changes at 303-304, 321-322, 356-361), replace incorrect encodingFormat values
    for all *.fa.gz, *.fasta.gz and *.tsv.gz entries from application/zip to
    application/gzip, correct the typo “indentification” to “identification”
    wherever present, and fix the mislabeled SSU entry currently shown as “LSU” so
    its heading/text correctly references SSU; optionally, if you control metadata
    generation, emit both the uncompressed media type (e.g., text/fasta or
    text/tab-separated-values) and a separate contentEncoding: gzip field instead of
    using application/gzip alone.
    

    Comment on lines +395 to +397
    <section id="#workflow/metaGOflow">
    <h2>metaGOflow</h2>
    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li><li><span class="property"> @type </span>: File</li>, SoftwareSourceCode</li>, ComputationalWorkflow</li><li><span class="property">affiliation <a href=http://schema.org/affiliation> [?] </a> </span>: University of Flatland</li><li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>: EMO BON</li><li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>, TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</li><li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href=#https://www.apache.org/licenses/LICENSE-2.0> Apache License 2.0 </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: metaGOflow</li><li><span class="property">url <a href=http://schema.org/url> [?] </a> </span>: <a href=https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0> https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0 </a></li></ul> </section>

    🛠️ Refactor suggestion

    Fix broken @type list markup for workflow and normalize hasPart items.

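    The @type values are emitted with a stray closing tag after each one (“File</li>, SoftwareSourceCode</li>, ComputationalWorkflow</li>”), which leaves unbalanced </li> tags in the output. Keep all types in a single list item, apply the same fix to hasPart, and link the hasPart values consistently.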

    - <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li><li><span class="property"> @type </span>: File</li>,  SoftwareSourceCode</li>,  ComputationalWorkflow</li><li><span class="property">affiliation ... </span>: University of Flatland</li><li><span class="property">author ... </span>: EMO BON</li><li><span class="property">hasPart ... </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>,  TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</li><li><span class="property">license ... </span>: <a href=#https://www.apache.org/licenses/LICENSE-2.0> Apache License 2.0 </a></li><li><span class="property">name ... </span>: metaGOflow</li><li><span class="property">url ... </span>: <a href=https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0> https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0 </a></li></ul>
    + <ul>
    +   <li><span class="property"> @id </span>: #workflow/metaGOflow</li>
    +   <li><span class="property"> @type </span>: File, SoftwareSourceCode, ComputationalWorkflow</li>
    +   <li><span class="property">affiliation ... </span>: University of Flatland</li>
    +   <li><span class="property">author ... </span>: EMO BON</li>
    +   <li><span class="property">hasPart ... </span>: <a href="#config.yml">MetaGOflow configuration file</a>, <a href="#TARA_OCEANS_SAMPLE_3steps/ERR599171.yml">TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</a></li>
    +   <li><span class="property">license ... </span>: <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License 2.0</a></li>
    +   <li><span class="property">name ... </span>: metaGOflow</li>
    +   <li><span class="property">url ... </span>: <a href="https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0">https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0</a></li>
    + </ul>
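
If the markup is produced by the generator rather than edited by hand, one way to avoid the stray closing tags is to join all values before wrapping them in a single list item. The following is a minimal sketch, not the generator's actual code: the function name renderPropertyItem and the shape of the $values array are hypothetical and chosen only for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Render one property as a single <li>, joining multiple values with ", ".
 * Hypothetical input shape: a list of ['text' => string, 'href' => ?string].
 */
function renderPropertyItem(string $label, array $values): string
{
    $parts = [];
    foreach ($values as $value) {
        $text = htmlspecialchars($value['text'], ENT_QUOTES);
        if (!empty($value['href'])) {
            // Quote and escape the attribute so spaces or quotes cannot break it.
            $href = htmlspecialchars($value['href'], ENT_QUOTES);
            $parts[] = '<a href="' . $href . '">' . $text . '</a>';
        } else {
            $parts[] = $text;
        }
    }

    // Exactly one opening and one closing <li> per property.
    return '<li><span class="property">' . htmlspecialchars($label, ENT_QUOTES)
        . '</span>: ' . implode(', ', $parts) . '</li>';
}

echo renderPropertyItem('@type', [
    ['text' => 'File'],
    ['text' => 'SoftwareSourceCode'],
    ['text' => 'ComputationalWorkflow'],
]), PHP_EOL;
```

Because the closing tag is appended once after implode(), the number of values no longer affects the tag balance.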
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    <section id="#workflow/metaGOflow">
    <h2>metaGOflow</h2>
    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li><li><span class="property"> @type </span>: File</li>, SoftwareSourceCode</li>, ComputationalWorkflow</li><li><span class="property">affiliation <a href=http://schema.org/affiliation> [?] </a> </span>: University of Flatland</li><li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>: EMO BON</li><li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>, TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</li><li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href=#https://www.apache.org/licenses/LICENSE-2.0> Apache License 2.0 </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: metaGOflow</li><li><span class="property">url <a href=http://schema.org/url> [?] </a> </span>: <a href=https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0> https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0 </a></li></ul> </section>
    <section id="#workflow/metaGOflow">
    <h2>metaGOflow</h2>
    <ul>
    <li><span class="property"> @id </span>: #workflow/metaGOflow</li>
    <li><span class="property"> @type </span>: File, SoftwareSourceCode, ComputationalWorkflow</li>
    <li><span class="property">affiliation <a href="http://schema.org/affiliation"> [?] </a> </span>: University of Flatland</li>
    <li><span class="property">author <a href="http://schema.org/author"> [?] </a> </span>: EMO BON</li>
    <li><span class="property">hasPart <a href="http://schema.org/hasPart"> [?] </a> </span>:
    <a href="#config.yml">MetaGOflow configuration file</a>,
    <a href="#TARA_OCEANS_SAMPLE_3steps/ERR599171.yml">TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</a>
    </li>
    <li><span class="property">license <a href="http://schema.org/license"> [?] </a> </span>:
    <a href="https://www.apache.org/licenses/LICENSE-2.0">Apache License 2.0</a>
    </li>
    <li><span class="property">name <a href="http://schema.org/name"> [?] </a> </span>: metaGOflow</li>
    <li><span class="property">url <a href="http://schema.org/url"> [?] </a> </span>:
    <a href="https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0">
    https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0
    </a>
    </li>
    </ul>
    </section>
    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 396-396: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 5

    ♻️ Duplicate comments (13)
    src/rocrate/ROCratePreviewGenerator.php (9)

    356-356: Replace magic delimiter with a class constant

    Using a hard-coded delimiter string harms maintainability. Define a constant and reuse it.

    Apply this diff to declare the constant:

     class ROCratePreviewGenerator
     {
    +    private const VALUE_DELIMITER = ' %%$$%%$$** ';

    Then:

    -            $values = explode(' %%$$%%$$** ', $valStr);
    +            $values = explode(self::VALUE_DELIMITER, $valStr);
    -            return implode(' %%$$%%$$** ', $values);
    +            return implode(self::VALUE_DELIMITER, $values);

    Also applies to: 501-501, 10-10


    5-5: Importing ROCrateException from the wrong namespace

    The exception class lives under ROCrate\Exceptions. This will cause a fatal error at runtime.

    Apply this diff:

    -use Exceptions\ROCrateException;
    +use ROCrate\Exceptions\ROCrateException;

    12-12: Avoid public static state; encapsulate rootId

    A public static mutable property breaks encapsulation and is hard to reason about in concurrent or multi-run contexts.

    Apply this diff and update call sites accordingly:

     class ROCratePreviewGenerator
     {
    -    public static $rootId = './';
    +    private static string $rootId = './';
    +
    +    public static function getRootId(): string
    +    {
    +        return self::$rootId;
    +    }
     }

    23-27: Defining dynamic globals inside a method is unsafe and non-reentrant

    define() creates process-wide constants that cannot be redefined. Calling generatePreview() a second time in the same process with a different directory triggers an "already defined" warning and keeps using the first directory's paths. Use local variables and pass them through.

    Apply this diff:

    -        // Configuration
    -        define('INPUT_JSON', $basePath . '/ro-crate-metadata.json');
    -        define('OUTPUT_HTML', $basePath . '/ro-crate-preview.html');
    -        define('CSS_PATH', '/ro-crate-preview_files/style.css');
    +        // Configuration (locals, not globals)
    +        $inputJson  = $basePath . '/ro-crate-metadata.json';
    +        $outputHtml = $basePath . '/ro-crate-preview.html';
    +        $cssPath    = '/ro-crate-preview_files/style.css';
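
The redefinition behaviour can be checked in isolation. This is a small illustrative script, not code from the PR; the constant name DEMO_INPUT_JSON and both paths are made up for the demonstration.

```php
<?php
// First definition succeeds.
define('DEMO_INPUT_JSON', '/crates/a/ro-crate-metadata.json');

// A second definition is rejected; PHP raises a warning, suppressed here with @.
$redefined = @define('DEMO_INPUT_JSON', '/crates/b/ro-crate-metadata.json');

var_dump($redefined);            // bool(false)
echo DEMO_INPUT_JSON, PHP_EOL;   // still the path for crate "a"
```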

    29-38: Replace die() with exceptions and use local config variables

    Aborting the process makes the code unusable as a library and complicates error handling. Also stop referring to removed constants.

    Apply this diff:

    -        // Load and validate JSON
    -        if (!file_exists(INPUT_JSON)) {
    -            die("Error: " . INPUT_JSON . " not found");
    -        }
    -
    -        $json = file_get_contents(INPUT_JSON);
    -        $data = json_decode($json, true);
    -
    -        if (json_last_error() !== JSON_ERROR_NONE) {
    -            die("Error: Invalid JSON - " . json_last_error_msg());
    -        }
    +        // Load and validate JSON
    +        if (!file_exists($inputJson)) {
    +            throw new ROCrateException("Error: $inputJson not found");
    +        }
    +        $json = @file_get_contents($inputJson);
    +        if ($json === false) {
    +            throw new ROCrateException("Error: Failed to read $inputJson");
    +        }
    +        $data = json_decode($json, true);
    +        if (json_last_error() !== JSON_ERROR_NONE) {
    +            throw new ROCrateException("Error: Invalid JSON - " . json_last_error_msg());
    +        }
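
For callers, the practical benefit of throwing instead of calling die() is that the failure can be caught and handled. The sketch below is a standalone variant under stated assumptions: it throws the built-in RuntimeException rather than the project's ROCrateException, uses JSON_THROW_ON_ERROR instead of json_last_error(), and the function name loadCrateMetadata is hypothetical.

```php
<?php
declare(strict_types=1);

/** Load and decode a metadata file, throwing on any failure. */
function loadCrateMetadata(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new RuntimeException("Metadata file not found or unreadable: $path");
    }
    $raw = file_get_contents($path);
    if ($raw === false) {
        throw new RuntimeException("Failed to read $path");
    }
    try {
        $data = json_decode($raw, true, 512, JSON_THROW_ON_ERROR);
    } catch (JsonException $e) {
        // Wrap so callers only need to catch one exception type.
        throw new RuntimeException('Invalid JSON: ' . $e->getMessage(), 0, $e);
    }
    if (!is_array($data)) {
        throw new RuntimeException('Metadata must decode to a JSON object');
    }
    return $data;
}

// The caller decides how to react; the process is not terminated.
try {
    loadCrateMetadata('/nonexistent/ro-crate-metadata.json');
} catch (RuntimeException $e) {
    echo 'Caught: ', $e->getMessage(), PHP_EOL;
}
```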

    95-111: Remove global $rootId; use the class property consistently

    Using global undermines encapsulation and can lead to subtle bugs. Store the resolved id in the class property and read from it later.

    Apply this diff:

    -            if ($conditionOne && $conditionTwo) {
    -                global $rootId;
    -                $rootId = $entityData['about']['@id'];
    -                break;
    -            }
    +            if ($conditionOne && $conditionTwo) {
    +                // Guard access to nested keys
    +                $aboutId = $entityData['about']['@id'] ?? null;
    +                if (is_string($aboutId) && $aboutId !== '') {
    +                    self::$rootId = $aboutId;
    +                }
    +                break;
    +            }
    @@
    -        foreach ($entities as $entity) {
    -            if (($entity['@id'] ?? '') === $rootId) {
    +        foreach ($entities as $entity) {
    +            if (($entity['@id'] ?? '') === self::$rootId) {
                     return $entity;
                 }
             }

    And:

    -            <?php foreach ($entities as $id => $entity) :
    -                global $rootId;
    -                if ($id === $rootId) {
    +            <?php foreach ($entities as $id => $entity) :
    +                if ($id === self::$rootId) {
                         continue;
                     } ?>

    Also applies to: 315-319


    121-132: Fix CSS include: file_exists uses filesystem path, href needs web path

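    file_exists(CSS_PATH) tests /ro-crate-preview_files/style.css at the filesystem root rather than under the crate directory, so the check fails unless a file happens to exist there, and the stylesheet href ends up empty. Compute both filesystem and href paths using $basePath.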

    Apply this diff:

    -        <title>RO-Crate Preview: <?= htmlspecialchars($rootEntity['name'] ?? 'Untitled') ?></title>
    -        <link rel="stylesheet" href="<?= file_exists(CSS_PATH) ? CSS_PATH : '' ?>">
    +        <title>RO-Crate Preview: <?= htmlspecialchars($rootEntity['name'] ?? 'Untitled') ?></title>
    +        <?php
    +            $cssFile = $basePath . '/ro-crate-preview_files/style.css';
    +            $cssHref = file_exists($cssFile) ? '/ro-crate-preview_files/style.css' : '';
    +        ?>
    +        <link rel="stylesheet" href="<?= htmlspecialchars($cssHref, ENT_QUOTES) ?>">
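
The same idea can be isolated in a small helper. This is a sketch under assumptions, not the PR's code: the function name previewStylesheetHref is hypothetical, and it returns a relative href (on the assumption that the preview HTML sits next to the ro-crate-preview_files folder) instead of the absolute web path used in the diff above.

```php
<?php
declare(strict_types=1);

/**
 * Return the href for the preview stylesheet, or null when the file is absent.
 * The relative location is taken from the path quoted in the review.
 */
function previewStylesheetHref(string $basePath): ?string
{
    $relative = 'ro-crate-preview_files/style.css';
    $file = rtrim($basePath, '/') . '/' . $relative;  // filesystem path, used for the check
    return is_file($file) ? $relative : null;          // web path, used in the href
}

$href = previewStylesheetHref(__DIR__);
if ($href !== null) {
    echo '<link rel="stylesheet" href="' . htmlspecialchars($href, ENT_QUOTES) . '">', PHP_EOL;
}
```

Returning null lets the template skip the link element entirely rather than emit an empty href.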

    312-323: Make render methods instance methods and call them via $this

    Static calls here are inconsistent with instance state usage and make testing harder. Convert renderEntity/renderKey/renderValue to instance methods and update all call sites.

    Apply these diffs:

    -                <?= ROCratePreviewGenerator::renderEntity($rootEntity, $entities, $termUris, $basePath) ?>
    +                <?= $this->renderEntity($rootEntity, $entities, $termUris, $basePath) ?>
    @@
    -                    <?= ROCratePreviewGenerator::renderEntity($entity, $entities, $termUris, $basePath) ?>
    +                    <?= $this->renderEntity($entity, $entities, $termUris, $basePath) ?>

    -    public static function renderEntity($entity, $entities, $termUris, $basePath, $depth = 0)
    +    public function renderEntity($entity, $entities, $termUris, $basePath, $depth = 0)
    @@
    -            $keyHtml = ROCratePreviewGenerator::renderKey($key, $termUris);
    -            $valStr = ROCratePreviewGenerator::renderValue($value, $entities, $termUris, $basePath, $depth);
    +            $keyHtml = $this->renderKey($key, $termUris);
    +            $valStr = $this->renderValue($value, $entities, $termUris, $basePath, $depth);

    -    public static function renderKey($key, $termUris): string
    +    public function renderKey($key, $termUris): string

    -    public static function renderValue($value, $entities, $termUris, $basePath, $depth): string
    +    public function renderValue($value, $entities, $termUris, $basePath, $depth): string
    @@
    -                $values[] = ROCratePreviewGenerator::renderValue($item, $entities, $termUris, $basePath, $depth);
    +                $values[] = $this->renderValue($item, $entities, $termUris, $basePath, $depth);
    @@
    -                        ROCratePreviewGenerator::renderEntity($target, $entities, $termUris, $basePath, $depth + 1)
    +                        $this->renderEntity($target, $entities, $termUris, $basePath, $depth + 1)

    Also applies to: 345-355, 494-501, 520-535


    359-362: Guard context.jsonld access and JSON decode

    File may be missing or invalid; current code will emit warnings/notices and break.

    Apply this diff:

    -            // if we can resolve the key from the default context, we attach [?] hyperlink
    -            $contextData = json_decode(file_get_contents($basePath . "/context.jsonld"), true)['@context'];
    +            // Resolve key from default context (optional)
    +            $contextPath = $basePath . "/context.jsonld";
    +            $contextData = [];
    +            if (is_file($contextPath) && is_readable($contextPath)) {
    +                $ctxRaw = file_get_contents($contextPath);
    +                if ($ctxRaw !== false) {
    +                    $ctxDecoded = json_decode($ctxRaw, true);
    +                    if (json_last_error() === JSON_ERROR_NONE && isset($ctxDecoded['@context']) && is_array($ctxDecoded['@context'])) {
    +                        $contextData = $ctxDecoded['@context'];
    +                    }
    +                }
    +            }
    src/rocrate/ROCrate.php (4)

    36-48: Constructor parameters are cryptic; prefer descriptive names

    $lEFlag/$aFlag/$pFlag reduce readability and make call sites error-prone.

    Apply this diff:

    -    public function __construct(string $directory, bool $lEFlag = false, bool $aFlag = true, bool $pFlag = false)
    +    public function __construct(string $directory, bool $loadExisting = false, bool $attached = true, bool $preview = false)
         {
    -        $loadExisting = $lEFlag;
    -        $attachedFlag = $aFlag;
    -        $previewFlag = $pFlag;
    -
    -        $this->attached = $attachedFlag;
    -        $this->preview = $previewFlag;
    +        $this->attached = $attached;
    +        $this->preview = $preview;

    Also update the PHPDoc accordingly.


    211-227: Duplicate property-normalization logic — extract helper

    This block duplicates the normalization done in initializeNewCrate(). Extract a private method and call it in both places.

    Suggested helper (outside this hunk):

    private function normalizeEntityProperties(): void
    {
        foreach ($this->entities as $entity) {
            foreach (array_keys($entity->getProperties()) as $key) {
                $prop = $entity->getProperties()[$key];
                if (is_array($prop)) {
                    if (array_keys($prop) !== range(0, count($prop) - 1)) {
                        $entity->addProperty($key, [$prop]); // wrap assoc in array
                    }
                } else {
                    $entity->addProperty($key, [$prop]); // wrap literal
                }
            }
        }
    }

    Then replace the duplicated loops with $this->normalizeEntityProperties();
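
One edge case is worth noting in the helper above: range(0, -1) yields [0, -1], so the key comparison treats an empty array as associative and would wrap it as [[]]. A possible alternative for the per-value decision, assuming PHP 8.1 or later is available (array_is_list() was added in that version), is sketched below. The function name toValueList is hypothetical.

```php
<?php
declare(strict_types=1);

/** Wrap a property value in a list unless it already is one (requires PHP 8.1+). */
function toValueList(mixed $value): array
{
    if (is_array($value) && array_is_list($value)) {
        return $value;   // already a list, including the empty array
    }
    return [$value];     // literal, or associative array such as ['@id' => './']
}

var_dump(toValueList('CC-BY-4.0'));       // one-element list holding the string
var_dump(toValueList(['@id' => './']));   // one-element list holding the reference
var_dump(toValueList([]));                // stays an empty list
```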


    11-11: Importing ROCrateException from the wrong namespace

    Fix the namespace to avoid runtime failures.

    Apply this diff:

    -use Exceptions\ROCrateException;
    +use ROCrate\Exceptions\ROCrateException;

    907-920: Incorrect day validation for YYYY-MM-DD dates

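    The day is first checked against the month list ($MM) and then against the range 13 to 31, with the year check repeated in between. That is hard to follow and, as far as this block is concerned, lets impossible dates such as 2023-02-30 through. Use PHP’s checkdate() for correctness.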

    Apply this diff:

    -                $year = substr($dateString, 0, 4);
    -                $month = substr($dateString, 5, 2);
    -                $day = substr($dateString, 8, 2);
    -                if (!ctype_digit($year)) {
    -                    $flag = false;
    -                } elseif (!in_array($month, $MM)) {
    -                    $flag = false;
    -                } elseif (!in_array($day, $MM)) {
    -                    if (!ctype_digit($year)) {
    -                        $flag = false;
    -                    } elseif (((int)$day < 13) || ((int)$day > 31)) {
    -                        $flag = false;
    -                    }
    -                }
    +                $year = substr($dateString, 0, 4);
    +                $month = substr($dateString, 5, 2);
    +                $day = substr($dateString, 8, 2);
    +                if (!ctype_digit($year) || !in_array($month, $MM) || !ctype_digit($day)) {
    +                    $flag = false;
    +                } elseif (!checkdate((int)$month, (int)$day, (int)$year)) {
    +                    $flag = false;
    +                }
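
A self-contained version of the same check can be sketched as follows. It is an illustration rather than the PR's code: the function name isValidIsoDate is hypothetical, and it uses a regular expression to split the string instead of the substr() calls in the diff.

```php
<?php
declare(strict_types=1);

/** Return true only for real calendar dates written as YYYY-MM-DD. */
function isValidIsoDate(string $dateString): bool
{
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    if (preg_match('/\A(\d{4})-(\d{2})-(\d{2})\z/', $dateString, $m) !== 1) {
        return false;
    }
    // checkdate() takes month, day, year and knows month lengths and leap years.
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
}

var_dump(isValidIsoDate('2024-02-29'));  // bool(true), 2024 is a leap year
var_dump(isValidIsoDate('2023-02-29'));  // bool(false)
var_dump(isValidIsoDate('2023-04-31'));  // bool(false), April has 30 days
```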
    🧹 Nitpick comments (5)
    src/rocrate/ROCratePreviewGenerator.php (1)

    80-87: Indexing entities without guarding '@id' can cause notices

    If any graph entry lacks @id, PHP raises an undefined array key warning. It is safer to skip such entries or throw.

    Apply this diff:

    -        foreach ($graph as $entity) {
    -            $index[$entity['@id']] = $entity;
    -        }
    +        foreach ($graph as $entity) {
    +            if (!isset($entity['@id']) || !is_string($entity['@id'])) {
    +                // Skip malformed entries or throw if that's preferred:
    +                // throw new ROCrateException("Entity missing @id in @graph");
    +                continue;
    +            }
    +            $index[$entity['@id']] = $entity;
    +        }
    src/rocrate/ROCrate.php (4)

    131-136: Docblock throws annotation points to the wrong namespace

    Align the DocBlock with the actual exception namespace you use/import.

    Apply this diff:

    -     * @throws \Exceptions\ROCrateException Exceptions with specific messages to indicate possible errors
    +     * @throws \ROCrate\Exceptions\ROCrateException Exceptions with specific messages to indicate possible errors

    157-166: Typo and minor robustness in metadata parsing

    • Typo: $condtionOne
    • Consider guarding nested keys.

    Apply this diff:

    -        foreach ($json['@graph'] as $entityData) {
    -            $condtionOne = str_contains($entityData['@id'], "ro-crate-metadata.json");
    +        foreach ($json['@graph'] as $entityData) {
    +            $conditionOne = isset($entityData['@id']) && str_contains($entityData['@id'], "ro-crate-metadata.json");
                 $conditionTwo = array_key_exists("conformsTo", $entityData);
    -            if ($condtionOne && $conditionTwo) {
    -                $conformsTo = $entityData["conformsTo"]["@id"];
    -                $rootId = $entityData['about']['@id'];
    +            if ($conditionOne && $conditionTwo) {
    +                $conformsTo = $entityData["conformsTo"]["@id"] ?? 'https://w3id.org/ro/crate/1.2';
    +                $rootId = $entityData['about']['@id'] ?? './';
                     $this->addProfile($conformsTo, $rootId);
                     continue;
                 }

    788-793: Stray semicolons after array_merge

    Harmless but noisy.

    Apply this diff:

    -        $graph = array_merge($graph, $second);
    -        ;
    -        $graph = array_merge($graph, $last);
    -        ;
    +        $graph = array_merge($graph, $second);
    +        $graph = array_merge($graph, $last);

    541-555: Use strict in_array for actionStatus

    String comparisons should be strict to avoid loose type coercion (future-proofing).

    Apply this diff:

    -                if (!in_array($actionStatus, $validStatuses)) {
    +                if (!in_array($actionStatus, $validStatuses, true)) {
                         $errors[] = "An action's actionStatus property is invalid.";
                     }
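
The difference only shows up when the checked value is not a string. The snippet below is illustrative: the contents of $validStatuses are placeholders, not the list used in ROCrate.php.

```php
<?php
// Placeholder list; the real values live in ROCrate.php.
$validStatuses = ['ActiveActionStatus', 'CompletedActionStatus'];

// Loose comparison: boolean true equals any non-empty string.
var_dump(in_array(true, $validStatuses));        // bool(true)

// Strict comparison also checks the type, so the boolean is rejected.
var_dump(in_array(true, $validStatuses, true));  // bool(false)
```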
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    💡 Knowledge Base configuration:

    • MCP integration is disabled by default for public repositories
    • Jira integration is disabled by default for public repositories
    • Linear integration is disabled by default for public repositories

    You can enable these sources in your CodeRabbit configuration.

    📥 Commits

    Reviewing files that changed from the base of the PR and between 788261a and 79774c3.

    📒 Files selected for processing (2)
    • src/rocrate/ROCrate.php (1 hunks)
    • src/rocrate/ROCratePreviewGenerator.php (1 hunks)
    🧰 Additional context used
    🧬 Code Graph Analysis (2)
    src/rocrate/ROCratePreviewGenerator.php (1)
    src/exceptions/ROCrateException.php (1)
    • ROCrateException (8-10)
    src/rocrate/ROCrate.php (13)
    src/exceptions/ROCrateException.php (1)
    • ROCrateException (8-10)
    src/rocrate/Entity.php (11)
    • Entity (8-286)
    • __construct (20-24)
    • toArray (266-266)
    • baseArray (272-285)
    • addProperty (119-123)
    • getProperties (107-110)
    • getId (30-33)
    • setCrate (257-260)
    • getTypes (50-53)
    • getProperty (98-101)
    • addPropertyPair (132-181)
    src/rocrate/DataEntity.php (3)
    • DataEntity (12-33)
    • __construct (19-22)
    • toArray (28-32)
    src/rocrate/Dataset.php (3)
    • Dataset (10-30)
    • __construct (16-19)
    • toArray (25-29)
    src/rocrate/File.php (2)
    • File (10-20)
    • __construct (16-19)
    src/rocrate/Descriptor.php (3)
    • Descriptor (10-29)
    • __construct (16-19)
    • toArray (25-28)
    src/rocrate/Person.php (3)
    • Person (10-29)
    • __construct (16-19)
    • toArray (25-28)
    src/rocrate/ContactPoint.php (2)
    • __construct (17-20)
    • toArray (26-29)
    src/rocrate/ContextualEntity.php (3)
    • __construct (18-21)
    • ContextualEntity (11-32)
    • toArray (27-31)
    src/rocrate/Organization.php (2)
    • __construct (16-19)
    • toArray (25-28)
    src/rocrate/Place.php (2)
    • __construct (16-19)
    • toArray (25-28)
    src/rocrate/Publication.php (2)
    • __construct (20-28)
    • toArray (34-37)
    src/rocrate/JsonData.php (2)
    • __construct (23-26)
    • toArray (119-132)

    Comment on lines +391 to +399
    if (array_key_exists("about", $this->descriptor->getProperties())) {
    $conditionOne = (is_array($this->getDescriptor()->getProperty("about")));
    $conditionTwo = (strcmp($this->descriptor->getProperty("about")['@id'], $this->rootDataset->getId()) !== 0);
    if ($conditionOne && $conditionTwo) {
    $errors[] = "The descriptor's about property is invalid.";
    }
    } else {
    $errors[] = "The descriptor does not have an about property.";
    }

    🛠️ Refactor suggestion

    Guard descriptor about structure to avoid undefined index

    Ensure the about property has the expected shape before dereferencing ['@id'].

    Apply this diff:

    -        if (array_key_exists("about", $this->descriptor->getProperties())) {
    -            $conditionOne = (is_array($this->getDescriptor()->getProperty("about")));
    -            $conditionTwo = (strcmp($this->descriptor->getProperty("about")['@id'], $this->rootDataset->getId()) !== 0);
    -            if ($conditionOne && $conditionTwo) {
    +        if (array_key_exists("about", $this->descriptor->getProperties())) {
    +            $about = $this->descriptor->getProperty("about");
    +            $conditionOne = is_array($about) && isset($about['@id']) && is_string($about['@id']);
    +            $conditionTwo = $conditionOne && (strcmp($about['@id'], $this->rootDataset->getId()) !== 0);
    +            if ($conditionOne && $conditionTwo) {
                     $errors[] = "The descriptor's about property is invalid.";
                 }
             } else {
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    if (array_key_exists("about", $this->descriptor->getProperties())) {
    $conditionOne = (is_array($this->getDescriptor()->getProperty("about")));
    $conditionTwo = (strcmp($this->descriptor->getProperty("about")['@id'], $this->rootDataset->getId()) !== 0);
    if ($conditionOne && $conditionTwo) {
    $errors[] = "The descriptor's about property is invalid.";
    }
    } else {
    $errors[] = "The descriptor does not have an about property.";
    }
    if (array_key_exists("about", $this->descriptor->getProperties())) {
    $about = $this->descriptor->getProperty("about");
    $conditionOne = is_array($about)
    && isset($about['@id'])
    && is_string($about['@id']);
    $conditionTwo = $conditionOne
    && (strcmp($about['@id'], $this->rootDataset->getId()) !== 0);
    if ($conditionOne && $conditionTwo) {
    $errors[] = "The descriptor's about property is invalid.";
    }
    } else {
    $errors[] = "The descriptor does not have an about property.";
    }
    🤖 Prompt for AI Agents
    In src/rocrate/ROCrate.php around lines 391 to 399, the code dereferences
    ['@id'] on the descriptor's about property without ensuring that the about value
    is actually an array with an '@id' key; update the guard to first fetch the
    about value, assert it's an array (is_array) and that array_key_exists('@id',
    $about) (or isset($about['@id'])), then compare that value to
    $this->rootDataset->getId(); if those checks fail, add the existing validation
    error(s) ("The descriptor's about property is invalid." or "The descriptor does
    not have an about property.") as appropriate so you never access ['@id'] on a
    non-array or missing key.
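
One way to satisfy the prompt above is to treat a malformed about value as invalid instead of silently skipping it. The following is a minimal sketch, not the PR's code: `validateDescriptorAbout` is a hypothetical standalone function, and it takes the descriptor's properties and the root dataset id as plain values so it can run without the `ROCrate` class.

```php
<?php

/**
 * Hypothetical helper (not from the PR): returns validation errors for the
 * descriptor's "about" property without ever indexing a non-array value.
 *
 * @param array<string, mixed> $descriptorProps properties of the metadata descriptor
 * @param string               $rootId          @id of the root dataset
 * @return string[]
 */
function validateDescriptorAbout(array $descriptorProps, string $rootId): array
{
    if (!array_key_exists('about', $descriptorProps)) {
        return ["The descriptor does not have an about property."];
    }

    $about = $descriptorProps['about'];

    // Only dereference '@id' once the shape is known to be {"@id": "<string>"}.
    $hasValidShape = is_array($about)
        && isset($about['@id'])
        && is_string($about['@id']);

    if (!$hasValidShape || $about['@id'] !== $rootId) {
        return ["The descriptor's about property is invalid."];
    }

    return [];
}
```

Unlike the diff above, this version reports a malformed about (a plain string, or an array without '@id') as invalid. Whether that stricter behaviour is wanted is a decision for the author.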
    

    Comment on lines +686 to +733
    public function save(?string $path = null, string $prefix = ""): void
    {
    $this->errors = [];

    // make values of all properties, i.e. key-value pairs, of each entity to be without [...] if there
    // is only a single literal or {"@id" : "..."}
    foreach ($this->entities as $entity) {
    foreach (array_keys($entity->getProperties()) as $key) {
    if (strcmp($key, "hasPart") == 0) {
    continue;
    }
    // safety check if $val is an array
    if (is_array($entity->getProperty($key))) {
    if (!array_key_exists('@id', $entity->getProperty($key))) {
    // safety check if $val is not an associative array
    if (count($entity->getProperty($key)) === 1) {
    // there is only a single item
    $entity->addProperty($key, $entity->getProperty($key)[0]);
    //$this->printNestedArray($this->descriptor->getProperties());
    }
    }
    }
    }
    }


    if (!$this->attached) {
    if (strcmp($prefix, "") == 0) {
    throw new ROCrateException("The prefix cannot be empty for a detached RO-Crate Package.");
    }
    }

    $this->errors = $this->validate();
    if (!($this->errors === [])) {
    throw new ROCrateException("Validation before saving failed.");
    }

    $target = $path ? realpath($path) : $this->basePath;

    if (!$target) {
    throw new ROCrateException("Invalid target directory: $path");
    }

    // Ensure metadata directory exists
    if (!is_dir($target) && !mkdir($target, 0755, true)) {
    throw new ROCrateException("Failed to create directory: $target");
    }


    ⚠️ Potential issue

    Saving to a new directory path fails due to realpath() use

    If $path doesn't exist yet, realpath($path) returns false and you throw, instead of creating the directory. Use the provided path verbatim and create it when missing.

    Apply this diff:

    -        $target = $path ? realpath($path) : $this->basePath;
    -
    -        if (!$target) {
    -            throw new ROCrateException("Invalid target directory: $path");
    -        }
    -
    -        // Ensure metadata directory exists
    -        if (!is_dir($target) && !mkdir($target, 0755, true)) {
    +        $target = $path ?? $this->basePath;
    +        // Ensure metadata directory exists (create if missing)
    +        if (!is_dir($target) && !mkdir($target, 0755, true)) {
                 throw new ROCrateException("Failed to create directory: $target");
             }
    📝 Committable suggestion


    Suggested change
    public function save(?string $path = null, string $prefix = ""): void
    {
    $this->errors = [];
    // make values of all properties, i.e. key-value pairs, of each entity to be without [...] if there
    // is only a single literal or {"@id" : "..."}
    foreach ($this->entities as $entity) {
    foreach (array_keys($entity->getProperties()) as $key) {
    if (strcmp($key, "hasPart") == 0) {
    continue;
    }
    // safety check if $val is an array
    if (is_array($entity->getProperty($key))) {
    if (!array_key_exists('@id', $entity->getProperty($key))) {
    // safety check if $val is not an associative array
    if (count($entity->getProperty($key)) === 1) {
    // there is only a single item
    $entity->addProperty($key, $entity->getProperty($key)[0]);
    //$this->printNestedArray($this->descriptor->getProperties());
    }
    }
    }
    }
    }
    if (!$this->attached) {
    if (strcmp($prefix, "") == 0) {
    throw new ROCrateException("The prefix cannot be empty for a detached RO-Crate Package.");
    }
    }
    $this->errors = $this->validate();
    if (!($this->errors === [])) {
    throw new ROCrateException("Validation before saving failed.");
    }
    $target = $path ? realpath($path) : $this->basePath;
    if (!$target) {
    throw new ROCrateException("Invalid target directory: $path");
    }
    // Ensure metadata directory exists
    if (!is_dir($target) && !mkdir($target, 0755, true)) {
    throw new ROCrateException("Failed to create directory: $target");
    }
    // Replace realpath-based resolution so we can create a new path if needed
    $target = $path ?? $this->basePath;
    // Ensure metadata directory exists (create if missing)
    if (!is_dir($target) && !mkdir($target, 0755, true)) {
    throw new ROCrateException("Failed to create directory: $target");
    }
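
The target resolution discussed in this comment can be sketched in isolation. This is an illustration under stated assumptions, not the PR's implementation: `resolveTargetDirectory` is a hypothetical function, and `RuntimeException` stands in for the project's `ROCrateException`. The original code uses `$path ? ... : ...`, which treats an empty string like null, whereas `$path ?? ...` does not; the sketch keeps the original fallback behaviour explicitly.

```php
<?php

/**
 * Hypothetical helper (not from the PR): pick the save target and create it
 * when missing, instead of calling realpath() on a path that may not exist yet.
 *
 * @throws RuntimeException when the directory cannot be created
 */
function resolveTargetDirectory(?string $path, string $basePath): string
{
    // Treat null and "" alike so an empty argument falls back to the base path.
    $target = ($path === null || $path === '') ? $basePath : $path;

    // mkdir() can lose a race with another process, so re-check is_dir().
    if (!is_dir($target) && !mkdir($target, 0755, true) && !is_dir($target)) {
        throw new RuntimeException("Failed to create directory: $target");
    }

    // The directory exists now, so realpath() is safe if a canonical path is wanted.
    $resolved = realpath($target);
    return $resolved === false ? $target : $resolved;
}
```

Calling realpath() only after the directory exists keeps the canonical path for later use while still allowing a brand-new output directory.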

    Comment on lines +48 to +53
    // Generate HTML
    $html = $generator->generateHTML($rootEntity, $entities, $termUris, $basePath);
    file_put_contents(OUTPUT_HTML, $html);

    echo "Successful Creation of Preview file";
    }

    🛠️ Refactor suggestion

    Write output using local variable to avoid global constants

    Complete the removal of dynamic constants by using $outputHtml.

    Apply this diff:

    -        $html = $generator->generateHTML($rootEntity, $entities, $termUris, $basePath);
    -        file_put_contents(OUTPUT_HTML, $html);
    +        $html = $generator->generateHTML($rootEntity, $entities, $termUris, $basePath);
    +        file_put_contents($outputHtml, $html);
    📝 Committable suggestion


    Suggested change
    // Generate HTML
    $html = $generator->generateHTML($rootEntity, $entities, $termUris, $basePath);
    file_put_contents(OUTPUT_HTML, $html);
    echo "Successful Creation of Preview file";
    }
    // Generate HTML
    $html = $generator->generateHTML($rootEntity, $entities, $termUris, $basePath);
    file_put_contents($outputHtml, $html);
    echo "Successful Creation of Preview file";
    }
    🤖 Prompt for AI Agents
    In src/rocrate/ROCratePreviewGenerator.php around lines 48 to 53, replace the
    use of the global OUTPUT_HTML constant with a local variable; create a
    $outputHtml variable (e.g. from a method param or computed path), write the
    generated HTML with file_put_contents($outputHtml, $html) and keep the success
    echo (optionally include $outputHtml in the message) so no global constant is
    referenced.
    

    Comment on lines +363 to +391
    if (is_array($values)) {
    $keyFirst = "<li><span class=\"property\">$keyHtml <a href=$resolvedKey> [?] </a> </span>:";
    foreach ($values as $valueHtml) {
    // if value is id, we make it hyperlink and show name if name exists in the entity
    $conditionOne = (!is_array($valueHtml)) && (strcmp($key, '@id') !== 0);
    $conditionTwo = (array_key_exists($valueHtml, $entities));
    if ($conditionOne && $conditionTwo) {
    $temp = htmlspecialchars($entities[$valueHtml]['name'] ?? $valueHtml);
    if (strcmp($temp, "") == 0) {
    $temp = $valueHtml;
    }
    if (strcmp(substr($keyFirst, -1), ':') == 0) {
    $html .= $keyFirst . " <a href=#$valueHtml> $temp </a></li>";
    } else {
    $html .= $keyFirst . " <a href=#$valueHtml> $temp </a>";
    }
    } elseif (ROCrate::isValidUri($valueHtml)) {
    if (strcmp(substr($keyFirst, -1), ':') == 0) {
    $html .= $keyFirst . " <a href=$valueHtml> $valueHtml </a></li>";
    } else {
    $html .= $keyFirst . " <a href=$valueHtml> $valueHtml </a>";
    }
    } else {
    if (strcmp(substr($keyFirst, -1), ':') == 0) {
    $html .= $keyFirst . " $valueHtml</li>";
    } else {
    $html .= $keyFirst . " $valueHtml";
    }
    }

    ⚠️ Potential issue

    Anchor hrefs are unquoted and unescaped — XSS risk

    Multiple places inject $resolvedKey/$valueHtml into href without quotes or escaping. This can be exploited if metadata or context is compromised.

    Apply the pattern below across all anchor constructions in this block:

    -                    $keyFirst = "<li><span class=\"property\">$keyHtml <a href=$resolvedKey> [?] </a> </span>:";
    +                    $keyFirst = "<li><span class=\"property\">$keyHtml <a href=\"" . htmlspecialchars($resolvedKey, ENT_QUOTES) . "\">[?]</a></span>:";
    @@
    -                                $html .= $keyFirst . " <a href=#$valueHtml> $temp </a></li>";
    +                                $html .= $keyFirst . " <a href=\"#" . htmlspecialchars($valueHtml, ENT_QUOTES) . "\"> " . htmlspecialchars($temp) . " </a></li>";
    @@
    -                                $html .= $keyFirst . " <a href=$valueHtml> $valueHtml </a></li>";
    +                                $html .= $keyFirst . " <a href=\"" . htmlspecialchars($valueHtml, ENT_QUOTES) . "\"> " . htmlspecialchars($valueHtml) . " </a></li>";

    Recommendation: create small helpers to centralize escaping/anchor creation to avoid future regressions:

    private function escAttr(string $s): string { return htmlspecialchars($s, ENT_QUOTES); }
    private function esc(string $s): string { return htmlspecialchars($s); }
    private function link(string $href, string $text): string { return sprintf('<a href="%s">%s</a>', $this->escAttr($href), $this->esc($text)); }

    Then use $this->link(...) everywhere.

    Also applies to: 421-444
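
A self-contained version of the helper idea, written as free functions so it can be run on its own. The names are illustrative only: the anchor builder is called `renderLink` here because a global function named `link` already exists in PHP. The scheme check is an added assumption that goes beyond the review's suggestion: escaping alone does not neutralise hrefs such as `javascript:...`, so this sketch only links http(s) URLs and in-page fragments and renders anything else as plain text.

```php
<?php

/** Escape a string for use inside a quoted HTML attribute. */
function escAttr(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/** Escape a string for use as HTML text content. */
function esc(string $s): string
{
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/**
 * Build an anchor with a quoted, escaped href and escaped text.
 * Assumption (not in the review): hrefs that are neither http(s) URLs nor
 * in-page fragments are rendered as escaped text without a link.
 */
function renderLink(string $href, string $text): string
{
    $isAllowed = preg_match('~^(https?://|#)~i', $href) === 1;
    if (!$isAllowed) {
        return esc($text);
    }

    return sprintf('<a href="%s">%s</a>', escAttr($href), esc($text));
}
```

With a helper like this, each raw `"<a href=$x> $y </a>"` concatenation in the block becomes a single call, which keeps quoting and escaping in one place.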

    🤖 Prompt for AI Agents
    In src/rocrate/ROCratePreviewGenerator.php around lines 363 to 391 (and
    similarly 421-444), anchor hrefs are inserted unquoted and without escaping
    creating XSS risk; add small helpers (private function escAttr(string $s):
    string { return htmlspecialchars($s, ENT_QUOTES); }, private function esc(string
    $s): string { return htmlspecialchars($s); }, private function link(string
    $href, string $text): string { return sprintf('<a href="%s">%s</a>',
    $this->escAttr($href), $this->esc($text)); }) and replace every raw "<a
    href=...>...</a>" construction in the block with $this->link($resolvedKey,
    $keyLabel) or $this->link($valueHtml, $temp) as appropriate, ensure all href
    values are passed through escAttr and all visible text through esc so attributes
    are quoted and HTML-escaped.
    

    Comment on lines +439 to +445
    $resolvedKey = $contextData[$valueHtml];
    if (strcmp(substr($keyFirst, -1), ':') == 0) {
    $html .= $keyFirst . " <a href=$resolvedKey> $valueHtml </a></li>";
    } else {
    $html .= $keyFirst . " <a href=$resolvedKey> $valueHtml </a>";
    }
    }

    🛠️ Refactor suggestion

    Undefined index risk when resolving @type values via context

    Using $contextData[$valueHtml] without checking key existence can raise notices.

    Apply this diff:

    -                        } else {
    -                            $resolvedKey = $contextData[$valueHtml];
    +                        } else {
    +                            $resolvedKey = $contextData[$valueHtml] ?? null;
    +                            if ($resolvedKey === null) {
    +                                // Fallback: render the literal safely without link
    +                                $resolvedKey = '';
    +                            }
                                 if (strcmp(substr($keyFirst, -1), ':') == 0) {
    -                                $html .= $keyFirst . " <a href=$resolvedKey> $valueHtml </a></li>";
    +                                $html .= $keyFirst . ($resolvedKey !== '' ? " <a href=\"" . htmlspecialchars($resolvedKey, ENT_QUOTES) . "\"> " . htmlspecialchars($valueHtml) . " </a></li>" : " " . htmlspecialchars($valueHtml) . "</li>");
                                 } else {
    -                                $html .= $keyFirst . " <a href=$resolvedKey> $valueHtml </a>";
    +                                $html .= $keyFirst . ($resolvedKey !== '' ? " <a href=\"" . htmlspecialchars($resolvedKey, ENT_QUOTES) . "\"> " . htmlspecialchars($valueHtml) . " </a>" : " " . htmlspecialchars($valueHtml));
                                 }
    📝 Committable suggestion


    Suggested change
    $resolvedKey = $contextData[$valueHtml];
    if (strcmp(substr($keyFirst, -1), ':') == 0) {
    $html .= $keyFirst . " <a href=$resolvedKey> $valueHtml </a></li>";
    } else {
    $html .= $keyFirst . " <a href=$resolvedKey> $valueHtml </a>";
    }
    }
    } else {
    $resolvedKey = $contextData[$valueHtml] ?? null;
    if ($resolvedKey === null) {
    // Fallback: render the literal safely without link
    $resolvedKey = '';
    }
    if (strcmp(substr($keyFirst, -1), ':') == 0) {
    $html .= $keyFirst
    . ($resolvedKey !== ''
    ? " <a href=\""
    . htmlspecialchars($resolvedKey, ENT_QUOTES)
    . "\"> "
    . htmlspecialchars($valueHtml, ENT_QUOTES)
    . " </a></li>"
    : " " . htmlspecialchars($valueHtml, ENT_QUOTES) . "</li>"
    );
    } else {
    $html .= $keyFirst
    . ($resolvedKey !== ''
    ? " <a href=\""
    . htmlspecialchars($resolvedKey, ENT_QUOTES)
    . "\"> "
    . htmlspecialchars($valueHtml, ENT_QUOTES)
    . " </a>"
    : " " . htmlspecialchars($valueHtml, ENT_QUOTES)
    );
    }
    }
    🤖 Prompt for AI Agents
    In src/rocrate/ROCratePreviewGenerator.php around lines 439 to 445, the code
    accesses $contextData[$valueHtml] without verifying the key exists which can
    trigger PHP notices; update it to check isset($contextData[$valueHtml]) or
    array_key_exists($valueHtml, $contextData) before using the value, and only
    build the <a href=...> link when the resolved key exists; if the key is missing,
    fall back to outputting an escaped $valueHtml (or skip the link) to avoid
    undefined index notices.
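
The fallback described above can also be written without the nested ternaries. A minimal sketch, assuming a hypothetical free function `renderTypeValue` (not part of the PR) that receives the context map and one @type value:

```php
<?php

/**
 * Hypothetical helper (not from the PR): render one @type value, linking it
 * only when the JSON-LD context maps the term to a URI.
 *
 * @param array<string, mixed> $contextData term => URI map taken from the @context
 */
function renderTypeValue(array $contextData, string $type): string
{
    $label = htmlspecialchars($type, ENT_QUOTES);
    $uri = $contextData[$type] ?? null;

    // Missing or non-string mapping: fall back to the escaped literal.
    if (!is_string($uri) || $uri === '') {
        return $label;
    }

    return '<a href="' . htmlspecialchars($uri, ENT_QUOTES) . '">' . $label . '</a>';
}
```

The `??` operator avoids the undefined-index notice, and the early return keeps the linked and unlinked cases visibly separate.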
    

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 10

    ♻️ Duplicate comments (11)
    resources/ro-crate-preview-out.html (6)

    7-7: Remove empty stylesheet href (avoid pointless request).

    Empty href triggers an unnecessary HTTP request and can delay first paint.

    Apply this diff:

    -        <link rel="stylesheet" href="">
    +        <!-- External stylesheet placeholder removed; add a real stylesheet when available -->

    227-233: Fix empty headings and names — use a fallback when “name” is empty.

    These sections render an empty <h2> and an empty “name”. At minimum, fall back to @id (basename) in this artifact; ideally, fix the generator to treat empty strings as missing.

    Apply this minimal content fix here:

    -                    <h2></h2>
    +                    <h2>ERR599171.merged.motus.tsv</h2>
    -                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.motus.tsv</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat ...
    -                    ... <li><span class="property">name ... </span>: </li></ul>
    +                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.motus.tsv</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat ...
    +                    ... <li><span class="property">name ... </span>: ERR599171.merged.motus.tsv</li></ul>
    -                    <h2></h2>
    +                    <h2>ERR599171.merged.unfiltered_fasta</h2>
    -                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.unfiltered_fasta</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat ...
    -                    ... <li><span class="property">name ... </span>: </li></ul>
    +                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.unfiltered_fasta</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat ...
    +                    ... <li><span class="property">name ... </span>: ERR599171.merged.unfiltered_fasta</li></ul>

    Follow-up: I can patch ROCratePreviewGenerator to add a displayName helper that prefers non-empty “name” and falls back to basename(@id). Want me to open a PR on that file?
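
A rough idea of what such a helper could look like. This is a sketch under assumptions: `displayName` does not exist in ROCratePreviewGenerator yet, and the entity is assumed to be a flattened associative array with an "@id" key.

```php
<?php

/**
 * Hypothetical helper (not yet in ROCratePreviewGenerator): choose a heading
 * for an entity, treating an empty or whitespace-only "name" as missing.
 *
 * @param array<string, mixed> $entity flattened entity with at least "@id"
 */
function displayName(array $entity): string
{
    $name = $entity['name'] ?? '';
    if (is_string($name) && trim($name) !== '') {
        return $name;
    }

    $id = (string) ($entity['@id'] ?? '');
    // basename() yields the last path segment, e.g. "results/a.tsv" -> "a.tsv".
    $base = basename($id);
    return $base !== '' ? $base : $id;
}
```

Using the same function for both the section heading and the displayed “name” would keep the two consistent.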


    253-294: Correct MIME type for .gz files: application/gzip (not zip).

    All FASTA/FA/TSV assets compressed with .gz should use application/gzip. Some entries also miss encodingFormat; keep consistent.

    Apply this diff subset (repeat for each .fa.gz/.fasta.gz/.tsv.gz occurrence):

    -...<span class="property">encodingFormat ... </span>: application/zip</li>
    +...<span class="property">encodingFormat ... </span>: application/gzip</li>

    Examples to update in this block:

    • results/sequence-categorisation/5_8S.fa.gz
    • LSU_rRNA_archaea.RF02540.fa.gz
    • LSU_rRNA_bacteria.RF02541.fa.gz
    • LSU_rRNA_eukarya.RF02543.fa.gz
    • SSU_rRNA_archaea.RF01959.fa.gz
    • SSU_rRNA_bacteria.RF00177.fa.gz
    • SSU_rRNA_eukarya.RF01960.fa.gz
    • Bacteria_large_SRP.RF01854.fasta.gz
    • Bacteria_small_SRP.RF00169.fasta.gz
    • RNaseP_bact_a.RF00010.fasta.gz
    • alpha_tmRNA.RF01849.fasta.gz
    • cyano_tmRNA.RF01851.fasta.gz (also add missing encodingFormat if intended)
    • tRNA-Sec.RF01852.fasta.gz
    • tRNA.RF00005.fasta.gz
    • tmRNA.RF00023.fasta.gz
    

    302-337: Fix typos and incorrect encodingFormat values (SSU/LSU, TSV/TXT/HTML/BIOM).

    • “indentification” → “identification”.
    • The SSU section heading/name currently says “LSU”; correct it to SSU.
    • .tsv → text/tab-separated-values
    • .txt → text/plain
    • .html → text/html
    • .biom (HDF5) → application/x-hdf5
    • .biom (JSON) → application/biom+json (or application/json if you prefer stricter IANA types).

    Apply these diffs:

    -                    <h2>LSU sequences used for indentification</h2>
    +                    <h2>SSU sequences used for identification</h2>
    -...<span class="property">encodingFormat ... </span>: application/zip</li>
    -...<span class="property">name ... </span>: LSU sequences used for indentification</li>
    +...<span class="property">encodingFormat ... </span>: application/gzip</li>
    +...<span class="property">name ... </span>: SSU sequences used for identification</li>
    
    -...ERR599171.merged_SSU.fasta.mseq.tsv...</span>: application/json-ld</li>
    +...ERR599171.merged_SSU.fasta.mseq.tsv...</span>: text/tab-separated-values</li>
    -...ERR599171.merged_SSU.fasta.mseq.txt...</span>: application/json-ld</li>
    +...ERR599171.merged_SSU.fasta.mseq.txt...</span>: text/plain</li>
    -.../SSU/krona.html...</span>: application/html</li>
    +.../SSU/krona.html...</span>: text/html</li>
    -...merged_SSU.fasta.mseq_hdf5.biom...</span>: application/json-ld</li>
    +...merged_SSU.fasta.mseq_hdf5.biom...</span>: application/x-hdf5</li>
    -...merged_SSU.fasta.mseq_json.biom...</span>: application/json-ld</li>
    +...merged_SSU.fasta.mseq_json.biom...</span>: application/biom+json</li>

    And for LSU equivalents:

    -                    <h2>LSU sequences used for indentification</h2>
    +                    <h2>LSU sequences used for identification</h2>
    -.../LSU/ERR599171.merged_LSU.fasta.mseq.gz...</span>: application/zip</li>
    +.../LSU/ERR599171.merged_LSU.fasta.mseq.gz...</span>: application/gzip</li>
    -...merged_LSU.fasta.mseq.tsv...</span>: application/json-ld</li>
    +...merged_LSU.fasta.mseq.tsv...</span>: text/tab-separated-values</li>
    -...merged_LSU.fasta.mseq.txt...</span>: application/json-ld</li>
    +...merged_LSU.fasta.mseq.txt...</span>: text/plain</li>
    -.../LSU/krona.html...</span>: application/html</li>
    +.../LSU/krona.html...</span>: text/html</li>
    -...merged_LSU.fasta.mseq_hdf5.biom...</span>: application/json-ld</li>
    +...merged_LSU.fasta.mseq_hdf5.biom...</span>: application/x-hdf5</li>
    -...merged_LSU.fasta.mseq_json.biom...</span>: application/json-ld</li>
    +...merged_LSU.fasta.mseq_json.biom...</span>: application/biom+json</li>

    356-361: Correct MIME type for gzipped TSVs.

    .gz TSVs should be application/gzip.

    Apply:

    -...ERR599171.merged.hmm.tsv.gz...</span>: application/zip</li>
    +...ERR599171.merged.hmm.tsv.gz...</span>: application/gzip</li>
    -...ERR599171.merged_CDS.I5.tsv.gz...</span>: application/zip</li>
    +...ERR599171.merged_CDS.I5.tsv.gz...</span>: application/gzip</li>
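
Since the same encodingFormat mistakes recur across many entries, the values could be derived from the file name where the metadata is built. A minimal sketch, assuming a hypothetical function `guessEncodingFormat` that is not part of the PR; it covers only the extensions discussed in this review:

```php
<?php

/**
 * Hypothetical helper (not from the PR): guess encodingFormat from a file
 * name. The .biom extension is left out on purpose because the HDF5 and JSON
 * variants share it and cannot be told apart by extension alone.
 */
function guessEncodingFormat(string $filename): ?string
{
    $map = [
        'gz'   => 'application/gzip',
        'zip'  => 'application/zip',
        'tsv'  => 'text/tab-separated-values',
        'txt'  => 'text/plain',
        'html' => 'text/html',
    ];

    // pathinfo() returns the final extension, so "x.tsv.gz" resolves to "gz".
    $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return $map[$ext] ?? null;
}
```

Returning null for unknown extensions leaves the decision to the caller instead of guessing a wrong type.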

    395-397: Fix broken @type list markup and normalize hasPart links.

    Types are split into malformed list items; hasPart contains an unlinked file name.

    Apply this diff:

    -                    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li><li><span class="property"> @type </span>: File</li>,  SoftwareSourceCode</li>,  ComputationalWorkflow</li><li><span class="property">affiliation ... </span>: University of Flatland</li><li><span class="property">author ... </span>: EMO BON</li><li><span class="property">hasPart ... </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>,  TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</li><li><span class="property">license ... </span>: <a href=#https://www.apache.org/licenses/LICENSE-2.0> Apache License 2.0 </a></li><li><span class="property">name ... </span>: metaGOflow</li><li><span class="property">url ... </span>: <a href=https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0> https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0 </a></li></ul>
    +                    <ul>
    +                      <li><span class="property"> @id </span>: #workflow/metaGOflow</li>
    +                      <li><span class="property"> @type </span>: File, SoftwareSourceCode, ComputationalWorkflow</li>
    +                      <li><span class="property">affiliation ... </span>: University of Flatland</li>
    +                      <li><span class="property">author ... </span>: EMO BON</li>
    +                      <li><span class="property">hasPart ... </span>: <a href="#config.yml">MetaGOflow configuration file</a>, <a href="#ERR599171.yml">TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</a></li>
    +                      <li><span class="property">license ... </span>: <a href="#https://www.apache.org/licenses/LICENSE-2.0">Apache License 2.0</a></li>
    +                      <li><span class="property">name ... </span>: metaGOflow</li>
    +                      <li><span class="property">url ... </span>: <a href="https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0">https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0</a></li>
    +                    </ul>

    assets/ro-crate-preview.html (4)

    7-7: Remove empty stylesheet href (avoid pointless request).

    Empty href issues an unnecessary request; remove until a real CSS file is available.

    Apply:

    -        <link rel="stylesheet" href="">
    +        <!-- External stylesheet placeholder removed; add one when available -->

    207-218: Sanitize section ids starting with “#” (normalize ids and hrefs).

    Section ids like id="#oneExtra" force brittle href="##oneExtra". Normalize ids by stripping the leading “#”; render href="#oneExtra".

    Example (apply consistently):

    -                            <section id="#oneExtra">
    +                            <section id="oneExtra">
    ...
    -                    <ul><li><span class="property"> @id </span>: #oneExtra</li>
    +                    <ul><li><span class="property"> @id </span>: #oneExtra</li>

    And update all references:

    -... href=##oneExtra ...
    +... href="#oneExtra" ...

    Prefer applying this in the generator to avoid repeating errors.


    185-187: Fix broken anchors and list markup (href=##..., stray commas/</li>).

    • Change href=##something to href="#something".
    • Keep all hasPart links inside a single <li>; remove stray commas/closing tags.

    Apply this focused fix to the first occurrences (repeat pattern for others in the line):

    -... <a href=##other-files> other files </a><li><span class="property">author ...
    +... <a href="#other-files"> other files </a></li><li><span class="property">author ...

    And for authors/exifData:

    -... <a href=##Xuan_Yang> Yang X </a>,  <a href=#https://orcid.org/0000-0003-0337-8722> Teng H </a>,  <a href=##Benjamin_Reames> Reames B </a> ...
    +... <a href="#Xuan_Yang"> Yang X </a>,  <a href="#https://orcid.org/0000-0003-0337-8722"> Teng H </a>,  <a href="#Benjamin_Reames"> Reames B </a> ...
    -... <a href=##datasetTypes> Dataset type </a></li>,  <a href=##additionalInfo1> Additional information </a>,  <a href=##additionalInfo2> ...
    +... <a href="#datasetTypes"> Dataset type </a>,  <a href="#additionalInfo1"> Additional information </a>,  <a href="#additionalInfo2"> ...

    204-206: Fix malformed @type list (“CreativeWork, Dataset”)

    @type is broken into multiple list items with a stray comma.

    Apply:

    -                    <ul><li><span class="property"> @id </span>: <a href=https://doi.org/10.5524/100425> https://doi.org/10.5524/100425 </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/CreativeWork> CreativeWork </a></li>,  <a href=http://schema.org/Dataset> Dataset </a><li><span class="property">conformsTo ...
    +                    <ul><li><span class="property"> @id </span>: <a href=https://doi.org/10.5524/100425> https://doi.org/10.5524/100425 </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/CreativeWork> CreativeWork </a>, <a href=http://schema.org/Dataset> Dataset </a></li><li><span class="property">conformsTo ...
    resources/ro-crate-preview.html (1)

    383-388: Clarify licensing scope to avoid ambiguity

    This preview includes both “Apache License 2.0” and “Creative Commons (CC-BY 4.0)”. Only the workflow explicitly references Apache-2.0; the root Dataset has no license shown here. Clarify at the root whether the dataset has a license and whether licenses apply per-entity.

    Suggestion:

    • Add a license property at the root Dataset if applicable.
    • For mixed licenses, add a short note in the root description explaining which entities are under which license, and ensure each entity section links to its specific license.

    Also applies to: 392-397, 185-186

    🧹 Nitpick comments (4)
    assets/ro-crate-preview.html (1)

    215-215: Tidy encodingFormat + PRONOM link into one list item.

    There’s a dangling “, ” outside the <li>. Keep related content together.

    Apply:

    -...<li><span class="property">encodingFormat ... </span>: application/zip</li>,  <a href=https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263> https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263 </a><li>...
    +...<li><span class="property">encodingFormat ... </span>: application/zip (<a href="https://www.nationalarchives.gov.uk/PRONOM/x-fmt/263">PRONOM x-fmt/263</a>)</li><li>...
    resources/ro-crate-preview.html (3)

    185-186: Prefer HTTPS for schema.org and other vocab/namespace links

    Numerous anchors point to http://schema.org and http://purl.org/... Use HTTPS to avoid mixed-content warnings and follow best practices. schema.org and purl endpoints support HTTPS.

    Representative changes (apply broadly):

    -<a href=http://schema.org/Dataset> Dataset </a>
    +<a href=https://schema.org/Dataset> Dataset </a>
    -<a href=http://purl.org/dc/terms/conformsTo> [?] </a>
    +<a href=https://purl.org/dc/terms/conformsTo> [?] </a>

    Recommend fixing at the generator level to ensure consistent secure links.

    Also applies to: 189-190, 192-193, 195-196, 198-200, 201-205, 207-208, 210-214, 296-301, 384-391, 393-397


    80-88: Clean up unused CSS or align markup and styles

    • Styles for .property-list dt/dd exist, but markup uses
        /
      • . Either switch to
        or remove the unused CSS.
      • .back-to-top styles exist without a corresponding element or JS behavior.

      Not critical, but trimming or aligning improves maintainability.

      Also applies to: 137-139, 148-167


      1-405: General robustness of HTML ids for RO-Crate @id values

      Several section ids contain slashes and full URLs. While technically allowed, these make fragment references and CSS selectors fragile. Consider:

      • Using slugified HTML id attributes (e.g., data-id-hash) while displaying the original @id as text.
      • Keeping a data-rocrate-id attribute with the original @id for traceability and mapping.

      I can provide a small helper to slugify and map @id -> fragment consistently in the generator.
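
One possible shape for such a helper is sketched below, in Python for illustration only (the generator itself is PHP). The names `fragment_id` and `section_open_tag`, the `e-` prefix, the 40-character slug limit, and the 8-character SHA-1 suffix are all assumptions chosen for this example, not existing behavior of the tool.

```python
import hashlib
import re
from html import escape


def fragment_id(rocrate_id: str) -> str:
    """Derive a stable, selector-friendly HTML id from an RO-Crate @id.

    A readable slug is combined with a short hash of the full @id so that
    two ids with the same truncated slug still get distinct fragments.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "-", rocrate_id).strip("-").lower()[:40]
    digest = hashlib.sha1(rocrate_id.encode("utf-8")).hexdigest()[:8]
    return f"e-{slug}-{digest}" if slug else f"e-{digest}"


def section_open_tag(rocrate_id: str) -> str:
    """Open a <section> whose id is the slug; the original @id is kept
    in a data-rocrate-id attribute for traceability."""
    original = escape(rocrate_id, quote=True)
    return f'<section id="{fragment_id(rocrate_id)}" data-rocrate-id="{original}">'


if __name__ == "__main__":
    print(section_open_tag("results/functional-annotation/stats/go.stats"))
```

Internal links in the preview would then need to point at `#` plus the same `fragment_id(...)` value, so the mapping has to be computed in one place and reused.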

    <ul><li><span class="property"> @id </span>: <a href=https://assets.gigadb-cdn.net/live/images/datasets/32d9369e-500d-5347-8842-9fe46cdc3693/102736.png> https://assets.gigadb-cdn.net/live/images/datasets/32d9369e-500d-5347-8842-9fe46cdc3693/102736.png </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li></ul> </section>
    <section id="https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt">
    <h2>readme_102736.txt</h2>
    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: readme_102736.txt</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 9.30 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/txt</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href=##oneExtra> Extra Information </a></li></ul> </section>

    ⚠️ Potential issue

    encodingFormat “text/txt” is invalid — use text/plain.

    text/txt is not a standard media type.

    Apply:

    -...<span class="property">encodingFormat ... </span>: text/txt</li>
    +...<span class="property">encodingFormat ... </span>: text/plain</li>
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: readme_102736.txt</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 9.30 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/txt</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href=##oneExtra> Extra Information </a></li></ul> </section>
    <ul><li><span class="property"> @id </span>: <a href=https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt> https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/readme_102736.txt </a></li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: readme_102736.txt</li><li><span class="property">contentSize <a href=http://schema.org/contentSize> [?] </a> </span>: 9.30 kB</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">exifData <a href=http://schema.org/exifData> [?] </a> </span>: <a href=##oneExtra> Extra Information </a></li></ul> </section>
    🤖 Prompt for AI Agents
    In assets/ro-crate-preview.html around line 197, the encodingFormat value is set
    to the non-standard media type "text/txt"; replace it with the standard MIME
    type "text/plain" so the file uses a valid media type (update the HTML rendering
    that outputs encodingFormat to emit text/plain instead of text/txt).
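
If the value is corrected where encodingFormat is rendered, a small alias table is one way to do it. The sketch below is Python for illustration only (the tool is PHP); `NON_STANDARD_MEDIA_TYPES` and `normalize_media_type` are hypothetical names, and the table contains only the alias discussed in this comment.

```python
# Assumption: a hand-maintained table of known non-standard values.
NON_STANDARD_MEDIA_TYPES = {
    "text/txt": "text/plain",
}


def normalize_media_type(value: str) -> str:
    """Map a known non-standard media type to its standard form.

    Lookup is case-insensitive; values not in the table are returned
    as given, apart from surrounding whitespace being trimmed.
    """
    cleaned = value.strip()
    return NON_STANDARD_MEDIA_TYPES.get(cleaned.lower(), cleaned)


if __name__ == "__main__":
    print(normalize_media_type("text/txt"))
```

Correcting the value in the source metadata would arguably be cleaner than rewriting it at render time; the table approach only helps when the input cannot be changed.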
    

    Comment on lines +372 to +379
    <h2>Geno Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/go.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Geno Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/ko.stats">
    <h2>Kegg Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/ko.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Kegg Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/pfam.stats">
    <h2>Pfam summary statistcs</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/pfam.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Pfam summary statistcs</li></ul> </section>

    ⚠️ Potential issue

    Fix typos: “Geno Ontology” and “statistcs”.

    Use “Gene Ontology” and “statistics”.

    Apply this diff:

    -                    <h2>Geno Ontology summary statistics</h2>
    +                    <h2>Gene Ontology summary statistics</h2>
    -...<span class="property">name ... </span>: Geno Ontology summary statistics</li>
    +...<span class="property">name ... </span>: Gene Ontology summary statistics</li>
    -                    <h2>Pfam summary statistcs</h2>
    +                    <h2>Pfam summary statistics</h2>
    -...<span class="property">name ... </span>: Pfam summary statistcs</li>
    +...<span class="property">name ... </span>: Pfam summary statistics</li>
    📝 Committable suggestion

    Suggested change
    <h2>Geno Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/go.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Geno Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/ko.stats">
    <h2>Kegg Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/ko.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Kegg Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/pfam.stats">
    <h2>Pfam summary statistcs</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/pfam.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Pfam summary statistcs</li></ul> </section>
    <h2>Gene Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/go.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href="http://schema.org/encodingFormat"> [?] </a> </span>: text/plain</li><li><span class="property">name <a href="http://schema.org/name"> [?] </a> </span>: Gene Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/ko.stats">
    <h2>Kegg Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/ko.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href="http://schema.org/encodingFormat"> [?] </a> </span>: text/plain</li><li><span class="property">name <a href="http://schema.org/name"> [?] </a> </span>: Kegg Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/pfam.stats">
    <h2>Pfam summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/pfam.stats</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href="http://schema.org/encodingFormat"> [?] </a> </span>: text/plain</li><li><span class="property">name <a href="http://schema.org/name"> [?] </a> </span>: Pfam summary statistics</li></ul> </section>
    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 372, 375, 378 (two [?] links per line): "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Cleartext Transmission of Sensitive Information [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)

    🤖 Prompt for AI Agents
    In resources/ro-crate-preview-out.html around lines 372 to 379, fix the typos by
    replacing "Geno Ontology" with "Gene Ontology" and "statistcs" with "statistics"
    wherever they appear (update the <h2> headings and the corresponding name fields
    for the Gene Ontology and Pfam sections so they read "Gene Ontology summary
    statistics" and "Pfam summary statistics").
    

    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>RO-Crate Preview: MetaGoFlow Results</title>
    <link rel="stylesheet" href="">

    ⚠️ Potential issue

    Remove empty stylesheet link to avoid unnecessary failing request

    Empty href on a stylesheet causes the browser to re-request the current HTML as CSS, producing console errors and wasted network work. You already have embedded styles.

    Apply this diff:

    -        <link rel="stylesheet" href="">
    +        <!-- no external stylesheet; styles embedded below -->
    📝 Committable suggestion

    Suggested change
    <link rel="stylesheet" href="">
    <!-- no external stylesheet; styles embedded below -->
    🤖 Prompt for AI Agents
    In resources/ro-crate-preview.html around line 7, remove the empty stylesheet
    link (<link rel="stylesheet" href="">) so the browser does not re-request the
    HTML as CSS; either delete that tag entirely or replace href with the correct
    stylesheet path if a real external stylesheet is required.
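
At the generator level, the tag can simply be omitted when no stylesheet path is configured. A minimal sketch, in Python for illustration only (the generator is PHP); `stylesheet_link` is a hypothetical helper name.

```python
from html import escape
from typing import Optional


def stylesheet_link(href: Optional[str]) -> str:
    """Return a <link> tag for a real stylesheet path, or an empty string.

    An empty or whitespace-only href yields no tag at all, so the browser
    never re-requests the current page as CSS.
    """
    if href is None or not href.strip():
        return ""
    return f'<link rel="stylesheet" href="{escape(href.strip(), quote=True)}">'


if __name__ == "__main__":
    print(repr(stylesheet_link("")))
    print(stylesheet_link("preview.css"))
```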
    

    Comment on lines 252 to 276
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/5_8S.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/LSU_rRNA_archaea.RF02540.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/LSU_rRNA_archaea.RF02540.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/LSU_rRNA_bacteria.RF02541.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/LSU_rRNA_bacteria.RF02541.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/LSU_rRNA_eukarya.RF02543.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/LSU_rRNA_eukarya.RF02543.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/SSU_rRNA_archaea.RF01959.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/SSU_rRNA_archaea.RF01959.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/SSU_rRNA_bacteria.RF00177.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/SSU_rRNA_bacteria.RF00177.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/SSU_rRNA_eukarya.RF01960.fa.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/SSU_rRNA_eukarya.RF01960.fa.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/Bacteria_large_SRP.RF01854.fasta.gz">
    <h2>Sequence categorisation</h2>
    <ul><li><span class="property"> @id </span>: results/sequence-categorisation/Bacteria_large_SRP.RF01854.fasta.gz</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">description <a href=http://schema.org/description> [?] </a> </span>: Identify specific loci in the sample.</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Sequence categorisation</li></ul> </section>
    <section id="results/sequence-categorisation/Bacteria_small_SRP.RF00169.fasta.gz">
    <h2>Sequence categorisation</h2>

    💡 Verification agent

    🧩 Analysis chain

    Use application/gzip for .gz files (not application/zip)

    Many encodingFormat values for “.fa.gz/.fasta.gz/.tsv.gz” files use application/zip. These are gzip-compressed files, not ZIP containers; the registered media type for them is application/gzip (RFC 6713).

    Apply representative fixes (repeat similarly for all .gz entries):

    -...5_8S.fa.gz...</span> : application/zip
    +...5_8S.fa.gz...</span> : application/gzip
    -...LSU_rRNA_archaea.RF02540.fa.gz...</span> : application/zip
    +...LSU_rRNA_archaea.RF02540.fa.gz...</span> : application/gzip
    -...ERR599171.merged.hmm.tsv.gz...</span> : application/zip
    +...ERR599171.merged.hmm.tsv.gz...</span> : application/gzip
    -...ERR599171.merged_CDS.I5.tsv.gz...</span> : application/zip
    +...ERR599171.merged_CDS.I5.tsv.gz...</span> : application/gzip

    I recommend fixing this in the preview generator so the correct media type is emitted consistently.

    Also applies to: 279-279, 282-282, 288-288, 291-291, 294-294


    🏁 Script executed:

    #!/bin/bash
    # Locate all occurrences of application/zip in the preview HTML to confirm .gz files are mislabeled
    rg -n "application/zip" -C2 resources/ro-crate-preview.html

    Length of output: 14850


    Correct MIME type for .gz archives
    All encodingFormat entries for files ending in .gz are currently set to application/zip. Since these are gzip archives, they must be application/gzip. Please update the preview generator so that every .gz file emitted in resources/ro-crate-preview.html uses the correct media type.

    • File: resources/ro-crate-preview.html
    – Every <li>…encodingFormat…: application/zip</li> where the associated @id ends with .gz
    • Replace application/zip with application/gzip for all such entries.
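
The rule in the two bullets above can be sketched as follows. This is Python for illustration only (the preview generator is PHP), and `correct_gzip_format` is a hypothetical name. It assumes encodingFormat is either a single string or a list of values, and it rewrites only the mislabeled application/zip entry, leaving anything else (such as a PRONOM URL) untouched.

```python
from typing import List, Union

EncodingFormat = Union[str, List[str]]


def correct_gzip_format(entity_id: str, encoding_format: EncodingFormat) -> EncodingFormat:
    """Replace application/zip with application/gzip when the @id ends in .gz."""
    if not entity_id.lower().endswith(".gz"):
        return encoding_format
    if isinstance(encoding_format, list):
        return [
            "application/gzip" if value == "application/zip" else value
            for value in encoding_format
        ]
    if encoding_format == "application/zip":
        return "application/gzip"
    return encoding_format


if __name__ == "__main__":
    print(correct_gzip_format("results/sequence-categorisation/5_8S.fa.gz", "application/zip"))
```

Applying the check before the <li> entry is written keeps real ZIP files labeled as application/zip while every .gz entry is emitted consistently.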

    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 252, 255, 258, 261, 264, 267, 270, 273, 276 (four links per line: one File, three [?]): "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File, [?]
    Note: [CWE-319] Cleartext Transmission of Sensitive Information [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)

    🤖 Prompt for AI Agents
    In resources/ro-crate-preview.html around lines 252–276, the encodingFormat for
    files whose @id ends with .gz is incorrectly hard-coded as application/zip;
    update the preview generator/template so that when the file path (the @id)
    endsWith('.gz') you emit encodingFormat: application/gzip (leave other formats
    unchanged), i.e. detect the .gz suffix before writing the <li> entry and
    substitute application/gzip for those cases.
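The suffix check described above could look roughly like the following. This is a minimal sketch, not the generator's actual code: `encodingFormatFor` and its `$declared` parameter are hypothetical names, and `str_ends_with` assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ro-crate-php): returns the media type to
 * emit for a file entity, overriding the declared value for .gz paths.
 */
function encodingFormatFor(string $id, string $declared): string
{
    // Gzip-compressed files get application/gzip regardless of the declared value.
    if (str_ends_with(strtolower($id), '.gz')) {
        return 'application/gzip';
    }

    // Every other format is left unchanged.
    return $declared;
}
```

Calling it just before the `<li>` entry is written keeps the override in one place, so other formats pass through untouched.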
    

    Comment on lines 306 to 319
    <h2>Tab-separated formatted taxon counts for SSU sequences</h2>
    <ul><li><span class="property"> @id </span>: results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq.tsv</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/json-ld</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Tab-separated formatted taxon counts for SSU sequences</li></ul> </section>
    <section id="results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq.txt">
    <h2>Text-based formatted taxon counts for SSU sequences</h2>
    <ul><li><span class="property"> @id </span>: results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq.txt</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/json-ld</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Text-based formatted taxon counts for SSU sequences</li></ul> </section>
    <section id="results/taxonomy-summary/SSU/krona.html">
    <h2>Krona summary of LSU taxonomic inventory</h2>
    <ul><li><span class="property"> @id </span>: results/taxonomy-summary/SSU/krona.html</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/html</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Krona summary of LSU taxonomic inventory</li></ul> </section>
    <section id="results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq_hdf5.biom">
    <h2>BIOM formatted hdf5 taxon counts for SSU sequences</h2>
    <ul><li><span class="property"> @id </span>: results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq_hdf5.biom</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/json-ld</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: BIOM formatted hdf5 taxon counts for SSU sequences</li></ul> </section>
    <section id="results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq_json.biom">
    <h2>BIOM formatted taxon counts for SSU sequences</h2>
    <ul><li><span class="property"> @id </span>: results/taxonomy-summary/SSU/ERR599171.merged_SSU.fasta.mseq_json.biom</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/json-ld</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: BIOM formatted taxon counts for SSU sequences</li></ul> </section>

    💡 Verification agent

    ❓ Verification inconclusive

    Correct encodingFormat media types for TSV/TXT/HTML/BIOM

    Several encodingFormat values are inaccurate:

    • TSV: use text/tab-separated-values
    • TXT: use text/plain
    • HTML: use text/html (not application/html)
    • BIOM (HDF5): use application/x-hdf5
    • BIOM (JSON): use application/json

    Apply these diffs:

    -...mseq.tsv...</span> : application/json-ld
    +...mseq.tsv...</span> : text/tab-separated-values
    -...mseq.txt...</span> : application/json-ld
    +...mseq.txt...</span> : text/plain
    -...SSU/krona.html...</span> : application/html
    +...SSU/krona.html...</span> : text/html
    -...mseq_hdf5.biom...</span> : application/json-ld
    +...mseq_hdf5.biom...</span> : application/x-hdf5
    -...mseq_json.biom...</span> : application/json-ld
    +...mseq_json.biom...</span> : application/json

    Repeat the TSV/TXT/HTML/HDF5/JSON corrections for the LSU section (lines 323–337) accordingly.

    Also applies to: 324-337, 312-313, 330-331


    Correct encodingFormat media types in RO-Crate preview

    The current encodingFormat values in resources/ro-crate-preview.html are incorrect for TSV, TXT, HTML and BIOM files. Please update the SSU section (lines 306–319) and the corresponding LSU section (lines 323–337) as follows:

    • change TSV files from application/json-ld to text/tab-separated-values
    • change TXT files from application/json-ld to text/plain
    • change HTML files from application/html to text/html
    • change BIOM HDF5 files from application/json-ld to application/x-hdf5
    • change BIOM JSON files from application/json-ld to application/json

    Apply these diffs in both SSU and LSU sections:

    -<li><span class="property">encodingFormat</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat</span>: text/tab-separated-values</li>

    (for .mseq.tsv)

    -<li><span class="property">encodingFormat</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat</span>: text/plain</li>

    (for .mseq.txt)

    -<li><span class="property">encodingFormat</span>: application/html</li>
    +<li><span class="property">encodingFormat</span>: text/html</li>

    (for krona.html)

    -<li><span class="property">encodingFormat</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat</span>: application/x-hdf5</li>

    (for .mseq_hdf5.biom)

    -<li><span class="property">encodingFormat</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat</span>: application/json</li>

    (for .mseq_json.biom)

    Committable suggestion skipped: line range outside the PR's diff.

    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 306-306: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 306-306: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 309-309: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 309-309: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 312-312: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 312-312: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 315-315: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 315-315: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 318-318: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 318-318: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    🤖 Prompt for AI Agents
    In resources/ro-crate-preview.html around lines 306–319 (SSU) and the
    corresponding LSU block around lines 323–337, the encodingFormat media types are
    wrong; update each file entry: change .mseq.tsv entries from application/json-ld
    to text/tab-separated-values, change .mseq.txt entries from application/json-ld
    to text/plain, change krona.html entries from application/html to text/html,
    change .mseq_hdf5.biom entries from application/json-ld to application/x-hdf5,
    and change .mseq_json.biom entries from application/json-ld to application/json;
    apply the same replacements in both SSU and LSU sections so each <li>
    encodingFormat reflects the correct media type.
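If these values are produced by code rather than typed by hand, a suffix lookup is one way to keep the SSU and LSU sections consistent. The sketch below is an assumption about how that could be done: `MEDIA_TYPES_BY_SUFFIX` and `mediaTypeFor` are hypothetical names, the table holds only the media types proposed in this comment, and `str_ends_with` assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical lookup (not part of ro-crate-php): file-name suffixes mapped
 * to the media types proposed in this review comment.
 */
const MEDIA_TYPES_BY_SUFFIX = [
    '_hdf5.biom' => 'application/x-hdf5',
    '_json.biom' => 'application/json',
    '.tsv'       => 'text/tab-separated-values',
    '.txt'       => 'text/plain',
    '.html'      => 'text/html',
];

/** Returns the media type for a known suffix, or null when none matches. */
function mediaTypeFor(string $id): ?string
{
    foreach (MEDIA_TYPES_BY_SUFFIX as $suffix => $mediaType) {
        if (str_ends_with($id, $suffix)) {
            return $mediaType;
        }
    }

    // Unknown suffix: the caller keeps whatever value was declared.
    return null;
}
```

Returning `null` for unknown suffixes leaves the decision with the caller instead of guessing a default type.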
    

    Comment on lines 371 to 379
    <section id="results/functional-annotation/stats/go.stats">
    <h2>Geno Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/go.stats</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Geno Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/ko.stats">
    <h2>Kegg Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/ko.stats</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Kegg Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/pfam.stats">
    <h2>Pfam summary statistcs</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/pfam.stats</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Pfam summary statistcs</li></ul> </section>

    ⚠️ Potential issue

    Fix typos: Gene/KEGG and statistics

    These are user-visible labels.

    Apply this diff:

    -                    <h2>Geno Ontology summary statistics</h2>
    +                    <h2>Gene Ontology summary statistics</h2>
    @@
    -                    <h2>Kegg Ontology summary statistics</h2>
    +                    <h2>KEGG ontology summary statistics</h2>
    @@
    -                    <h2>Pfam summary statistcs</h2>
    +                    <h2>Pfam summary statistics</h2>
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    <section id="results/functional-annotation/stats/go.stats">
    <h2>Gene Ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/go.stats</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Gene Ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/ko.stats">
    <h2>KEGG ontology summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/ko.stats</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: KEGG ontology summary statistics</li></ul> </section>
    <section id="results/functional-annotation/stats/pfam.stats">
    <h2>Pfam summary statistics</h2>
    <ul><li><span class="property"> @id </span>: results/functional-annotation/stats/pfam.stats</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: Pfam summary statistics</li></ul> </section>
    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 372-372: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 372-372: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 375-375: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 375-375: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 378-378: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 378-378: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    🤖 Prompt for AI Agents
    In resources/ro-crate-preview.html around lines 371-379, update the user-visible
    labels to fix typos: change "Geno Ontology summary statistics" to "Gene Ontology
    summary statistics" and update its corresponding name value; change "Kegg
    Ontology summary statistics" to "KEGG ontology summary statistics" and update
    its name value; correct "Pfam summary statistcs" to "Pfam summary statistics" in
    both the h2 and the name field. Ensure only the displayed text and name strings
    are edited (no structural changes).
    

    Comment on lines 395 to 397
    <section id="#workflow/metaGOflow">
    <h2>metaGOflow</h2>
    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li><li><span class="property"> @type </span>: <a href=http://schema.org/MediaObject> File </a></li>, <a href=http://schema.org/SoftwareSourceCode> SoftwareSourceCode </a>, <a href=https://bioschemas.org/ComputationalWorkflow> ComputationalWorkflow </a><li><span class="property">affiliation <a href=http://schema.org/affiliation> [?] </a> </span>: University of Flatland</li><li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>: EMO BON</li><li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>, TARA_OCEANS_SAMPLE_3steps/ERR599171.yml<li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href=#https://www.apache.org/licenses/LICENSE-2.0> Apache License 2.0 </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: metaGOflow</li><li><span class="property">url <a href=http://schema.org/url> [?] </a> </span>: <a href=https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0> https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0 </a></li></ul> </section>

    ⚠️ Potential issue

    Sanitize section id: hash character in HTML id breaks fragment links

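    With id="#workflow/metaGOflow", a link to href="#workflow/metaGOflow" does not resolve: the browser drops the leading '#' from the URL fragment and looks for an element whose id is "workflow/metaGOflow". Use the fragment text without '#' as the element id.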

    Apply this diff:

    -                            <section id="#workflow/metaGOflow">
    +                            <section id="workflow/metaGOflow">
    ...
    -                    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li>
    +                    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li>

    Note: Keep displaying the RO-Crate @id with '#', but ensure any internal links use href="#workflow/metaGOflow".

    Committable suggestion skipped: line range outside the PR's diff.

    🧰 Tools
    🪛 ast-grep (0.38.6)

    [warning] 396-396: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: File
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 396-396: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: SoftwareSourceCode
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    [warning] 396-396: "This link points to a plaintext HTTP URL. Prefer an encrypted HTTPS URL if possible."
    Context: [?]
    Note: [CWE-319] Authentication Bypass by Primary Weakness [REFERENCES]
    - https://cwe.mitre.org/data/definitions/319.html

    (plaintext-http-link-html)


    🤖 Prompt for AI Agents
    In resources/ro-crate-preview.html around lines 395–397, the section element
    uses id="#workflow/metaGOflow" which breaks fragment links; change the element
    id to id="workflow/metaGOflow" (remove the leading '#') and ensure any internal
    anchor links that point to this fragment use href="#workflow/metaGOflow"; keep
    showing the RO-Crate @id string with the leading '#' in the displayed text only,
    not in the HTML id attribute.
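A possible shape for this in the preview generator, as a sketch only: `sectionIdFor` and `renderSection` are hypothetical names, the markup is simplified compared with the real template, and `str_starts_with` assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: derive the HTML id attribute from an RO-Crate @id. */
function sectionIdFor(string $entityId): string
{
    // Drop one leading '#', so href="#workflow/metaGOflow" resolves to this id.
    return str_starts_with($entityId, '#') ? substr($entityId, 1) : $entityId;
}

/** Hypothetical renderer: sanitized id attribute, unmodified @id in the text. */
function renderSection(string $entityId): string
{
    $attr = htmlspecialchars(sectionIdFor($entityId), ENT_QUOTES);
    $text = htmlspecialchars($entityId, ENT_QUOTES);

    return "<section id=\"{$attr}\"><p>@id: {$text}</p></section>";
}
```

The displayed `@id` still carries the '#', while the attribute value matches what an internal `href="#workflow/metaGOflow"` link looks up.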
    

    @coderabbitai coderabbitai bot left a comment


    Actionable comments posted: 3

    ♻️ Duplicate comments (1)
    Guide.md (1)

    63-63: Avoid “additional” redundancy

    “add an additional type” is redundant; addressed in the broader rewrite above, but flagging here for completeness.

    🧹 Nitpick comments (14)
    Guide.md (14)

    2-3: Tighten the intro and fix casing/plurals

    Clarifies audience and fixes minor phrasing/capitalization.

    -# Usage Guide for ro-crate-php
    -Below are some notes to pay attention to when developers are using the tool to manipulate RO-Crate Metadata file. The note is GigaDB-oriented. This note is written to help ease the use of the tool to create or manipulate the RO-Crate Metadata file concerning GigaDB datasets by removing unnecessary and only emphasizing relevant technical details about the specific standard of RO-Crate 1.2.
    +# Usage guide for ro-crate-php
    +Notes for developers using this tool to manipulate RO‑Crate metadata files. This guide is GigaDB‑oriented and focuses on RO‑Crate 1.2, emphasizing only the details relevant to GigaDB datasets.

    15-23: Clarify rules for @id vs literal and crate insertion

    Improves precision and grammar; “before or after” is ambiguous.

    -## Note
    -The general rule is that we use the @id construct (<b>true</b> flag if using the add/removePropertyPair methods) when referring to another entity, we otherwise use a plain literal (<b>false</b> flag if using the add/removePropertyPair methods). There are exceptions for specific constructs not following the rules.
    -
    -Another reminder is to add the entity to the crate before or after the creation of the entity.
    -
    -Also, only one entity with the same ID has to be created.
    -
    -In addition, name of an entity should be human-readable if it exists.
    +## Note
    +- Use an @id reference (pass <b>true</b> to add/removePropertyPair) when referring to another entity; otherwise use a plain literal (pass <b>false</b>).
    +- Add each entity to the crate exactly once; an @id must be unique within the crate.
    +- Ensure entity names are human‑readable when present.
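To make the reference-versus-literal rule concrete, here is a small sketch that builds the JSON-LD directly from PHP arrays. It deliberately avoids the library's add/removePropertyPair methods, whose exact signatures are not shown in this review; the identifiers `#alice` and `#example-org` and both names are placeholders.

```php
<?php
declare(strict_types=1);

// Hypothetical entities written as plain arrays (not the ro-crate-php API).
$person = [
    '@id'         => '#alice',                    // placeholder identifier
    '@type'       => 'Person',
    'name'        => 'Alice Example',             // plain literal, human-readable
    'affiliation' => ['@id' => '#example-org'],   // reference to another entity
];

$organization = [
    '@id'   => '#example-org',                    // placeholder identifier
    '@type' => 'Organization',
    'name'  => 'Example Organization',            // placeholder name
];

$graph = [$person, $organization];

// Each @id must appear only once within the crate.
$ids = array_column($graph, '@id');
$hasDuplicates = count($ids) !== count(array_unique($ids));

$json = json_encode(['@graph' => $graph], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES);
```

In the serialized output, `affiliation` is an object holding only `@id`, whereas `name` stays a bare string; that difference is what the true/false flag selects.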

    24-25: Clarify detached package behavior and preview naming

    State explicitly that the Descriptor’s @id remains constant even if the on-disk filename changes; address preview symmetry.

    -The metadata file always has ro-crate-metadata.json as the @id. The preview file has ro-crate-preview.json as the @id and filename. In detached package, i.e. the metadata file is not within the package, which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g. xxxx can be the dataset ID.
    +The metadata Descriptor entity always has @id "ro-crate-metadata.json". In a detached package (metadata file stored outside the payload), which is likely for GigaDB, the actual filename on disk may be renamed to "<dataset-id>-ro-crate-metadata.json", but its @id inside the JSON‑LD remains "ro-crate-metadata.json".
    +The preview file typically uses "ro-crate-preview.html" as both filename and @id; if you rename the file on disk for detached delivery, keep the internal @id consistent with your convention and document it.
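The distinction between the on-disk name and the descriptor's @id can be sketched as follows. `detachedMetadataFilename` is a hypothetical helper, the `about` URI is a placeholder, and 102736 is simply the example dataset ID from the PR description.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: on-disk file name for a detached metadata file. */
function detachedMetadataFilename(string $datasetId): string
{
    return $datasetId . '-ro-crate-metadata.json';
}

// The descriptor keeps its fixed @id regardless of the file name on disk.
$descriptor = [
    '@id'        => 'ro-crate-metadata.json',
    '@type'      => 'CreativeWork',
    'conformsTo' => ['@id' => 'https://w3id.org/ro/crate/1.2'],
    'about'      => ['@id' => 'https://example.org/dataset/102736'], // placeholder root @id
];

$filename = detachedMetadataFilename('102736');
```

Only `$filename` changes per dataset; the JSON-LD content written to that file still identifies the descriptor as `ro-crate-metadata.json`.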

    50-50: Subject–verb agreement and tone

    Minor grammar.

    -- **Thumbnail Handling**: When the thumbnail is incidental to the root dataset, usually the case, we do not include it in the hasPart of the root data entity and creates a File entity with the respective ID.
    +- **Thumbnail Handling**: When the thumbnail is incidental to the root dataset (usually the case), do not include it in the root’s hasPart; create a File entity with the respective @id instead.

    53-54: Improve Publisher/sdPublisher guidance and schema alignment

    Grammar fixes and clearer ContactPoint guidance.

    -- **Publisher and sdPublisher Handling**: Create a Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**. Then, create a contactPoint entity with this respective ID, and set the contactType, email and identifier. For the case of the example ID, the email and identifier can share a plain string database@gigasciencejournal.com, while the contactType may be a plain string saying the contact of the publisher.
    +- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective @id, then set its name and description (you may need to source these online). Add a contactPoint using a mailto: URL (e.g., **mailto:database@gigasciencejournal.com**). Create the corresponding ContactPoint entity with that @id and set contactType and email. You may include identifier as the same email string if desired.
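For illustration, the Organization and ContactPoint pair described here might serialize from arrays like the ones below. This is a sketch under assumptions: the publisher @id, name, description, and contactType wording are placeholders, and only the mailto address comes from the guide.

```php
<?php
declare(strict_types=1);

$contactId = 'mailto:database@gigasciencejournal.com'; // address taken from the guide

$publisher = [
    '@id'          => 'https://example.org/publisher', // placeholder @id
    '@type'        => 'Organization',
    'name'         => 'Example Publisher',             // placeholder
    'description'  => 'Placeholder description',       // placeholder
    'contactPoint' => ['@id' => $contactId],           // reference, not a literal
];

$contactPoint = [
    '@id'         => $contactId,
    '@type'       => 'ContactPoint',
    'contactType' => 'Publisher contact',               // illustrative wording
    'email'       => 'database@gigasciencejournal.com',
    'identifier'  => 'database@gigasciencejournal.com', // optional, same string as email
];
```

The publisher points at the contact by @id, and the ContactPoint entity with that same @id carries the literal values.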

    67-68: Tighten author guidance and fix articles/capitalization

    Minor grammar and ROR capitalization.

    -- **Author Handling**: Create a Person entity with the respective ID, then set the affiliation and the name. The affiliation should refer to an Organization entity. In case that such entity does not exist yet, we create a Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online. For a GigaDB dataset, a ror is often picked as the ID for the organization, for instance, **[https://ror.org/01ej9dk98](https://ror.org/01ej9dk98)**.
    +- **Author Handling**: Create a Person entity with the respective @id, then set name and affiliation. The affiliation should reference an Organization entity. If it does not exist yet, create an Organization with the respective @id and set its name (you may need to source this online). For GigaDB, a ROR URL is often used as the organization @id (e.g., **https://ror.org/01ej9dk98**).

    70-71: Clarify funder guidance and grammar

    Improves readability; optional note on Award info via exifData retained.

    -- **Funder Handling**: Create an Organization entity with the respective ID, then set the identifier, name and description. The identifier is always to be the same as the ID, and the description is Funding Body in this case. Additionally, we can use the exifData construct to include the information regarding the Awardee and the Award ID.
    +- **Funder Handling**: Create an Organization entity with the respective @id, then set identifier (same value as the @id), name, and description (e.g., “Funding Body”). Optionally, include Awardee and Award ID via the exifData construct.

    73-74: Capitalize URL and tighten phrasing

    Minor improvements.

    -- **About Handling**: If the respective ID is an url, we create a contextual entity of type URL with the respective ID and set the name of the entity.
    +- **About Handling**: If the respective @id is a URL, create a contextual entity of type URL with that @id and set its name.

    77-79: Polish the closing remark

    Minor clarity and grammar.

    -There are other ways to create a RO-Crate Metadata document for a GigaDB dataset. This only serves as a rather minimal possible way to construct the document, where not all possible metadata of all entities are included. For manipulating an existing metadata document, we can similarly refer to these steps to look for missing parts.
    +There are other valid ways to create a RO‑Crate metadata document for a GigaDB dataset. This guide presents a minimal approach and does not enumerate all possible metadata for all entities. When adapting an existing metadata document, you can use these steps to identify and fill gaps.

    32-32: Fix typo: “downloable” → “downloadable”

    Simple spelling correction.

    --  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloable png. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.
    +-  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloadable PNG. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.

    29-30: Fix “are is” and articles; ISO formatting

    Minor grammar in Step 1 text.

    -... The dates are is ISO8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI ...
    +... The dates must follow the ISO 8601 standard (e.g., YYYY‑MM‑DD). For GigaDB, the dataset is most likely web‑based; the @id must be an absolute URI ...
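Both requirements in this sentence are easy to check mechanically. The sketch below uses only the Python standard library; the function names are illustrative, and the URI check is deliberately stricter than the general notion of an absolute URI because it also requires a network location, which suits web-based dataset pages but would reject `mailto:` URLs.

```python
from datetime import date
from urllib.parse import urlparse


def is_iso_date(text):
    """True if text parses as an ISO 8601 calendar date such as 2025-08-18."""
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


def is_absolute_uri(text):
    """True if text has both a scheme and a network location (e.g. https://host/...)."""
    parts = urlparse(text)
    return bool(parts.scheme and parts.netloc)


print(is_iso_date("2025-08-18"))
print(is_absolute_uri("https://gigadb.org/dataset/102736"))
```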

    39-39: Keywords guidance: consider allowing list form

    Schema.org allows keywords as Text or URL; comma‑separated is common, but arrays are also handled in JSON‑LD. Optional note if your library supports arrays.

    -As a special construct, together with the use of keywords property, we have to specify the IDs of these keyword(s) one by one using about as @id's.
    +Optionally, you can also express individual concept URIs for keywords using about with @id values. If your tooling supports it, keywords may also be provided as a list of strings.

    Please confirm whether your PHP library supports an array value for keywords.
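For reference, the pairing of `keywords` and `about` described in the guide could be derived from one mapping, as in this sketch. It is plain Python rather than the PHP library; `keyword_properties` is a hypothetical helper, and the second keyword is invented to show a keyword without a concept URL.

```python
def keyword_properties(keyword_urls):
    """Build the keywords string and the about references for the root entity.

    keyword_urls maps each keyword to a concept URL, or to None when no
    URL is available for that keyword.
    """
    return {
        "keywords": ", ".join(keyword_urls),
        "about": [{"@id": url} for url in keyword_urls.values() if url],
    }


properties = keyword_properties(
    {
        "oxford nanopore technologies": "https://nanoporetech.com/",
        "base calling": None,  # hypothetical keyword with no concept URL
    }
)
print(properties)
```

Each URL listed under `about` still needs its own contextual entity, as Step 11 of the guide describes.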


    53-53: markdownlint MD034 (bare URL)

    If you enforce markdownlint, wrap bare URLs in link markup or autolink brackets.

    -... e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)** ...
    +... e.g. <mailto:database@gigasciencejournal.com> ...

    1-85: Great addition—guide is comprehensive and maps well to the library

    Overall this is a solid, practical guide aligned with RO‑Crate 1.2 and GigaDB workflows. Addressing the above nits and the few correctness items (broken links, DOI mismatch, non‑standard terms) will make it production‑ready.

    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between e06f07f and 2f9fa5e.

    📒 Files selected for processing (1)
    • Guide.md (1 hunks)
    🧰 Additional context used
    🪛 LanguageTool
    Guide.md

    [grammar] ~2-~2: There might be a mistake here.
    Context: # Usage Guide for ro-crate-php Below are some notes to pay attention to...

    (QB_NEW_EN)


    [grammar] ~7-~7: There might be a mistake here.
    Context: ...ndard of RO-Crate 1.2. --- ## Overview This is a PHP tool to create and manipul...

    (QB_NEW_EN)


    [grammar] ~8-~8: There might be a mistake here.
    Context: ... not be perfect but ought to be able to provide sufficient description of the dataset. ...

    (QB_NEW_EN)


    [grammar] ~10-~10: There might be a mistake here.
    Context: ...he repository above. Version: [1.0] Last Updated: [2025-08-18] --- ## No...

    (QB_NEW_EN)


    [grammar] ~22-~22: There might be a mistake here.
    Context: ... same ID has to be created. In addition, name of an entity should be human-reada...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...e-preview.json as the @id and filename. In detached package, i.e. the metadata fil...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...x can be the dataset ID. --- ## Step 1 - Initialization of the Crate: Create th...

    (QB_NEW_EN)


    [grammar] ~29-~29: There might be a mistake here.
    Context: ...6](https://gigadb.org/dataset/102736)**. - Specification of the Components: Speci...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ecify the ID of the files, dataset such as zip file using hasPart, possibly using ...

    (QB_NEW_EN)


    [style] ~30-~30: Try moving the adverb to make the sentence clearer.
    Context: ...asPart, possibly using the # directory construct to collectively describe many files. Refer to Step 2 ...

    (SPLIT_INFINITIVE)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ly any entities derived from them. Note that metadata file and the preview file, if ...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...lly treated and not included in hasPart. - Specification of the License: Specify ...

    (QB_NEW_EN)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...ation used with different file entities. - Specification of Citation: In case the...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ...as to be a URL (for example a DOI URL). In case of citing another dataset/crate, t...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...n string that concatenates all keywords with comma as the delimiter. As a special co...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...pecial construct, together with the use of keywords property, we have to specify t...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...attributes using the exifData construct. - Directory/Dataset/zip file: Create a D...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...ve ID, which has to be an absolute URI. Such URI should resolve to a listing of the ...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...attributes using the exifData construct.. - Collective Construct with #: In case ...

    (QB_NEW_EN)


    [grammar] ~49-~49: There might be a mistake here.
    Context: ... searched or recorded online. ## Step 4 - Thumbnail Handling: When the thumbnail...

    (QB_NEW_EN)


    [grammar] ~52-~52: There might be a mistake here.
    Context: ...ntity with the respective ID. ## Step 5 - Publisher and sdPublisher Handling: Cr...

    (QB_NEW_EN)


    [grammar] ~55-~55: There might be a mistake here.
    Context: ...the contact of the publisher. ## Step 6 - Identifier Handling: Create a contextu...

    (QB_NEW_EN)


    [grammar] ~56-~56: There might be a mistake here.
    Context: ...n the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.or...

    (QB_NEW_EN)


    [grammar] ~58-~58: There might be a mistake here.
    Context: ... to the ID of the identifier. ## Step 7 - exifData Handling: Create a contextual...

    (QB_NEW_EN)


    [style] ~63-~63: This phrase might be redundant. Consider either removing or replacing the adjective ‘additional’.
    Context: ...pe CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conf...

    (ADD_AN_ADDITIONAL)


    [grammar] ~63-~63: Ensure spelling is correct
    Context: ...perties for the entity representing the another crate, since its content and further me...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~63-~63: There might be a mistake here.
    Context: ...from its own RO-Crate Metadata Document. -- A Paper: Create a Publication ent...

    (QB_NEW_EN)


    [grammar] ~67-~67: There might be a mistake here.
    Context: ...to an Organization entity. In case that such entity does not exist yet, we create a ...

    (QB_NEW_EN)


    [grammar] ~69-~69: There might be a mistake here.
    Context: ...ttps://ror.org/01ej9dk98)**. ## Step 10 - Funder Handling: Create an Organizatio...

    (QB_NEW_EN)


    [grammar] ~72-~72: There might be a mistake here.
    Context: ...he Awardee and the Award ID. ## Step 11 - About Handling: If the respective ID i...

    (QB_NEW_EN)

    🪛 markdownlint-cli2 (0.17.2)
    Guide.md

    53-53: Bare URL used

    (MD034, no-bare-urls)

    🔇 Additional comments (3)
    Guide.md (3)

    35-36: Confirm custom terms in @context (“exifData”)

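    exifData is defined by schema.org only for ImageObject (with PropertyValue or Text values), while the guide uses it on Dataset, File, and Organization entities. Ensure your RO‑Crate @context resolves this term (and any similar extensions) the way you intend so JSON‑LD expands correctly.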

    Would you like me to propose a minimal @context extension snippet to include “exifData” (and “cite‑as” if applicable) in your Descriptor?

    Also applies to: 58-60


    28-39: Refactor Step 1 bullets for clarity and ensure custom terms are defined

    Separated merged bullets, fixed grammar/typos, and clarified property definitions.

    File: Guide.md
    Lines: 28–39

    -## Step 1
    -- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished, i.e. the date of first publication, and sdDatePublished, i.e. the date on which the current structured data was generated or published. The dates are is ISO8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI, e.g. **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
    -- **Specification of the Components**: Specify the ID of the files, dataset such as zip file using hasPart, possibly using the \# directory construct to collectively describe many files. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files and datasets and potentially any entities derived from them. Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.
    --  **Specification of the License**: Specify the ID of the license using license, e.g. *[https://creativecommons.org/publicdomain/zero/1.0/](https://creativecommons.org/publicdomain/zero/1.0/)* for the CCO v1.0 license. Refer to **[<b>Step 3</b>](#step-3)** for handling the contextual entity of the license.
    --  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloable png. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.
    --  **Specification of the Publisher and sdPublisher**: Specify the ID of the publisher and sdPublisher using publisher and sdPublisher, e.g. *[https://gigadb.org/](https://gigadb.org/)* for GigaDB being the publisher and sdPublisher. Refer to **[<b>Step 5</b>](#step-5)** for handling the contextual entity of the publisher and sdPublisher.
    --  **Specification of the Identifier and Cite-as**:  Specify the identifier of the crate using identifier as an @id. As a special construct, we also include the identifier one by one as a plain string using cite-as. The identifier should be chosen to be persistent and resolvable in this way from a URI, which is commonly possible for a GigaDB dataset that has its doi. For example, it can be **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**. Refer to **[<b>Step 6</b>](#step-6)** for handling the contextual entity of the identifier.
    --  **Specification of Extra/Additional Information**: In case there is metadata that cannot be precisely described using existing properties, there is a special construct for it. Specify an exifData using a local identifier such as \#extraInfo as an @id. Refer to **[<b>Step 7</b>](#step-7)** for handling the contextual entity of the exifData. In a GigaDB dataset, information of the root dataset including Dataset type , Additional information , Additional information , Additional information , Additional information , Additional information , Additional information , Additional information , Additional information , Github links , Github links , Github links , Github links , Accessions (data not in GigaDB) and History can be wrapped by this construct. Note that this construct also works for other entities, e.g. Awardee and Award ID used with the organization entity for the funder, or Extra Information used with different file entities.
    --  **Specification of Citation**: In case the dataset cites publications like other datasets or papers, we have to include this information by specifying the ID of the publication using citation. Note that the ID has to be a URL (for example a DOI URL). In case of citing another dataset/crate, the ID should be chosen to be the @id value of the identifier property of that crate instead of the actual ID of that crate. Refer to **[<b>Step 8</b>](#step-8)** for handling the contextual entity of the citation of the publication.
    --   **Specification of Authors**: Specify the IDs of the author(s) one by one using author. For a GigaDB dataset, ORCID is usually picked as the ID for an author. For example, it may be **[https://orcid.org/0000-0001-9083-6757](https://orcid.org/0000-0001-9083-6757)**. Refer to **[<b>Step 9</b>](#step-9)** for handling the contextual entity of each of the author(s).
    --  **Specification of Funders**: Here the assumption that no information about an explicit associated research project is present is made. Specify the ID of the funder using funder, which happens to be the case for some of the GigaDB datasets. For a gigaDB dataset, the ID is often selected to be a ror, for instance, **[https://ror.org/011kf5r70](https://ror.org/011kf5r70)**. Refer to **[<b>Step 10</b>](#step-10)** for handling the contextual entity of the funder.
    --  **Specification of Keywords**: Specify the keyword(s) of the root dataset using keywords as a plain string that concatenates all keywords with comma as the delimiter. As a special construct, together with the use of keywords property, we have to specify the IDs of these keyword(s) one by one using about as @id's. Such ID is usually a url that explains the corresponding keyword, for example, **[https://nanoporetech.com/](https://nanoporetech.com/)** for the keyword of oxford nanopore technologies. Refer to **[<b>Step 11</b>](#step-11)** for handling the contextual entity of the about property.
    +## Step 1
    +- **Initialization of the crate**: Create an empty crate, then set the profile (context version) and the root data entity @id.
    +- **Initialization of the root data entity**: Set @id, name, description, datePublished (date of first publication), and sdDatePublished (date the current structured data was generated/published). Dates must follow ISO 8601 (e.g., YYYY-MM-DD). For GigaDB, datasets are typically web-based; use an absolute URI as the @id (e.g., **https://gigadb.org/dataset/102736**).
    +- **Specification of the components**: Specify file/dataset @ids using hasPart (e.g., zip files), optionally with the “#directory” construct to group many files. See **[Step 2](#step-2)**. Metadata and preview files are treated specially and excluded from hasPart.
    +- **Specification of the license**: Use license with the license @id (e.g., *https://creativecommons.org/publicdomain/zero/1.0/* for CC0 1.0). See **[Step 3](#step-3)**.
    +- **Specification of the thumbnail**: Use thumbnail with the downloadable PNG URL. See **[Step 4](#step-4)**.
    +- **Specification of the publisher and sdPublisher**: Use publisher and sdPublisher @ids (e.g., *https://gigadb.org/*). See **[Step 5](#step-5)**.
    +- **Specification of the identifier and cite-as**: Use identifier as an @id (JSON-LD node) and optionally cite-as for plain-string identifiers. Identifiers should be persistent/resolvable (e.g., DOI URLs like **https://doi.org/10.4225/59/59672c09f4a4b**). See **[Step 6](#step-6)**.
    +- **Specification of extra/additional information**: For metadata not covered by existing properties, use exifData with a local @id (e.g., “#extraInfo”). See **[Step 7](#step-7)**.
    +- **Specification of citation**: Use citation with the cited work’s URL @id (e.g., DOI). Prefer the identifier @id when citing another crate. See **[Step 8](#step-8)**.
    +- **Specification of authors**: Use author @ids (e.g., ORCID URLs like **https://orcid.org/0000-0001-9083-6757**). See **[Step 9](#step-9)**.
    +- **Specification of funders**: Use funder @ids (often ROR URLs like **https://ror.org/011kf5r70**). See **[Step 10](#step-10)**.
    +- **Specification of keywords**: Use keywords as a comma-separated string and about for concept URLs (e.g., **https://nanoporetech.com/** for “Oxford Nanopore Technologies”). See **[Step 11](#step-11)**.

    Note

    • cite-as is a standard property in the RO-Crate 1.2 context and requires no manual @context entry.
    • exifData is a schema.org property defined for ImageObject; please verify that the context you target resolves it, and note in the guide that it is being used on other entity types.
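To make the Step 1 cross-references concrete, here is a minimal sketch of a metadata document whose root data entity points at entities that Steps 2 to 11 are expected to create, together with a small check that lists references with no matching entity yet. It is plain Python, not the ro-crate-php API; the context and profile URLs assume the usual RO-Crate 1.2 pattern, and the name, description, and dates are placeholders.

```python
ROOT_ID = "https://gigadb.org/dataset/102736"

crate = {
    "@context": "https://w3id.org/ro/crate/1.2/context",  # assumed 1.2 context URL
    "@graph": [
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.2"},
            "about": {"@id": ROOT_ID},
        },
        {
            "@id": ROOT_ID,
            "@type": "Dataset",
            "name": "Placeholder dataset name",
            "description": "Placeholder description",
            "datePublished": "2025-01-01",  # placeholder date
            "sdDatePublished": "2025-08-18",  # placeholder date
            "license": {"@id": "https://creativecommons.org/publicdomain/zero/1.0/"},
            "publisher": {"@id": "https://gigadb.org/"},
            "sdPublisher": {"@id": "https://gigadb.org/"},
            "hasPart": [{"@id": "#other-files"}],
        },
    ],
}


def referenced_ids(value):
    """Yield every @id referenced by a property value (single node or list)."""
    if isinstance(value, dict):
        if "@id" in value:
            yield value["@id"]
    elif isinstance(value, list):
        for item in value:
            yield from referenced_ids(item)


def dangling_references(document):
    """Return referenced @ids that have no entity in the graph yet."""
    defined = {entity["@id"] for entity in document["@graph"]}
    missing = set()
    for entity in document["@graph"]:
        for key, value in entity.items():
            if key == "conformsTo":  # points at the specification, not an entity
                continue
            missing.update(r for r in referenced_ids(value) if r not in defined)
    return sorted(missing)


print(dangling_references(crate))
```

The list that comes back is effectively the to-do list for the later steps: one contextual or data entity per remaining @id.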

    62-65: Improve “Citation Handling” formatting and clarify @type usage

    Revise Guide.md (lines 62–65) to use nested bullets, correct grammar, and explicitly show JSON-LD @type syntax:

    - **Citation Handling**: We will discuss the two cases when the publication is another dataset and a paper.
    - **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing the another crate, since its content and further metadata is available from its own RO-Crate Metadata Document.
    - **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can has more than one value and datePublished should be in ISO 8601 format.
    + **Citation Handling**: Two common cases—citing another dataset/crate or citing a paper.
    +  - **Another dataset/crate**:  
    +    • Create a Publication entity with  
    +      – @id: _your identifier_  
    +      – @type: `["CreativeWork","Dataset"]`  
    +    • Set `conformsTo` to `https://w3id.org/ro/crate`.  
    +    • Do _not_ set `hasPart` or duplicate its internal metadata; that lives in its own RO-Crate.
    +  - **A paper**:  
    +    • Create a Publication entity with  
    +      – @id: _your identifier_  
    +      – @type: `"ScholarlyArticle"`  
    +    • Then set:  
    +      – `name`  
    +      – `author` (can have multiple values)  
    +      – `identifier`  
    +      – `issn`  
    +      – `isPartOf` (journal)  
    +      – `datePublished` (ISO 8601)  
    +      – `creditText` (if any)

    Please confirm that your implementation supports writing array values for @type when multiple types are required.
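The question about array-valued @type comes down to emitting a string for one type and a list for several. A small sketch of that convention, in plain Python with hypothetical helper names (`type_value`, `cited_crate`, `cited_paper`) and a made-up paper DOI:

```python
def type_value(*types):
    """Return a single string for one type, or a list when several are given."""
    if not types:
        raise ValueError("at least one type is required")
    return types[0] if len(types) == 1 else list(types)


def cited_crate(identifier_id):
    """Entity for a cited dataset/crate; hasPart is intentionally not set."""
    return {
        "@id": identifier_id,
        "@type": type_value("CreativeWork", "Dataset"),
        "conformsTo": {"@id": "https://w3id.org/ro/crate"},
    }


def cited_paper(paper_id, name):
    """Entity for a cited paper; further properties can be added as available."""
    return {"@id": paper_id, "@type": type_value("ScholarlyArticle"), "name": name}


crate_ref = cited_crate("https://doi.org/10.4225/59/59672c09f4a4b")
paper_ref = cited_paper("https://doi.org/10.0000/placeholder", "Placeholder title")
```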

    Comment on lines +42 to +44
    - **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    - **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    - **Collective Construct with \#**: In case that we prefer describing some files or/and directories collectively, we create a Dataset entity with a local identifier as the ID, for example, \#other-files. Then, we set the name and description.

    🛠️ Refactor suggestion

    ⚠️ Potential issue

    Fix JSON example and broken link; minor grammar

    The PRONOM example formatting is broken and the distribution link points to localhost with a hash fragment.

    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    +- **File**: Create a File entity with an absolute URI @id (for GigaDB, typically a direct download URL). Then set name, contentSize, and encodingFormat. encodingFormat is a MIME type string (e.g., text/csv). If needed, you can pair it with a PRONOM identifier by giving encodingFormat an array value:
    +  - encodingFormat: ["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]
    +  Additionally, include extra information (e.g., data types, file attributes) using the exifData construct.
     - **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +- **Directory/Dataset/zip file**: Create a Dataset entity with an absolute URI @id that resolves to a listing of the content. For GigaDB, the @id is often the dataset page (e.g., **https://gigadb.org/dataset/view/id/102736/Files_page/4**). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., **https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip**). releaseDate must be ISO 8601. You may also include extra information using the exifData construct.
     - **Collective Construct with \#**: In case that we prefer describing some files or/and directories collectively, we create a Dataset entity with a local identifier as the ID, for example, \#other-files. Then, we set the name and description.
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    - **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    - **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    - **Collective Construct with \#**: In case that we prefer describing some files or/and directories collectively, we create a Dataset entity with a local identifier as the ID, for example, \#other-files. Then, we set the name and description.
    - **File**: Create a File entity with an absolute URI @id (for GigaDB, typically a direct download URL). Then set name, contentSize, and encodingFormat. encodingFormat is a MIME type string (e.g., text/csv). If needed, you can pair it with a PRONOM identifier by giving encodingFormat an array value:
    - encodingFormat: ["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]
    Additionally, include extra information (e.g., data types, file attributes) using the exifData construct.
    - **Directory/Dataset/zip file**: Create a Dataset entity with an absolute URI @id that resolves to a listing of the content. For GigaDB, the @id is often the dataset page (e.g., https://gigadb.org/dataset/view/id/102736/Files_page/4). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip). releaseDate must be ISO 8601. You may also include extra information using the exifData construct.
    - **Collective Construct with \#**: In case that we prefer describing some files or/and directories collectively, we create a Dataset entity with a local identifier as the ID, for example, \#other-files. Then, we set the name and description.
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~42-~42: There might be a mistake here.
    Context: ...attributes using the exifData construct. - Directory/Dataset/zip file: Create a D...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...ve ID, which has to be an absolute URI. Such URI should resolve to a listing of the ...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...attributes using the exifData construct.. - Collective Construct with #: In case ...

    (QB_NEW_EN)

    🤖 Prompt for AI Agents
    In Guide.md around lines 42 to 44, the PRONOM example JSON/Markdown formatting
    is broken and the distribution URL incorrectly points to localhost with a hash
    fragment; fix by correcting the PRONOM snippet to show the encodingFormat as a
    valid JSON-LD array (e.g. ["application/pdf", {"@id":
    "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]) so the link is rendered
    properly, replace the localhost/disallowed hash-fragment distribution URL with
    the actual public download URL (remove the 127.0.0.1 and the "#..." fragment),
    and tidy minor grammar/typo issues (remove the duplicated period after exifData)
    so the paragraph reads cleanly.
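The array form of encodingFormat mentioned in this prompt can be produced with a small helper such as the one below. This is a sketch in plain Python, not the PHP library; `file_entity`, the file URL, the file name, and the size string are all hypothetical.

```python
import json


def file_entity(file_id, name, content_size, mime_type, pronom_id=None):
    """Build a File entity; add a PRONOM reference to encodingFormat if given."""
    if pronom_id is None:
        encoding = mime_type
    else:
        encoding = [mime_type, {"@id": pronom_id}]
    return {
        "@id": file_id,
        "@type": "File",
        "name": name,
        "contentSize": content_size,
        "encodingFormat": encoding,
    }


pdf = file_entity(
    "https://example.org/files/report.pdf",  # hypothetical download URL
    "report.pdf",
    "120 kB",  # hypothetical size
    "application/pdf",
    "https://www.nationalarchives.gov.uk/PRONOM/fmt/19",
)
csv = file_entity("https://example.org/files/table.csv", "table.csv", "4 kB", "text/csv")
print(json.dumps(pdf, indent=2))
```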
    

    Comment on lines +56 to +57
    - **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.

    ⚠️ Potential issue

    Fix DOI example inconsistency and capitalization

    The example DOI value mismatches the DOI URL; use the same DOI stem and capitalize “URL”.

    -- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.
    +- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective @id, then set propertyID, value, and URL. For example, propertyID can be **https://registry.identifiers.org/registry/doi** when the identifier is a DOI. If the identifier @id is **https://doi.org/10.4225/59/59672c09f4a4b**, set value to the plain string `doi:10.4225/59/59672c09f4a4b`. The URL is typically identical to the identifier’s @id.
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~56-~56: There might be a mistake here.
    Context: ...n the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.or...

    (QB_NEW_EN)

    🤖 Prompt for AI Agents
    In Guide.md around lines 56-57, the DOI example has a mismatched DOI value and
    uses lowercase "url"; update the example so the identifier URL and the DOI value
    stem match (change the value to doi:10.4225/59/59672c09f4a4b to match
    https://doi.org/10.4225/59/59672c09f4a4b) and capitalize "url" to "URL" in the
    explanatory text.
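The mismatch flagged above can be avoided by deriving the `value` string from the identifier @id rather than typing it separately. A minimal sketch in Python, not the ro-crate-php API; the helper names `doi_value_from_id` and `build_doi_identifier` are hypothetical, and it assumes the identifier @id is a `https://doi.org/` URL:

```python
DOI_PREFIX = "https://doi.org/"


def doi_value_from_id(identifier_id: str) -> str:
    """Derive the plain-string value (doi:...) from a DOI URL used as @id."""
    if not identifier_id.startswith(DOI_PREFIX):
        raise ValueError(f"not a DOI URL: {identifier_id}")
    return "doi:" + identifier_id[len(DOI_PREFIX):]


def build_doi_identifier(identifier_id: str) -> dict:
    """Build a PropertyValue entity whose value and url agree with its @id."""
    return {
        "@id": identifier_id,
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/doi",
        "value": doi_value_from_id(identifier_id),
        "url": identifier_id,  # url is chosen identical to the @id
    }


entity = build_doi_identifier("https://doi.org/10.4225/59/59672c09f4a4b")
```

Because the value is computed from the @id, the two cannot drift apart when the example DOI is changed.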
    

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 4

    ♻️ Duplicate comments (2)
    Guide.md (2)

    42-44: Fix PRONOM example formatting and broken distribution link

    The JSON/Markdown snippet is malformed, and the distribution link targets localhost with a hash fragment.

    Apply this diff:

    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    +- **File**: Create a File entity with an absolute URI @id (for a web-based entity, this is typically a direct download URL). Then set name, contentSize, and encodingFormat. encodingFormat is a MIME type string (e.g., text/csv). If needed, you can pair it with a PRONOM identifier, e.g.:
    +  `["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]`. Additionally, you may include extra information (e.g., data types and file attributes) using the exifData construct.
     - **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +- **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such a URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file; the ID is often selected to be the URL that shows its description (e.g., **https://gigadb.org/dataset/view/id/102736/Files_page/4**). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., **https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip**). Also, releaseDate should be in ISO 8601 format. Furthermore, you can include extra information (e.g., data types and file attributes) using the exifData construct.
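To make the intended shape of encodingFormat concrete, here is a minimal sketch in Python that builds a File entity as plain JSON-LD. It is not the ro-crate-php API; the helper `encoding_format` and the download URL under example.org are hypothetical placeholders.

```python
import json
from typing import Optional, Union


def encoding_format(mime_type: str, pronom_id: Optional[str] = None) -> Union[str, list]:
    """Return the MIME string alone, or paired with a PRONOM @id reference."""
    if pronom_id is None:
        return mime_type
    return [mime_type, {"@id": pronom_id}]


file_entity = {
    "@id": "https://example.org/downloads/paper.pdf",  # hypothetical download URL
    "@type": "File",
    "name": "paper.pdf",  # hypothetical file name
    "encodingFormat": encoding_format(
        "application/pdf",
        "https://www.nationalarchives.gov.uk/PRONOM/fmt/19",
    ),
}

print(json.dumps(file_entity, indent=2))
```

Serialising through `json.dumps` guarantees that the array and the nested `{"@id": ...}` object are valid JSON, which is the part the hand-written Markdown example got wrong.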

    56-57: Fix DOI example inconsistency and capitalize “URL”

    Make the DOI value match the example DOI URL and use “URL”.

    Apply this diff:

    -- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.
    +- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective @id, then set propertyID, value, and URL. For example, propertyID can be **https://registry.identifiers.org/registry/doi** when the identifier is a DOI. If the identifier @id is **https://doi.org/10.4225/59/59672c09f4a4b**, set value to the plain string `doi:10.4225/59/59672c09f4a4b`. The URL is typically identical to the identifier’s @id.
    🧹 Nitpick comments (6)
    Guide.md (6)

    63-65: Tighten citation guidance; fix redundancy and wording

    Apply this diff:

    -- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing the another crate, since its content and further metadata is available from its own RO-Crate Metadata Document.
    +- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, and also add @type: Dataset. Then set conformsTo to the version-less generic RO-Crate profile **https://w3id.org/ro/crate**. Do not set hasPart (or most other properties) for the entity representing the other crate, since its content and metadata are available from its own RO‑Crate Metadata document.
    -- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can has more than one value and datePublished should be in ISO 8601 format.
    +- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. Then set name. Also set author, identifier, issn, journal, datePublished, and creditText (if any). Note that author can have more than one value and datePublished should be in ISO 8601 format.

    67-68: Minor grammar and capitalization; prefer ROR and “an Organization”

    Apply this diff:

    -- **Author Handling**: Create a Person entity with the respective ID, then set the affiliation and the name. The affiliation should refer to an Organization entity. In case that such entity does not exist yet, we create a Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online. For a GigaDB dataset, a ror is often picked as the ID for the organization, for instance, **[https://ror.org/01ej9dk98](https://ror.org/01ej9dk98)**.
    +- **Author Handling**: Create a Person entity with the respective ID, then set affiliation and name. The affiliation should refer to an Organization entity. If such an entity does not exist yet, create an Organization entity with the respective ID and set its name (you may need to source this). For a GigaDB dataset, a ROR is often used as the organization’s ID (e.g., **https://ror.org/01ej9dk98**).

    72-73: Fix article usage: “an url” → “a URL”

    Apply this diff:

    -- **About Handling**: If the respective ID is an url, we create a contextual entity of type URL with the respective ID and set the name of the entity.
    +- **About Handling**: If the respective ID is a URL, create a contextual entity of type URL with the respective ID and set the entity’s name.

    8-8: Tighten overview phrasing and link directly to README

    Apply this diff:

    -This is a PHP tool to create and manipulate Research Object Crate. Please refer to the repository's *[README.md](https://github.com/gigascience/ro-crate-php/tree/main)* for more details. Below are the high-level steps instructing the creation of the metadata file for a GigaDB dataset from scratch. The created file may not be perfect but ought to be able to provide sufficient description of the dataset. An example created following the flow is in the assets directory of the repository above.
    +This is a PHP tool to create and manipulate Research Object Crates (RO‑Crate). See the repository *[README.md](https://github.com/gigascience/ro-crate-php/blob/main/README.md)* for details. Below are high‑level steps for creating the metadata file for a GigaDB dataset from scratch. The result may not be perfect but should sufficiently describe the dataset. An example created following this flow is in the repository’s assets directory.

    16-16: Clarify the @id vs literal rule and reference to API flags

    The “true/false flag” references the library API; make the guidance explicit and concise.

    Apply this diff:

    -The general rule is that we use the @id construct (<b>true</b> flag if using the add/removePropertyPair methods) when referring to another entity, we otherwise use a plain literal (<b>false</b> flag if using the add/removePropertyPair methods). There are exceptions for specific constructs not following the rules.
    +General rule: use an @id reference (pass <b>true</b> to add/removePropertyPair) when referring to another entity; otherwise use a plain literal (pass <b>false</b>). Some constructs are exceptions to this rule.

    If helpful, I can add a short example snippet showing both forms using your PHP API.
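In the meantime, the two forms can be sketched in Python rather than PHP, since the exact signature of add/removePropertyPair is not shown here. The helper `property_pair` is hypothetical and only mirrors the true/false flag; the dataset name is a placeholder.

```python
import json


def property_pair(value: str, is_reference: bool):
    """Return {"@id": value} for an entity reference, else the plain literal."""
    return {"@id": value} if is_reference else value


root = {
    "@id": "https://gigadb.org/dataset/102736",
    "@type": "Dataset",
    # Plain literal (false flag): the value is just text.
    "name": property_pair("Example dataset name", False),
    # Reference (true flag): the value points at another entity by @id.
    "funder": property_pair("https://ror.org/011kf5r70", True),
}

print(json.dumps(root, indent=2))
```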


    20-23: Make uniqueness and naming guidance precise

    Apply this diff:

    -Also, only one entity with the same ID has to be created.
    -
    -In addition, name of an entity should be human-readable if it exists.
    +Each @id must be unique within a crate; do not create more than one entity with the same @id.
    +
    +Entity names should be human‑readable when present.
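The uniqueness rule is easy to check on a flattened @graph. A minimal sketch in Python, assuming the graph is a list of entity dictionaries; `duplicate_ids` is a hypothetical helper and not part of ro-crate-php:

```python
from collections import Counter


def duplicate_ids(graph: list) -> list:
    """Return the @id values that occur on more than one entity, sorted."""
    counts = Counter(entity["@id"] for entity in graph if "@id" in entity)
    return sorted(entity_id for entity_id, n in counts.items() if n > 1)


graph = [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork"},
    {"@id": "https://gigadb.org/dataset/102736", "@type": "Dataset"},
    {"@id": "https://ror.org/01ej9dk98", "@type": "Organization"},
    {"@id": "https://ror.org/01ej9dk98", "@type": "Organization"},  # accidental repeat
]

print(duplicate_ids(graph))
```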
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between 2f9fa5e and 6796a72.

    📒 Files selected for processing (1)
    • Guide.md (1 hunks)
    🧰 Additional context used
    🪛 LanguageTool
    Guide.md

    [grammar] ~2-~2: There might be a mistake here.
    Context: # Usage Guide for ro-crate-php Below are some notes to pay attention to...

    (QB_NEW_EN)


    [grammar] ~7-~7: There might be a mistake here.
    Context: ...ndard of RO-Crate 1.2. --- ## Overview This is a PHP tool to create and manipul...

    (QB_NEW_EN)


    [grammar] ~8-~8: There might be a mistake here.
    Context: ... not be perfect but ought to be able to provide sufficient description of the dataset. ...

    (QB_NEW_EN)


    [grammar] ~10-~10: There might be a mistake here.
    Context: ...he repository above. Version: [1.0] Last Updated: [2025-08-18] --- ## No...

    (QB_NEW_EN)


    [grammar] ~22-~22: There might be a mistake here.
    Context: ... same ID has to be created. In addition, name of an entity should be human-reada...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...e-preview.json as the @id and filename. In detached package, i.e. the metadata fil...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...x can be the dataset ID. --- ## Step 1 - Initialization of the Crate: Create th...

    (QB_NEW_EN)


    [grammar] ~29-~29: There might be a mistake here.
    Context: ...6](https://gigadb.org/dataset/102736)**. - Specification of the Components: Speci...

    (QB_NEW_EN)


    [style] ~30-~30: Try moving the adverb to make the sentence clearer.
    Context: ...asPart, possibly using the # directory construct to collectively describe many files. Refer to Step 2 ...

    (SPLIT_INFINITIVE)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ly any entities derived from them. Note that metadata file and the preview file, if ...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...lly treated and not included in hasPart. - Specification of the License: Specify ...

    (QB_NEW_EN)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...ation used with different file entities. - Specification of Citation: In case the...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ...as to be a URL (for example a DOI URL). In case of citing another dataset/crate, t...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...n string that concatenates all keywords with comma as the delimiter. As a special co...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...pecial construct, together with the use of keywords property, we have to specify t...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...attributes using the exifData construct. - Directory/Dataset/zip file: Create a D...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...ve ID, which has to be an absolute URI. Such URI should resolve to a listing of the ...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...attributes using the exifData construct.. - Collective Construct with #: In case ...

    (QB_NEW_EN)


    [grammar] ~49-~49: There might be a mistake here.
    Context: ... searched or recorded online. ## Step 4 - Thumbnail Handling: When the thumbnail...

    (QB_NEW_EN)


    [grammar] ~52-~52: There might be a mistake here.
    Context: ...ntity with the respective ID. ## Step 5 - Publisher and sdPublisher Handling: Cr...

    (QB_NEW_EN)


    [grammar] ~55-~55: There might be a mistake here.
    Context: ...the contact of the publisher. ## Step 6 - Identifier Handling: Create a contextu...

    (QB_NEW_EN)


    [grammar] ~56-~56: There might be a mistake here.
    Context: ...n the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.or...

    (QB_NEW_EN)


    [grammar] ~58-~58: There might be a mistake here.
    Context: ... to the ID of the identifier. ## Step 7 - exifData Handling: Create a contextual...

    (QB_NEW_EN)


    [style] ~63-~63: This phrase might be redundant. Consider either removing or replacing the adjective ‘additional’.
    Context: ...pe CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conf...

    (ADD_AN_ADDITIONAL)


    [grammar] ~63-~63: Ensure spelling is correct
    Context: ...perties for the entity representing the another crate, since its content and further me...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~63-~63: There might be a mistake here.
    Context: ...from its own RO-Crate Metadata Document. -- A Paper: Create a Publication ent...

    (QB_NEW_EN)


    [grammar] ~67-~67: There might be a mistake here.
    Context: ...to an Organization entity. In case that such entity does not exist yet, we create a ...

    (QB_NEW_EN)


    [grammar] ~69-~69: There might be a mistake here.
    Context: ...ttps://ror.org/01ej9dk98)**. ## Step 10 - Funder Handling: Create an Organizatio...

    (QB_NEW_EN)


    [grammar] ~72-~72: There might be a mistake here.
    Context: ...he Awardee and the Award ID. ## Step 11 - About Handling: If the respective ID i...

    (QB_NEW_EN)

    🪛 markdownlint-cli2 (0.17.2)
    Guide.md

    53-53: Bare URL used

    (MD034, no-bare-urls)

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 1

    ♻️ Duplicate comments (6)
    Guide.md (6)

    24-24: Clarify detached-package wording; ensure @id matches filename

    Aligns with RO‑Crate wording and makes the rename explicit.

    -The metadata file always has ro-crate-metadata.json as the @id. The preview file has ro-crate-preview.html as the @id and filename. In detached package, i.e. the metadata file is not within the package, which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g. xxxx can be the dataset ID.
    +The metadata document always has ro-crate-metadata.json as its @id (and filename). The preview file has ro-crate-preview.html as its @id (and filename). In a detached package (i.e., when the metadata file is not included in the package)—common for GigaDB—the filename ro-crate-metadata.json is renamed to <datasetId>-ro-crate-metadata.json (ensure the @id matches the filename).

    29-30: Split merged bullets; fix ISO 8601 wording and article usage

    Two bullets are accidentally merged; also improve date wording and the “metadata file” article.

    -- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished, i.e. the date of first publication, and sdDatePublished, i.e. the date on which the current structured data was generated or published. The dates are is ISO 8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI, e.g. **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
    +- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity @id.
    +
    +- **Initialization of the Root Data Entity**: Set @id, name, description, datePublished (date of first publication), and sdDatePublished (date the structured data was generated/published). Dates must be in ISO 8601 format (e.g., YYYY‑MM‑DD). For GigaDB, the dataset is typically web‑based; the @id must be an absolute URI (e.g., https://gigadb.org/dataset/102736).
    -- **Specification of the Components**: Specify the ID of the files, dataset such as zip file using hasPart, possibly using the \# directory construct to collectively describe many files. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files and datasets and potentially any entities derived from them. Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.
    +- **Specification of the Components**: Specify the ID of the files and datasets (e.g., a ZIP) using hasPart, possibly using the \# directory construct to collectively describe many files. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files/datasets and any entities derived from them. Note that the metadata file and the preview file (if present) are specially treated and not included in hasPart.
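The date requirement can be verified before the values are written to the root data entity. A minimal sketch in Python, assuming calendar dates in the YYYY-MM-DD form mentioned above; `is_iso_date` is a hypothetical helper and the field values are placeholders:

```python
from datetime import date


def is_iso_date(text: str) -> bool:
    """Return True if text parses as an ISO 8601 calendar date."""
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


root_dates = {
    "datePublished": "2025-08-18",    # placeholder value
    "sdDatePublished": "18/08/2025",  # deliberately wrong format
}

invalid = [key for key, value in root_dates.items() if not is_iso_date(value)]
print(invalid)
```

This only covers date-only values; timestamps with a time component would need `datetime.fromisoformat` or a dedicated parser.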

    32-32: Fix typo: “downloable png” → “downloadable PNG”

    --  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloable PNG. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.
    +-  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloadable PNG. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.

    53-53: Tighten Organization/contact guidance; improve clarity

    Streamline the phrasing and use concise property names; keep mailto: example.

    -- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**. Then, create a contactPoint entity with this respective ID, and set the contactType, email and identifier. For the case of the example ID, the email and identifier can share a plain string database@gigasciencejournal.com, while the contactType may be a plain string saying the contact of the publisher.
    +- **Publisher and sdPublisher Handling**: Create an Organization entity with the given @id and set its name and description (these may need to be sourced). Set contactPoint to a mailto: URL (e.g., **mailto:database@gigasciencejournal.com**). Then create a ContactPoint entity with that @id and set contactType, email, and identifier. In the example, email and identifier can both be `database@gigasciencejournal.com`, while contactType can describe the publisher’s contact.

    56-57: Fix DOI example inconsistency; capitalize “URL”

    Make value consistent with the DOI @id and use “URL”.

    -- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.
    +- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective @id, then set propertyID, value, and URL. For example, propertyID can be **https://registry.identifiers.org/registry/doi** when the identifier is a DOI. If the identifier @id is **https://doi.org/10.4225/59/59672c09f4a4b**, set value to the plain string `doi:10.4225/59/59672c09f4a4b`. The URL is typically identical to the identifier’s @id.

    41-44: Fix PRONOM example formatting and broken distribution link; minor grammar

    The PRONOM snippet is malformed and the distribution link erroneously points to localhost with a hash fragment. Also standardize “URL”.

    -## Step 2
    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    -- **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +## Step 2
    +- **File**: Create a File entity with an absolute URI @id (for web‑based entities, typically a direct download URL). Then set name, contentSize, and encodingFormat. encodingFormat is a MIME type string (e.g., text/csv). If needed, you can pair it with a PRONOM identifier in the same encodingFormat array, e.g.:
    +  `["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]`.
    +  Additionally, you can include extra information (e.g., data types, file attributes) using the exifData construct.
    +- **Directory/Dataset/ZIP file**: Create a Dataset entity with an absolute URI @id that resolves to a listing of the content. For GigaDB, the @id is often the dataset page (e.g., **https://gigadb.org/dataset/view/id/102736/Files_page/4**). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., **https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip**). releaseDate must be ISO 8601. You may also include extra information (e.g., data types and file attributes) using the exifData construct.
    🧹 Nitpick comments (7)
    Guide.md (7)

    8-8: Tighten Overview wording; fix README link path

    Pluralize “Research Object Crate,” simplify phrasing, and point the README link directly to the file.

    -This is a PHP tool to create and manipulate Research Object Crate. Please refer to the repository's *[README.md](https://github.com/gigascience/ro-crate-php/tree/main)* for more details. Below are the high-level steps instructing the creation of the metadata file for a GigaDB dataset from scratch. The created file may not be perfect but ought to be able to provide sufficient description of the dataset. An example created following the flow is in the assets directory of the repository above.
    +This is a PHP tool to create and manipulate Research Object Crates. Please refer to the repository’s *[README.md](https://github.com/gigascience/ro-crate-php/tree/main/README.md)* for more details. Below are high‑level steps for creating the metadata file for a GigaDB dataset from scratch. The created file may not be perfect but should provide a sufficient description of the dataset. An example created following the flow is in the repository’s assets directory.

    22-22: Grammar: “name of an entity” → “the name of an entity”

    Minor readability improvement.

    -In addition, name of an entity should be human-readable if it exists.
    +In addition, the name of an entity should be human‑readable (when present).

    38-38: Capitalize “GigaDB”

    Maintain consistent proper-noun capitalization.

    --  **Specification of Funders**: Here the assumption that no information about an explicit associated research project is present is made. Specify the ID of the funder using funder, which happens to be the case for some of the GigaDB datasets. For a gigaDB dataset, the ID is often selected to be a ror, for instance, **[https://ror.org/011kf5r70](https://ror.org/011kf5r70)**. Refer to **[<b>Step 10</b>](#step-10)** for handling the contextual entity of the funder.
    +-  **Specification of Funders**: Here the assumption that no information about an explicit associated research project is present is made. Specify the ID of the funder using funder, which happens to be the case for some of the GigaDB datasets. For a GigaDB dataset, the ID is often selected to be a ROR, for instance, **[https://ror.org/011kf5r70](https://ror.org/011kf5r70)**. Refer to **[<b>Step 10</b>](#step-10)** for handling the contextual entity of the funder.

    39-39: Clarify keywords guidance; capitalize URL and refer to @id values

    Improve clarity and casing.

    --  **Specification of Keywords**: Specify the keyword(s) of the root dataset using keywords as a plain string that concatenates all keywords with comma as the delimiter. As a special construct, together with the use of keywords property, we have to specify the IDs of these keyword(s) one by one using about as @id's. Such ID is usually a url that explains the corresponding keyword, for example, **[https://nanoporetech.com/](https://nanoporetech.com/)** for the keyword of oxford nanopore technologies. Refer to **[<b>Step 11</b>](#step-11)** for handling the contextual entity of the about property.
    +-  **Specification of Keywords**: Provide keywords as a comma‑separated string via the keywords property. Additionally, specify the IDs of these keywords one by one using about as @id values. Each @id is usually a URL that explains the keyword (e.g., **https://nanoporetech.com/** for “Oxford Nanopore Technologies”). Refer to **[<b>Step 11</b>](#step-11)** for handling the contextual entity of about.
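
For illustration only, here is a minimal Python sketch of the keywords/about pairing described in this suggestion, together with the contextual entities of type URL mentioned for Step 11. It uses plain dictionaries rather than the ro-crate-php API; the function name `keyword_metadata`, the second keyword, and its example.org URL are hypothetical, and the `", "` delimiter is one possible reading of "comma-separated".

```python
import json


def keyword_metadata(keyword_urls):
    """Build the root-entity fragment and contextual entities for keywords.

    keyword_urls maps each human-readable keyword to the URL explaining it.
    """
    root_fragment = {
        # keywords is a single comma-separated string
        "keywords": ", ".join(keyword_urls),
        # about references the same keywords one by one via @id
        "about": [{"@id": url} for url in keyword_urls.values()],
    }
    # One contextual entity of type URL per keyword, with a readable name
    contextual_entities = [
        {"@id": url, "@type": "URL", "name": name}
        for name, url in keyword_urls.items()
    ]
    return root_fragment, contextual_entities


root_fragment, contextual_entities = keyword_metadata({
    "Oxford Nanopore Technologies": "https://nanoporetech.com/",
    "basecalling": "https://example.org/keywords/basecalling",  # hypothetical
})
print(json.dumps(root_fragment, indent=2))
```

Keeping the mapping in one place means the keywords string and the about references cannot drift apart.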

    50-50: Subject–verb agreement and style

    Use imperative voice consistent with other steps.

    -- **Thumbnail Handling**: When the thumbnail is incidental to the root dataset, usually the case, we do not include it in the hasPart of the root data entity and creates a File entity with the respective ID.
    +- **Thumbnail Handling**: When the thumbnail is incidental to the root dataset (usually the case), do not include it in the root data entity’s hasPart; create a File entity with the respective @id instead.

    63-65: Tighten citation wording; fix redundancy and grammar

    Remove “add an additional,” fix “the another,” and minor grammar.

    -- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing the another crate, since its content and further metadata is available from its own RO-Crate Metadata Document.
    --- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can has more than one value and datePublished should be in ISO 8601 format.
    +- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective @id and add Dataset to its @type. Then set conformsTo to the version‑less RO‑Crate profile **https://w3id.org/ro/crate**. Do not set hasPart, and usually no other properties, for the referenced crate, since its content and metadata are available from its own RO‑Crate Metadata Document.
    +- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective @id, then set the name. Also set author, identifier, issn, journal, datePublished, and creditText (if any). Note that author can have more than one value and datePublished should be in ISO 8601 format.
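
As a hedged illustration of the two citation cases, the Python sketch below builds both entities as plain dictionaries. It is not the ro-crate-php API. The helper names, the example.org identifiers, the paper title, and the choice to express authors as @id references are all assumptions made for the example.

```python
def cited_crate(crate_id):
    """Entity for a cited dataset/crate: CreativeWork plus Dataset."""
    return {
        "@id": crate_id,
        "@type": ["CreativeWork", "Dataset"],
        # Version-less generic RO-Crate profile
        "conformsTo": {"@id": "https://w3id.org/ro/crate"},
        # hasPart is deliberately absent: the cited crate describes itself
    }


def cited_paper(paper_id, name, author_ids, date_published, **optional):
    """Entity for a cited paper; author may hold more than one value."""
    entity = {
        "@id": paper_id,
        "@type": "ScholarlyArticle",
        "name": name,
        "author": [{"@id": author_id} for author_id in author_ids],
        "datePublished": date_published,  # expected in ISO 8601 format
    }
    # Optional properties such as identifier, issn, journal, creditText
    entity.update(optional)
    return entity


crate = cited_crate("https://example.org/crates/other")  # hypothetical ID
paper = cited_paper(
    "https://doi.org/10.1234/example",                   # hypothetical DOI URL
    "An Example Paper",
    ["https://orcid.org/0000-0000-0000-0001",            # placeholder ORCIDs
     "https://orcid.org/0000-0000-0000-0002"],
    "2020-01-15",
    journal="Example Journal",
)
```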

    73-73: Article/casing: “an url” → “a URL”

    -- **About Handling**: If the respective ID is an url, we create a contextual entity of type URL with the respective ID and set the name of the entity.
    +- **About Handling**: If the respective ID is a URL, create a contextual entity of type URL with the respective ID and set the name of the entity.
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between 6796a72 and adff89d.

    📒 Files selected for processing (1)
    • Guide.md (1 hunks)
    🧰 Additional context used
    🪛 LanguageTool
    Guide.md

    [grammar] ~2-~2: There might be a mistake here.
    Context: # Usage Guide for ro-crate-php Below are some notes to pay attention to...

    (QB_NEW_EN)


    [grammar] ~7-~7: There might be a mistake here.
    Context: ...ndard of RO-Crate 1.2. --- ## Overview This is a PHP tool to create and manipul...

    (QB_NEW_EN)


    [grammar] ~8-~8: There might be a mistake here.
    Context: ... not be perfect but ought to be able to provide sufficient description of the dataset. ...

    (QB_NEW_EN)


    [grammar] ~10-~10: There might be a mistake here.
    Context: ...he repository above. Version: [1.0] Last Updated: [2025-08-18] --- ## No...

    (QB_NEW_EN)


    [grammar] ~22-~22: There might be a mistake here.
    Context: ... same ID has to be created. In addition, name of an entity should be human-reada...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...e-preview.html as the @id and filename. In detached package, i.e. the metadata fil...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...x can be the dataset ID. --- ## Step 1 - Initialization of the Crate: Create th...

    (QB_NEW_EN)


    [grammar] ~29-~29: There might be a mistake here.
    Context: ...6](https://gigadb.org/dataset/102736)**. - Specification of the Components: Speci...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ecify the ID of the files, dataset such as zip file using hasPart, possibly using ...

    (QB_NEW_EN)


    [style] ~30-~30: Try moving the adverb to make the sentence clearer.
    Context: ...asPart, possibly using the # directory construct to collectively describe many files. Refer to Step 2 ...

    (SPLIT_INFINITIVE)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ly any entities derived from them. Note that metadata file and the preview file, if ...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...lly treated and not included in hasPart. - Specification of the License: Specify ...

    (QB_NEW_EN)


    [grammar] ~32-~32: Ensure spelling is correct
    Context: ... is recommended to be the corresponding downloable PNG. Refer to **[Step 4](#step-4...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...ation used with different file entities. - Specification of Citation: In case the...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ...as to be a URL (for example a DOI URL). In case of citing another dataset/crate, t...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...n string that concatenates all keywords with comma as the delimiter. As a special co...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...pecial construct, together with the use of keywords property, we have to specify t...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...attributes using the exifData construct. - Directory/Dataset/zip file: Create a D...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...ve ID, which has to be an absolute URI. Such URI should resolve to a listing of the ...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...attributes using the exifData construct.. - Collective Construct with #: In case ...

    (QB_NEW_EN)


    [grammar] ~49-~49: There might be a mistake here.
    Context: ... searched or recorded online. ## Step 4 - Thumbnail Handling: When the thumbnail...

    (QB_NEW_EN)


    [grammar] ~52-~52: There might be a mistake here.
    Context: ...ntity with the respective ID. ## Step 5 - Publisher and sdPublisher Handling: Cr...

    (QB_NEW_EN)


    [grammar] ~55-~55: There might be a mistake here.
    Context: ...the contact of the publisher. ## Step 6 - Identifier Handling: Create a contextu...

    (QB_NEW_EN)


    [grammar] ~56-~56: There might be a mistake here.
    Context: ...n the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.or...

    (QB_NEW_EN)


    [grammar] ~58-~58: There might be a mistake here.
    Context: ... to the ID of the identifier. ## Step 7 - exifData Handling: Create a contextual...

    (QB_NEW_EN)


    [style] ~63-~63: This phrase might be redundant. Consider either removing or replacing the adjective ‘additional’.
    Context: ...pe CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conf...

    (ADD_AN_ADDITIONAL)


    [grammar] ~63-~63: Ensure spelling is correct
    Context: ...perties for the entity representing the another crate, since its content and further me...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~63-~63: There might be a mistake here.
    Context: ...from its own RO-Crate Metadata Document. -- A Paper: Create a Publication ent...

    (QB_NEW_EN)


    [grammar] ~67-~67: There might be a mistake here.
    Context: ...to an Organization entity. In case that such entity does not exist yet, we create an...

    (QB_NEW_EN)


    [grammar] ~69-~69: There might be a mistake here.
    Context: ...ttps://ror.org/01ej9dk98)**. ## Step 10 - Funder Handling: Create an Organizatio...

    (QB_NEW_EN)


    [grammar] ~72-~72: There might be a mistake here.
    Context: ...he Awardee and the Award ID. ## Step 11 - About Handling: If the respective ID i...

    (QB_NEW_EN)

    🪛 markdownlint-cli2 (0.17.2)
    Guide.md

    53-53: Bare URL used

    (MD034, no-bare-urls)

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 0

    ♻️ Duplicate comments (7)
    Guide.md (7)

    8-8: Fix grammar in Overview section.

    The phrase "ought to be able to provide" is grammatically awkward.

    -The created file may not be perfect but ought to be able to provide sufficient description of the dataset.
    +The created file may not be perfect but should provide a sufficient description of the dataset.

    24-24: Fix grammar: "In detached package" → "In a detached package".

    The grammar issue was previously flagged but not fully addressed.

    -In detached package, i.e. the metadata file is not within the package, which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g. xxxx can be the dataset ID.
    +In a detached package (i.e., the metadata file is not within the package), which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g., xxxx can be the dataset ID.
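
The renaming rule for a detached package can be sketched in a few lines of Python. The function name `detached_metadata_filename` is hypothetical, and the check on the prefix is an assumption, not something the guide specifies.

```python
def detached_metadata_filename(prefix):
    """Return the metadata filename for a detached package.

    prefix is the xxxx part, for example the dataset ID.
    """
    if not prefix or "/" in prefix:
        raise ValueError(f"prefix must be a non-empty name without '/': {prefix!r}")
    return f"{prefix}-ro-crate-metadata.json"


filename = detached_metadata_filename("102736")
print(filename)
```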

    29-29: Split merged bullet points and fix grammatical errors.

    Two bullet points are merged, and there are grammatical errors in the ISO 8601 description.

    -- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished, i.e. the date of first publication, and sdDatePublished, i.e. the date on which the current structured data was generated or published. The dates are is ISO 8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI, e.g. **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
    +- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.
    +- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished (i.e., the date of first publication), and sdDatePublished (i.e., the date on which the current structured data was generated or published). The dates are in ISO 8601 format, e.g., YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, so the ID has to be an absolute URI, e.g., **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
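
To make the two constraints in this bullet concrete (absolute URI for the ID, ISO 8601 dates), here is a small Python sketch using plain dictionaries. It is not the ro-crate-php API; the function name `root_data_entity` and the name, description, and date values are placeholders invented for the example.

```python
from datetime import date
from urllib.parse import urlsplit


def root_data_entity(entity_id, name, description, date_published, sd_date_published):
    """Build a root data entity after checking its ID and dates."""
    # An absolute URI carries a scheme such as https
    if not urlsplit(entity_id).scheme:
        raise ValueError(f"@id must be an absolute URI: {entity_id!r}")
    for prop, value in (("datePublished", date_published),
                        ("sdDatePublished", sd_date_published)):
        try:
            date.fromisoformat(value)  # accepts YYYY-MM-DD
        except ValueError as exc:
            raise ValueError(f"{prop} is not an ISO 8601 date: {value!r}") from exc
    return {
        "@id": entity_id,
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "sdDatePublished": sd_date_published,
    }


root = root_data_entity(
    "https://gigadb.org/dataset/102736",
    "Example dataset name",         # placeholder
    "Example dataset description",  # placeholder
    "2020-01-15",                   # placeholder date
    "2025-08-18",                   # placeholder date
)
```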

    32-32: Fix typo: "downloable" → "downloadable".

    -The ID is recommended to be the corresponding downloable PNG.
    +The ID is recommended to be the corresponding downloadable PNG.

    42-43: Fix PRONOM JSON formatting and localhost URL.

    The PRONOM example has broken JSON formatting, and there's an incorrect localhost URL.

    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    -- **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. A GigaDB dataset is most likely web-based, so the ID is often selected to be the URL that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases, a more informative encodingFormat pairs xxx/yyy with a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    +- **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such a URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, which is most likely a web-based zip file, the ID is often selected to be the URL that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the URL that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct.
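
The corrected PRONOM example may be easier to check against generated JSON. The Python sketch below builds a File entity as a plain dictionary, with encodingFormat either a bare MIME type or the MIME type paired with a PRONOM reference. It is not the ro-crate-php API; the function name `file_entity`, the example.org file URLs, and the size strings are hypothetical.

```python
import json


def file_entity(file_id, name, content_size, mime_type, pronom_id=None):
    """Build a File entity; pair the MIME type with a PRONOM @id if given."""
    if pronom_id is None:
        encoding_format = mime_type
    else:
        encoding_format = [mime_type, {"@id": pronom_id}]
    return {
        "@id": file_id,
        "@type": "File",
        "name": name,
        "contentSize": content_size,
        "encodingFormat": encoding_format,
    }


csv_file = file_entity(
    "https://example.org/files/table.csv",  # hypothetical download URL
    "table.csv", "12 kB", "text/csv",
)
pdf_file = file_entity(
    "https://example.org/files/readme.pdf",  # hypothetical download URL
    "readme.pdf", "1.2 MB", "application/pdf",
    pronom_id="https://www.nationalarchives.gov.uk/PRONOM/fmt/19",
)
print(json.dumps(pdf_file["encodingFormat"]))
```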

    53-53: Fix grammar: "a Organization" → "an Organization".

    Also fix the bare URL issue flagged by markdownlint.

    -- **Publisher and sdPublisher Handling**: Create a Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**.
    +- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g., **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**.

    56-56: Fix DOI value inconsistency.

    The DOI value doesn't match the DOI URL in the example.

    -In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736.
    +In case of a DOI's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.4225/59/59672c09f4a4b.
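
One way to keep the DOI URL and the doi: value from drifting apart is to derive the value from the URL. The Python sketch below does that with plain dictionaries; it is not the ro-crate-php API. The function names are hypothetical, and the PropertyValue shape with its propertyID is an assumption modelled on the identifier pattern in the RO-Crate specification, so it should be checked against the guide's Step 6.

```python
DOI_PREFIX = "https://doi.org/"


def doi_value(doi_url):
    """Derive the plain doi: string from a DOI URL."""
    if not doi_url.startswith(DOI_PREFIX):
        raise ValueError(f"not a DOI URL: {doi_url!r}")
    return "doi:" + doi_url[len(DOI_PREFIX):]


def identifier_entity(doi_url):
    """Build the contextual entity for a DOI identifier (assumed shape)."""
    return {
        "@id": doi_url,
        "@type": "PropertyValue",
        "propertyID": "https://registry.identifiers.org/registry/doi",
        "value": doi_value(doi_url),
        "url": doi_url,
    }


entity = identifier_entity("https://doi.org/10.4225/59/59672c09f4a4b")
print(entity["value"])
```
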
    🧹 Nitpick comments (6)
    Guide.md (6)

    2-3: Minor grammar improvements needed for better readability.

    Several grammatical issues affect clarity throughout the guide.

    -Below are some notes to pay attention to when developers are using the tool to manipulate RO-Crate Metadata file. The note is GigaDB-oriented. This note is written to help ease the use of the tool to create or manipulate the RO-Crate Metadata file concerning GigaDB datasets by removing unnecessary and only emphasizing relevant technical details about the specific standard of RO-Crate 1.2.
    +Below are notes for developers using the tool to manipulate RO-Crate metadata files. This guide is GigaDB-oriented and written to ease the use of the tool for creating or manipulating RO-Crate metadata files for GigaDB datasets, focusing on relevant technical details of the RO-Crate 1.2 standard.

    22-22: Fix article usage: "name of an entity" → "the name of an entity".

    -In addition, name of an entity should be human-readable if it exists.
    +In addition, the name of an entity should be human-readable if it exists.

    30-30: Fix grammar issues in Components specification.

    Multiple grammatical issues need correction.

    -- **Specification of the Components**: Specify the ID of the files, dataset such as zip file using hasPart, possibly using the \# directory construct to collectively describe many files. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files and datasets and potentially any entities derived from them. Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.
    +- **Specification of the Components**: Specify the ID of the files and datasets (such as zip files) using hasPart, possibly using the \# directory construct to describe many files collectively. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files and datasets and potentially any entities derived from them. Note that the metadata file and the preview file, if it exists, are specially treated and not included in hasPart.

    63-64: Fix grammar issues in Citation Handling section.

    Multiple grammatical issues need correction.

    -- **Citation Handling**: We will discuss the two cases when the publication is another dataset and a paper.
    --- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing the another crate, since its content and further metadata is available from its own RO-Crate Metadata Document.
    --- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can has more than one value and datePublished should be in ISO 8601 format.
    +- **Citation Handling**: We will discuss the two cases when the publication is another dataset or a paper.
    +  - **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID and add Dataset as a second type. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing another crate, since its content and further metadata are available from its own RO-Crate Metadata Document.
    +  - **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID, then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can have more than one value and datePublished should be in ISO 8601 format.

    67-67: Fix grammar: "In case that" → "In case".

    -In case that such entity does not exist yet, we create an Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online.
    +In case such entity does not exist yet, we create an Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online.

    73-73: Fix grammar: "an url" → "a URL".

    -If the respective ID is an url, we create a contextual entity of type URL with the respective ID and set the name of the entity.
    +If the respective ID is a URL, we create a contextual entity of type URL with the respective ID and set the name of the entity.
    📜 Review details

    Configuration used: CodeRabbit UI
    Review profile: CHILL
    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between adff89d and 5f32ebf.

    📒 Files selected for processing (1)
    • Guide.md (1 hunks)
    …uide are written with respect to the latest RO-Crate standard of 1.2.
    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 0

    ♻️ Duplicate comments (4)
    Guide.md (4)

    28-30: Split merged bullets and fix ISO 8601 phrasing.

    Restore two bullets and correct wording.

     ## Step 1
    -- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished, i.e. the date of first publication, and sdDatePublished, i.e. the date on which the current structured data was generated or published. The dates are is ISO 8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI, e.g. **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
    +- **Initialization of the Crate**: Create an empty crate, then set profile to specify the context version and the root data entity @id.
    +- **Initialization of the Root Data Entity**: Set @id, name, description, datePublished (date of first publication), and sdDatePublished (date the structured data was generated/published). Dates must be in ISO 8601 format (e.g., YYYY‑MM‑DD). For GigaDB, the dataset is typically web‑based; the @id must be an absolute URI (e.g., https://gigadb.org/dataset/102736).
     - **Specification of the Components**: Specify the ID of the files, dataset such as zip file using hasPart, possibly using the \# directory construct to collectively describe many files. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files and datasets and potentially any entities derived from them. Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.

    And adjust the trailing sentence for article/parenthetical:

    -Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.
    +Note that the metadata file and the preview file (if present) are specially treated and not included in hasPart.

    31-32: Fix license name and thumbnail typo.

    “CC0 v1.0” → “CC0 1.0”; “downloable PNG” → “downloadable PNG”.

    --  **Specification of the License**: Specify the ID of the license using license, e.g. *[https://creativecommons.org/publicdomain/zero/1.0/](https://creativecommons.org/publicdomain/zero/1.0/)* for the CC0 v1.0 license. Refer to **[<b>Step 3</b>](#step-3)** for handling the contextual entity of the license.
    +-  **Specification of the License**: Specify the ID of the license using license, e.g. *[https://creativecommons.org/publicdomain/zero/1.0/](https://creativecommons.org/publicdomain/zero/1.0/)* for the CC0 1.0 license. Refer to **[<b>Step 3</b>](#step-3)** for handling the contextual entity of the license.
    --  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloable PNG. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.
    +-  **Specification of the Thumbnail**: Specify the ID of the thumbnail using thumbnail. The ID is recommended to be the corresponding downloadable PNG. Refer to **[<b>Step 4</b>](#step-4)** for handling the contextual entity of the thumbnail.

    42-43: Fix broken PRONOM example and localhost distribution link.

    JSON/Markdown is malformed and the distribution URL should be the direct download, not a localhost hash.

    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    +- **File**: Create a File entity with an absolute‑URI @id (for web‑based files, usually the direct download URL). Set name, contentSize, and encodingFormat (MIME type, e.g., text/csv). Optionally pair encodingFormat with a PRONOM identifier, e.g., ["application/pdf", {"@id":"https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]. You may also include extra information (e.g., data types, file attributes) using the exifData construct.
    -- **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +- **Directory/Dataset/zip file**: Create a Dataset entity with an absolute‑URI @id that resolves to a listing of the content. For GigaDB, the @id is often the dataset page (e.g., **https://gigadb.org/dataset/view/id/102736/Files_page/4**). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., **https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip**). releaseDate must be ISO 8601. You may include extra information (e.g., data types, file attributes) using the exifData construct.
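
    For illustration, a minimal sketch of the JSON-LD shapes the corrected File and Dataset bullets describe. It uses plain PHP arrays rather than the ro-crate-php API. The File `@id`, both `name` values, and the release date are hypothetical placeholders, and writing `distribution` as an `@id` reference is an assumption rather than something this guide states.

    ```php
    <?php
    // Illustrative sketch: plain PHP arrays, no ro-crate-php calls.

    $file = [
        '@id' => 'https://example.org/downloads/readme.pdf', // hypothetical direct download URL
        '@type' => 'File',
        'name' => 'readme.pdf',                               // hypothetical
        // MIME type paired with a PRONOM identifier
        'encodingFormat' => [
            'application/pdf',
            ['@id' => 'https://www.nationalarchives.gov.uk/PRONOM/fmt/19'],
        ],
    ];

    $dataset = [
        '@id' => 'https://gigadb.org/dataset/view/id/102736/Files_page/4',
        '@type' => 'Dataset',
        'name' => 'BoostNano-master.zip', // assumed from the download URL
        // Assumption: distribution is an @id reference to the direct download URL
        'distribution' => [
            '@id' => 'https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip',
        ],
        // Hypothetical date, formatted as an ISO 8601 calendar date
        'releaseDate' => (new DateTimeImmutable('2020-01-01'))->format('Y-m-d'),
    ];

    echo json_encode([$file, $dataset], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
    ```

    Keeping the PRONOM entry as a nested `@id` object, not a Markdown link, is what makes the example valid JSON.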

    56-57: Fix DOI example mismatch and capitalize “URL”.

    Make value consistent with the DOI @id.

    -- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.
    +- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective @id, then set propertyID, value, and URL. For example, propertyID can be **https://registry.identifiers.org/registry/doi** when the identifier is a DOI. If the identifier @id is **https://doi.org/10.4225/59/59672c09f4a4b**, set value to the plain string `doi:10.4225/59/59672c09f4a4b`. The URL is typically identical to the identifier’s @id.
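
    One way to avoid this kind of mismatch is to derive `value` from the DOI URL instead of typing it twice. The sketch below assumes a hypothetical helper, `doiPropertyValue()`, which is not part of ro-crate-php, and uses plain arrays.

    ```php
    <?php
    // Hypothetical helper (not part of ro-crate-php): builds the PropertyValue
    // entity for a DOI so that value always agrees with @id.
    function doiPropertyValue(string $doiUrl): array
    {
        $prefix = 'https://doi.org/';
        if (strncmp($doiUrl, $prefix, strlen($prefix)) !== 0) {
            throw new InvalidArgumentException("Not a DOI URL: $doiUrl");
        }
        $doi = substr($doiUrl, strlen($prefix)); // bare DOI without the resolver prefix

        return [
            '@id' => $doiUrl,
            '@type' => 'PropertyValue',
            'propertyID' => 'https://registry.identifiers.org/registry/doi',
            'value' => 'doi:' . $doi,
            'url' => $doiUrl,
        ];
    }

    $entity = doiPropertyValue('https://doi.org/10.4225/59/59672c09f4a4b');
    echo $entity['value'], PHP_EOL; // doi:10.4225/59/59672c09f4a4b
    ```
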
    🧹 Nitpick comments (10)
    Guide.md (10)

    16-23: Tighten the “Note” block; fix run-ons and ambiguity.

    Clarify @id vs literal usage, uniqueness, and add-to-crate timing.

    -The general rule is that we use the @id construct (<b>true</b> flag if using the add/removePropertyPair methods) when referring to another entity, we otherwise use a plain literal (<b>false</b> flag if using the add/removePropertyPair methods). There are exceptions for specific constructs not following the rules.
    -
    -Another reminder is to add the entity to the crate before or after the creation of the entity.
    -
    -Also, only one entity with the same ID has to be created.
    -
    -In addition, name of an entity should be human-readable if it exists.
    +As a rule: when referencing another entity use an @id value (set the <b>true</b> flag with add/removePropertyPair); otherwise use a plain literal (<b>false</b> flag). Some constructs are exceptions.
    +
    +Add each entity to the crate when you create it (before you reference it elsewhere).
    +
    +Each @id must be unique—create only one entity per @id.
    +
    +Provide a human‑readable name for each entity when available.
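
    To make the reference-versus-literal rule concrete, here is a small sketch of the two JSON-LD shapes involved. `propertyValue()` is a hypothetical helper written for this comment, not the library's `addPropertyPair`, and the `./` root `@id` is only illustrative. It needs PHP 8.0+ for the union return type.

    ```php
    <?php
    // Hypothetical helper (not part of ro-crate-php): returns the JSON-LD shape
    // that corresponds to the true/false flag described in the note.
    function propertyValue(string $value, bool $isReference): array|string
    {
        return $isReference ? ['@id' => $value] : $value;
    }

    $person = [
        '@id' => '#alice',
        '@type' => 'Person',
        'name' => propertyValue('Alice Smith', false), // plain literal
    ];

    $root = [
        '@id' => './', // illustrative root @id
        '@type' => 'Dataset',
        'creator' => [propertyValue('#alice', true)], // reference to the Person entity
    ];

    echo json_encode([$person, $root], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
    ```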

    24-24: Detached package wording: ensure @id matches filename.

    Improve precision and match RO‑Crate conventions.

    -The metadata file always has ro-crate-metadata.json as the @id. The preview file has ro-crate-preview.html as the @id and filename. In detached package, i.e. the metadata file is not within the package, which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g. xxxx can be the dataset ID.
    +The metadata document always has ro-crate-metadata.json as its @id (and filename). The preview file uses ro-crate-preview.html as its @id (and filename). In a detached package (i.e., when the metadata file is kept outside the package)—common for GigaDB—the metadata filename is typically <datasetId>-ro-crate-metadata.json. Ensure the @id matches the filename.

    35-35: Clean up repeated placeholders in “Extra/Additional Information”.

    Replace duplicate “Additional information/Github links” with concise examples.

    --  **Specification of Extra/Additional Information**: In case there is metadata that cannot be precisely described using existing properties, there is a special construct for it. Specify an exifData using a local identifier such as \#extraInfo as an @id. Refer to **[<b>Step 7</b>](#step-7)** for handling the contextual entity of the exifData. In a GigaDB dataset, information of the root dataset including Dataset type , Additional information , Additional information , Additional information , Additional information , Additional information , Additional information , Additional information , Additional information , Github links , Github links , Github links , Github links , Accessions (data not in GigaDB) and History can be wrapped by this construct. Note that this construct also works for other entities, e.g. Awardee and Award ID used with the organization entity for the funder, or Extra Information used with different file entities.
    +-  **Specification of Extra/Additional Information**: When metadata cannot be precisely captured with existing properties, create an exifData entity using a local @id (e.g., \#extraInfo) and attach name/value pairs. Refer to **[<b>Step 7</b>](#step-7)**. For GigaDB, this can capture fields like Dataset type, Additional information, GitHub links, Accessions (external data), and History. This construct also applies to other entities (e.g., Awardee and Award ID on a funder Organization, or extra info on specific Files).

    39-39: Clarify keywords guidance; fix grammar and @id pluralization.

    Use “comma-separated” and “@id values”; capitalize URL.

    --  **Specification of Keywords**: Specify the keyword(s) of the root dataset using keywords as a plain string that concatenates all keywords with comma as the delimiter. As a special construct, together with the use of keywords property, we have to specify the IDs of these keyword(s) one by one using about as @id's. Such ID is usually a url that explains the corresponding keyword, for example, **[https://nanoporetech.com/](https://nanoporetech.com/)** for the keyword of oxford nanopore technologies. Refer to **[<b>Step 11</b>](#step-11)** for handling the contextual entity of the about property.
    +-  **Specification of Keywords**: Provide keywords as a comma‑separated string in keywords. Additionally, specify about as @id values for each keyword (URLs that explain the term), e.g., **https://nanoporetech.com/** for “Oxford Nanopore Technologies”. Refer to **[<b>Step 11</b>](#step-11)** for handling the contextual entity of about.
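
    As a sketch of how `keywords` and `about` stay in step, both can be derived from a single keyword-to-URL map. The second keyword and its URL are hypothetical, and plain arrays stand in for the library's entities.

    ```php
    <?php
    // Keyword => URL that explains it. The second entry is hypothetical.
    $keywordUrls = [
        'oxford nanopore technologies' => 'https://nanoporetech.com/',
        'example keyword' => 'https://example.org/keyword',
    ];

    $rootFragment = [
        // One comma-separated string for the keywords property
        'keywords' => implode(', ', array_keys($keywordUrls)),
        // One @id reference per keyword for the about property
        'about' => array_map(
            fn (string $url): array => ['@id' => $url],
            array_values($keywordUrls)
        ),
    ];

    echo json_encode($rootFragment, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
    ```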

    50-50: Subject–verb agreement in Step 4.

    “creates” → “create”.

    -- **Thumbnail Handling**: When the thumbnail is incidental to the root dataset, usually the case, we do not include it in the hasPart of the root data entity and creates a File entity with the respective ID.
    +- **Thumbnail Handling**: When the thumbnail is incidental to the root dataset (usually the case), do not include it in hasPart of the root data entity; create a File entity with the respective @id.

    53-53: Simplify publisher/sdPublisher text; improve clarity and capitalization.

    Shorten and standardize ContactPoint guidance.

    -- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**. Then, create a contactPoint entity with this respective ID, and set the contactType, email and identifier. For the case of the example ID, the email and identifier can share a plain string database@gigasciencejournal.com, while the contactType may be a plain string saying the contact of the publisher.
    +- **Publisher and sdPublisher Handling**: Create an Organization entity with the publisher’s @id; set its name and description (may need to be sourced). Set contactPoint to a mailto: URL (e.g., **mailto:database@gigasciencejournal.com**). Create a ContactPoint entity with that @id and set contactType, email, and identifier. In this example, email and identifier can both be `database@gigasciencejournal.com`; contactType may describe the publisher contact.
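
    A minimal sketch of the Organization and ContactPoint pair described here, again as plain arrays. The publisher `@id`, `name`, `description`, and the `contactType` wording are hypothetical; only the mailto address comes from the guide.

    ```php
    <?php
    $contactId = 'mailto:database@gigasciencejournal.com';
    $email = substr($contactId, strlen('mailto:')); // address without the scheme

    $publisher = [
        '@id' => 'https://example.org/publisher',     // hypothetical
        '@type' => 'Organization',
        'name' => 'Example Publisher',                // hypothetical
        'description' => 'Publisher of the dataset.', // hypothetical
        'contactPoint' => ['@id' => $contactId],      // reference to the entity below
    ];

    $contactPoint = [
        '@id' => $contactId,
        '@type' => 'ContactPoint',
        'contactType' => 'Publisher contact',         // hypothetical wording
        'email' => $email,
        'identifier' => $email,
    ];

    echo json_encode([$publisher, $contactPoint], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
    ```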

    63-64: Tighten citation guidance; fix grammar and redundancy.

    Remove “add an additional”, fix “the another”, “can have”, and “ISSN”.

    -- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing the another crate, since its content and further metadata is available from its own RO-Crate Metadata Document.
    --- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can has more than one value and datePublished should be in ISO 8601 format.
    +- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective @id and add type Dataset. Set conformsTo to the version‑less RO‑Crate profile **https://w3id.org/ro/crate**. Do not set hasPart or other properties; the cited crate’s content/metadata are available from its own RO‑Crate Metadata document.
    +- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective @id, then set name. Also set author, identifier, ISSN, journal, datePublished, and creditText (if any). Note that author can have more than one value and datePublished should be ISO 8601.

    67-67: Smooth out author handling phrasing.

    “If … does not exist yet” → “If … does not exist”.

    -- **Author Handling**: Create a Person entity with the respective ID, then set the affiliation and the name. The affiliation should refer to an Organization entity. In case that such entity does not exist yet, we create an Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online. For a GigaDB dataset, a ror is often picked as the ID for the organization, for instance, **[https://ror.org/01ej9dk98](https://ror.org/01ej9dk98)**.
    +- **Author Handling**: Create a Person entity with the respective @id, then set affiliation and name. The affiliation should refer to an Organization entity. If such an entity does not exist, create an Organization with the respective @id and set its name (which may need to be sourced). For GigaDB, a ROR is often used for the organization @id (e.g., **https://ror.org/01ej9dk98**).

    70-70: Clarify funder identifier and description.

    Tie identifier to @id; quote description.

    -- **Funder Handling**: Create an Organization entity with the respective ID, then set the identifier, name and description. The identifier is always to be the same as the ID, and the description is Funding Body in this case. Additionally, we can use the exifData construct to include the information regarding the Awardee and the Award ID.
    +- **Funder Handling**: Create an Organization entity with the respective @id; set identifier (same as the @id), name, and description (e.g., "Funding Body"). Optionally use the exifData construct to include Awardee and Award ID.

    73-73: Article/case fix: “an url” → “a URL”; add @id reference.

    -- **About Handling**: If the respective ID is an url, we create a contextual entity of type URL with the respective ID and set the name of the entity.
    +- **About Handling**: If the respective @id is a URL, create a contextual entity of type URL with that @id and set its name.
    📜 Review details

    Configuration used: CodeRabbit UI

    Review profile: CHILL

    Plan: Pro

    💡 Knowledge Base configuration:

    • MCP integration is disabled by default for public repositories
    • Jira integration is disabled by default for public repositories
    • Linear integration is disabled by default for public repositories

    You can enable these sources in your CodeRabbit configuration.

    📥 Commits

    Reviewing files that changed from the base of the PR and between 5f32ebf and 023d0a6.

    📒 Files selected for processing (1)
    • Guide.md (1 hunks)
    🧰 Additional context used
    🪛 LanguageTool
    Guide.md

    [grammar] ~2-~2: There might be a mistake here.
    Context: # Usage Guide for ro-crate-php Below are some notes to pay attention to...

    (QB_NEW_EN)


    [grammar] ~3-~3: There might be a mistake here.
    Context: ...e is 1.2-oriented, and some constraints in standard of RO-Crate 1.1 are no longer ...

    (QB_NEW_EN)


    [grammar] ~7-~7: There might be a mistake here.
    Context: ...n the standard of 1.1. --- ## Overview This is a PHP tool to create and manipul...

    (QB_NEW_EN)


    [grammar] ~8-~8: There might be a mistake here.
    Context: ... not be perfect but ought to be able to provide sufficient description of the dataset. ...

    (QB_NEW_EN)


    [grammar] ~10-~10: There might be a mistake here.
    Context: ...he repository above. Version: [1.0] Last Updated: [2025-08-18] --- ## No...

    (QB_NEW_EN)


    [grammar] ~22-~22: There might be a mistake here.
    Context: ... same ID has to be created. In addition, name of an entity should be human-reada...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...e-preview.html as the @id and filename. In detached package, i.e. the metadata fil...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...x can be the dataset ID. --- ## Step 1 - Initialization of the Crate: Create th...

    (QB_NEW_EN)


    [grammar] ~29-~29: There might be a mistake here.
    Context: ...6](https://gigadb.org/dataset/102736)**. - Specification of the Components: Speci...

    (QB_NEW_EN)


    [style] ~30-~30: Try moving the adverb to make the sentence clearer.
    Context: ...asPart, possibly using the # directory construct to collectively describe many files. Refer to Step 2 ...

    (SPLIT_INFINITIVE)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ly any entities derived from them. Note that metadata file and the preview file, if ...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...lly treated and not included in hasPart. - Specification of the License: Specify ...

    (QB_NEW_EN)


    [grammar] ~32-~32: Ensure spelling is correct
    Context: ... is recommended to be the corresponding downloable PNG. Refer to **[Step 4](#step-4...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...ation used with different file entities. - Specification of Citation: In case the...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ...as to be a URL (for example a DOI URL). In case of citing another dataset/crate, t...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...n string that concatenates all keywords with comma as the delimiter. As a special co...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...pecial construct, together with the use of keywords property, we have to specify t...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...attributes using the exifData construct. - Directory/Dataset/zip file: Create a D...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...ve ID, which has to be an absolute URI. Such URI should resolve to a listing of the ...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...attributes using the exifData construct.. - Collective Construct with #: In case ...

    (QB_NEW_EN)


    [grammar] ~49-~49: There might be a mistake here.
    Context: ... searched or recorded online. ## Step 4 - Thumbnail Handling: When the thumbnail...

    (QB_NEW_EN)


    [grammar] ~52-~52: There might be a mistake here.
    Context: ...ntity with the respective ID. ## Step 5 - Publisher and sdPublisher Handling: Cr...

    (QB_NEW_EN)


    [grammar] ~55-~55: There might be a mistake here.
    Context: ...the contact of the publisher. ## Step 6 - Identifier Handling: Create a contextu...

    (QB_NEW_EN)


    [grammar] ~56-~56: There might be a mistake here.
    Context: ...n the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.or...

    (QB_NEW_EN)


    [grammar] ~58-~58: There might be a mistake here.
    Context: ... to the ID of the identifier. ## Step 7 - exifData Handling: Create a contextual...

    (QB_NEW_EN)


    [style] ~63-~63: This phrase might be redundant. Consider either removing or replacing the adjective ‘additional’.
    Context: ...pe CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conf...

    (ADD_AN_ADDITIONAL)


    [grammar] ~63-~63: Ensure spelling is correct
    Context: ...perties for the entity representing the another crate, since its content and further me...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~63-~63: There might be a mistake here.
    Context: ...from its own RO-Crate Metadata Document. -- A Paper: Create a Publication ent...

    (QB_NEW_EN)


    [grammar] ~67-~67: There might be a mistake here.
    Context: ...to an Organization entity. In case that such entity does not exist yet, we create an...

    (QB_NEW_EN)


    [grammar] ~69-~69: There might be a mistake here.
    Context: ...ttps://ror.org/01ej9dk98)**. ## Step 10 - Funder Handling: Create an Organizatio...

    (QB_NEW_EN)


    [grammar] ~72-~72: There might be a mistake here.
    Context: ...he Awardee and the Award ID. ## Step 11 - About Handling: If the respective ID i...

    (QB_NEW_EN)

    🪛 markdownlint-cli2 (0.17.2)
    Guide.md

    53-53: Bare URL used

    (MD034, no-bare-urls)

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 3

    ♻️ Duplicate comments (1)
    README.md (1)

    8-8: Fix Composer install command and format as a fenced block.

    Replace placeholder package with the real one and use proper casing for Composer.

    -Install the tool using composer:
    ->composer require alex/my-php-app
    +Install with Composer:
    +```sh
    +composer require gigascience/ro-crate-php
    +```
    🧹 Nitpick comments (9)
    README.md (9)

    3-3: Clarify product name.

    Use the RO‑Crate term explicitly.

    -This is a PHP tool to create and manipulate Research Object Crate.
    +This is a PHP tool to create and manipulate RO‑Crate metadata.

    11-11: Tighten wording and capitalization.

    “PHPDoc” is the common styling.

    -Inline phpDoc comments are written.
    +Inline PHPDoc blocks are included.

    35-38: De-duplicate operations and keep only what adds value.

    The addEntity call is already shown above; keep remove example.

    -Interact with the crate just like normal objects with methods:
    -> $crate->addEntity($author);
    -> <br> $crate->removeEntity($author->getId());
    +Interact with the crate just like normal objects with methods:
    +
    +```php
    +$crate->removeEntity($author->getId());
    +```

    39-44: Use a single chained example as code, not HTML entities.

    Also, no need for “RECOMMENDED:” in prose.

    -RECOMMENDED: Chain up the methods to enhance the compactness of the code when adding/removing properties of an entity:
    -> $root->addPropertyPair('creator', '#bob', true)
    -> <br> &ensp; ->addPropertyPair('creator', '#cathy')
    -> <br> &ensp; ->removePropertyPair('creator', '#alice')
    -> <br> &ensp; ->addPropertyPair('creator', '#alice');
    +You can chain methods when adding/removing properties:
    +
    +```php
    +$root
    +  ->addPropertyPair('creator', '#bob', true)
    +  ->addPropertyPair('creator', '#cathy')
    +  ->removePropertyPair('creator', '#alice')
    +  ->addPropertyPair('creator', '#alice');
    +```

    45-45: Tighten and correct grammar in API explanation.

    -The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.
    +The addPropertyPair and removePropertyPair methods abstract the on-disk JSON-LD formatting. Provide the property key and the value to add or remove. The optional boolean indicates whether the value is a plain literal (false) or an identifier referencing another entity (true). By default, the flag matches the existing value’s representation, if any.

    47-50: Format the preview generator example and state output artifact.

    Consider showing the namespace import if applicable.

    -Use the following code to generate a human-readable HTML preview from the RO-Crate Metadata File.
    -> ROCratePreviewGenerator::generatePreview(__DIR__ . '/../resources');
    +Use the following code to generate a human‑readable HTML preview from the RO‑Crate metadata file:
    +
    +```php
    +ROCratePreviewGenerator::generatePreview(__DIR__ . '/../resources'); // writes ro-crate-preview.html under resources/
    +```

    Note: if ROCratePreviewGenerator is namespaced, the snippet should also show the appropriate `use` statement.

    54-55: Fix typos and improve readability of the GigaDB example paragraph.

    -An example based on GigaDB dataset 102736 is generated using the code in index.php to illustarte how the tool can be utilized. The inforamtion to generate the metadata file is mainly from the *[Website](https://gigadb.org/dataset/102736)*. The example ro-crate-metadata.json and ro-crate-preview.html are stored in the assets directory. To make the file easier to read, only the first 2 files and a zipped file treated as a directory are included, where the remaining files are described using a collective directory construct. To ensure integribility and compatibility with other/downstream applications, the metadata file of this example is imported into **[RoHub](https://www.rohub.org/3543b082-9077-492e-a4c7-a3b7c8bb39e8?activetab=overview)** for testing, where the about property of the metadat descriptor and the id of the root data entity are replaced with ./ to be backward compatible to the RO-Crate 1.1 standard adopted by **[RoHub](https://www.rohub.org/about?what_is_rohub)**.
    +An example based on GigaDB dataset 102736 is generated using the code in index.php to illustrate how the tool can be used. The information comes mainly from the *[website](https://gigadb.org/dataset/102736)*. The example ro-crate-metadata.json and ro-crate-preview.html are stored in the assets directory. To make the file easier to read, only the first two files and a zipped file (treated as a directory) are included; the remaining files are described using a collective directory construct. To ensure interoperability and compatibility with downstream applications, the example metadata file was imported into **[RoHub](https://www.rohub.org/3543b082-9077-492e-a4c7-a3b7c8bb39e8?activetab=overview)** for testing, where the `about` property of the metadata descriptor and the id of the root data entity were set to `./` for backward compatibility with the RO‑Crate 1.1 standard adopted by **[RoHub](https://www.rohub.org/about?what_is_rohub)**.

    18-18: Remove hard tabs (MD010) and stray extra spaces.

    These render inconsistently; your fenced code rewrites above also resolve them.

    Also applies to: 22-22


    7-9: Add “Development” notes (tests/demo) from the PR description.

    Surface docker-compose commands in README for discoverability.

    Proposed addition after Install:

    +## Development
    +
    +Run unit tests:
    +
    +```sh
    +docker-compose run test
    +```
    +
    +Run the example (creates ro-crate-metadata.json and ro-crate-preview.html under resources/):
    +
    +```sh
    +docker-compose up -d --build
    +# then open http://localhost:8080/
    +```
    📜 Review details

    📥 Commits

    Reviewing files that changed from the base of the PR and between 023d0a6 and 86a562a.

    📒 Files selected for processing (1)
    • README.md (1 hunks)
    🧰 Additional context used
    🪛 LanguageTool
    README.md

    [grammar] ~15-~15: There might be a mistake here.
    Context: ... new empty crate with the base path set to resources directory in the parent direc...

    (QB_NEW_EN)


    [grammar] ~23-~23: There might be a mistake here.
    Context: ...s', true); Add an entity to the crate: > // A person > <br> $author = $crate->cre...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...dd an entity to the crate: > // A person > <br> $author = $crate->createGenericEn...

    (QB_NEW_EN)


    [grammar] ~25-~25: There might be a mistake here.
    Context: ...eateGenericEntity('#alice', ['Person']); > <br> $author->addProperty('name', 'Ali...

    (QB_NEW_EN)


    [grammar] ~26-~26: There might be a mistake here.
    Context: ...hor->addProperty('name', 'Alice Smith'); > <br> $author->addProperty('affiliation...

    (QB_NEW_EN)


    [grammar] ~27-~27: There might be a mistake here.
    Context: ...affiliation', 'Institution of Example'); > <br> // Add the person to the crate > ...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...); > <br> // Add the person to the crate > <br> $crate->addEntity($author); > > ...

    (QB_NEW_EN)


    [grammar] ~32-~32: There might be a mistake here.
    Context: ...> <br> $root = $crate->getRootDataset(); > <br> $root->addPropertyPair('creator',...

    (QB_NEW_EN)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...e just like normal objects with methods: > $crate->addEntity($author); > <br> $cr...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ... methods: > $crate->addEntity($author); > <br> $crate->removeEntity($author->ge...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...adding/removing properties of an entity: > $root->addPropertyPair('creator', '#bob'...

    (QB_NEW_EN)


    [grammar] ~40-~40: There might be a mistake here.
    Context: ...addPropertyPair('creator', '#bob', true) > <br> &ensp; ->addPropertyPair('creator...

    (QB_NEW_EN)


    [grammar] ~41-~41: There might be a mistake here.
    Context: ...; ->addPropertyPair('creator', '#cathy') > <br> &ensp; ->removePropertyPair('crea...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...>removePropertyPair('creator', '#alice') > <br> &ensp; ->addPropertyPair('creator...

    (QB_NEW_EN)


    [grammar] ~45-~45: Ensure spelling is correct
    Context: ...h false or as an identifier referencing other entity in the crate. By default, the fl...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~51-~51: There might be a mistake here.
    Context: ...R__ . '/../resources'); ## Change Log The tool is currently under development....

    (QB_NEW_EN)


    [grammar] ~54-~54: There might be a mistake here.
    Context: ...ly under development. ## GigaDB Example An example based on GigaDB dataset 10273...

    (QB_NEW_EN)


    [grammar] ~55-~55: Ensure spelling is correct
    Context: ...enerated using the code in index.php to illustarte how the tool can be utilized. The infor...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~55-~55: Ensure spelling is correct
    Context: ...tarte how the tool can be utilized. The inforamtion to generate the metadata file is mainly...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~55-~55: Ensure spelling is correct
    Context: ...esting, where the about property of the metadat descriptor and the id of the root data ...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

    🪛 markdownlint-cli2 (0.17.2)
    README.md

    18-18: Hard tabs
    Column: 1

    (MD010, no-hard-tabs)


    22-22: Hard tabs
    Column: 1

    (MD010, no-hard-tabs)

    README.md Outdated
    Comment on lines 1 to 55
    # ro-crate-php
    Tool in PHP for manipulating RO-crate objects

    This is a PHP tool to create and manipulate Research Object Crate.

    ## Install

    Install the tool using composer:
    >composer require alex/my-php-app

    ## Docs
    Inline phpDoc comments are written.

    ## Usage

    Create a new empty crate with the base path set to resources directory in the parent directory:

    > $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', false);

    The `ROCrate` constructor enables the creation of a crate using an existing metadata file:

    > $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', true);

    Add an entity to the crate:
    > // A person
    > <br> $author = $crate->createGenericEntity('#alice', ['Person']);
    > <br> $author->addProperty('name', 'Alice Smith');
    > <br> $author->addProperty('affiliation', 'Institution of Example');
    > <br> // Add the person to the crate
    > <br> \$crate->addEntity($author);
    >
    > // Adds the person as one of the creators of the root data entity, i.e. the dataset being described by the crate
    > <br> $root = $crate->getRootDataset();
    > <br> $root->addPropertyPair('creator', '#alice', true);

    Interact with the crate just like normal objects with methods:
    > \$crate->addEntity($author);
    > <br> \$crate->removeEntity($author->getId());

    RECOMMENDED: Chain up the methods to enhance the compactness of the code when adding/removing properties of an entity:
    > $root->addPropertyPair('creator', '#bob', true)
    > <br> &ensp; ->addPropertyPair('creator', '#cathy')
    > <br> &ensp; ->removePropertyPair('creator', '#alice')
    > <br> &ensp; ->addPropertyPair('creator', '#alice');

    The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.

    ## HTML Rendering
    Use the following code to generate a human-readable HTML preview from the RO-Crate Metadata File.
    > ROCratePreviewGenerator::generatePreview(\_\_DIR\_\_ . '/../resources');

    ## Change Log
    The tool is currently under development.

    ## GigaDB Example
    An example based on GigaDB dataset 102736 is generated using the code in index.php to illustarte how the tool can be utilized. The inforamtion to generate the metadata file is mainly from the *[Website](https://gigadb.org/dataset/102736)*. The example ro-crate-metadata.json and ro-crate-preview.html are stored in the assets directory. To make the file easier to read, only the first 2 files and a zipped file treated as a directory are included, where the remaining files are described using a collective directory construct. To ensure integribility and compatibility with other/downstream applications, the metadata file of this example is imported into **[RoHub](https://www.rohub.org/3543b082-9077-492e-a4c7-a3b7c8bb39e8?activetab=overview)** for testing, where the about property of the metadat descriptor and the id of the root data entity are replaced with ./ to be backward compatible to the RO-Crate 1.1 standard adopted by **[RoHub](https://www.rohub.org/about?what_is_rohub)**. No newline at end of file

    💡 Verification agent

    🧩 Analysis chain

    Sanity-check namespaces in usage snippets.

    If classes live under a namespace (per PSR-4), show `use` statements or fully qualified names in the examples to prevent copy-paste errors.

    Run the script to confirm namespaces and the Composer package name used in README:


    🏁 Script executed:

    #!/bin/bash
    set -euo pipefail
    
    echo "Composer package name:"
    jq -r '.name' composer.json
    
    echo "Public classes under src/:"
    fd -e php src | xargs -I{} awk 'NR==1,NR==50{print FILENAME":"$0} NR==50{exit}' {}
    
    echo "README occurrences of package name and tabs:"
    rg -n 'gigascience/ro-crate-php|alex/my-php-app|\t' README.md || true

    Length of output: 488


    🏁 Script executed:

    #!/bin/bash
    set -euo pipefail
    
    echo "PSR-4 autoload mapping:"
    jq '.autoload["psr-4"]' composer.json
    
    echo
    echo "List of PHP files in project:"
    fd -e php . || find . -type f -name '*.php'
    
    echo
    echo "Namespace declarations in those files:"
    rg -nP '^\s*namespace\s+\S+' -n .

    Length of output: 3088


    Correct the PSR-4 namespaces and package name in the README usage snippets. Replace the install command with `composer require gigascience/ro-crate-php` and add the appropriate `use ROCrate\ROCrate;` and `use ROCrate\ROCratePreviewGenerator;` statements (or fully qualify the class names) before instantiating those classes in all code examples.

    🧰 Tools
    🪛 LanguageTool

    [grammar] ~15-~15: There might be a mistake here.
    Context: ... new empty crate with the base path set to resources directory in the parent direc...

    (QB_NEW_EN)


    [grammar] ~23-~23: There might be a mistake here.
    Context: ...s', true); Add an entity to the crate: > // A person >
    $author = $crate->cre...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...dd an entity to the crate: > // A person >
    $author = $crate->createGenericEn...

    (QB_NEW_EN)


    [grammar] ~25-~25: There might be a mistake here.
    Context: ...eateGenericEntity('#alice', ['Person']); >
    $author->addProperty('name', 'Ali...

    (QB_NEW_EN)


    [grammar] ~26-~26: There might be a mistake here.
    Context: ...hor->addProperty('name', 'Alice Smith'); >
    $author->addProperty('affiliation...

    (QB_NEW_EN)


    [grammar] ~27-~27: There might be a mistake here.
    Context: ...affiliation', 'Institution of Example'); >
    // Add the person to the crate > ...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...); >
    // Add the person to the crate >
    $crate->addEntity($author); > > ...

    (QB_NEW_EN)


    [grammar] ~32-~32: There might be a mistake here.
    Context: ...>
    $root = $crate->getRootDataset(); >
    $root->addPropertyPair('creator',...

    (QB_NEW_EN)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...e just like normal objects with methods: > $crate->addEntity($author); >
    $cr...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ... methods: > $crate->addEntity($author); >
    $crate->removeEntity($author->ge...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...adding/removing properties of an entity: > $root->addPropertyPair('creator', '#bob'...

    (QB_NEW_EN)


    [grammar] ~40-~40: There might be a mistake here.
    Context: ...addPropertyPair('creator', '#bob', true) >
      ->addPropertyPair('creator...

    (QB_NEW_EN)


    [grammar] ~41-~41: There might be a mistake here.
    Context: ...; ->addPropertyPair('creator', '#cathy') >
      ->removePropertyPair('crea...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...>removePropertyPair('creator', '#alice') >
      ->addPropertyPair('creator...

    (QB_NEW_EN)


    [grammar] ~45-~45: Ensure spelling is correct
    Context: ...h false or as an identifier referencing other entity in the crate. By default, the fl...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~51-~51: There might be a mistake here.
    Context: ...R__ . '/../resources'); ## Change Log The tool is currently under development....

    (QB_NEW_EN)


    [grammar] ~54-~54: There might be a mistake here.
    Context: ...ly under development. ## GigaDB Example An example based on GigaDB dataset 10273...

    (QB_NEW_EN)


    [grammar] ~55-~55: Ensure spelling is correct
    Context: ...enerated using the code in index.php to illustarte how the tool can be utilized. The infor...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~55-~55: Ensure spelling is correct
    Context: ...tarte how the tool can be utilized. The inforamtion to generate the metadata file is mainly...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~55-~55: Ensure spelling is correct
    Context: ...esting, where the about property of the metadat descriptor and the id of the root data ...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

    🪛 markdownlint-cli2 (0.17.2)

    18-18: Hard tabs
    Column: 1

    (MD010, no-hard-tabs)


    22-22: Hard tabs
    Column: 1

    (MD010, no-hard-tabs)

    🤖 Prompt for AI Agents
    In README.md around lines 1 to 55, update the usage and install snippets to use
    the correct PSR-4 package and namespaces: change the composer require example to
    composer require gigascience/ro-crate-php and add appropriate use statements
    (use ROCrate\ROCrate; use ROCrate\ROCratePreviewGenerator;) or fully qualify
    class names in all code examples before instantiating ROCrate or
    ROCratePreviewGenerator so examples reflect the actual package and PSR-4
    namespaces.
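
    To illustrate the chaining and the optional flag described in the README paragraph on `addPropertyPair` and `removePropertyPair` quoted above, here is a minimal, self-contained sketch. `SketchEntity` is a hypothetical stand-in written for this comment, not the library's entity class, and the rule that an omitted flag follows the form of the property's existing values is an assumption based on the README wording.

    ```php
    <?php
    // Hypothetical stand-in for an entity; NOT the library's implementation.
    final class SketchEntity
    {
        /** @var array<string, array<int, mixed>> property key => list of values */
        private array $properties = [];

        public function addPropertyPair(string $key, string $value, ?bool $isId = null): static
        {
            // Assumed rule: with no flag, follow the form of the existing values
            // (plain literal when the property has no values yet).
            $isId ??= $this->holdsIds($key);
            $this->properties[$key][] = $isId ? ['@id' => $value] : $value;

            return $this; // returning $this is what makes chaining possible
        }

        public function removePropertyPair(string $key, string $value): static
        {
            // Keep every value that does not match, whether literal or reference.
            $kept = array_filter(
                $this->properties[$key] ?? [],
                fn ($item) => (is_array($item) ? $item['@id'] : $item) !== $value
            );

            if ($kept === []) {
                unset($this->properties[$key]);
            } else {
                $this->properties[$key] = array_values($kept);
            }

            return $this;
        }

        public function getProperties(): array
        {
            return $this->properties;
        }

        private function holdsIds(string $key): bool
        {
            // A reference is stored as ['@id' => ...]; a literal as a string.
            return is_array($this->properties[$key][0] ?? null);
        }
    }

    $root = (new SketchEntity())
        ->addPropertyPair('creator', '#bob', true) // explicit reference
        ->addPropertyPair('creator', '#cathy')     // flag omitted: follows '#bob'
        ->removePropertyPair('creator', '#bob');

    echo json_encode($root->getProperties()), PHP_EOL;
    ```

    In this sketch `#cathy` is stored as a reference because the property already held one, which is one reading of "comply with the previous value". The actual default should be checked against the phpDoc of the real methods.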
    

    Comment on lines +15 to +22
    Create a new empty crate with the base path set to resources directory in the parent directory:

    > $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', false);

    The `ROCrate` constructor enables the creation of a crate using an existing metadata file:

    > $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', true);

    🛠️ Refactor suggestion

    Fix grammar and convert examples to proper code blocks (remove tabs/HTML escapes).

    Also remove extra spaces and Markdown escapes.

    -Create a new empty crate with the base path set to resources directory in the parent directory:
    -
    -> $crate = new  ROCrate(__DIR__  .  '/../resources', false);
    -	
    -The `ROCrate` constructor enables the creation of a crate using an existing metadata file:
    -
    -> $crate = new ROCrate(__DIR__ . '/../resources', true);
    +Create a new empty crate with the base path set to the resources directory in the parent directory:
    +
    +```php
    +$crate = new ROCrate(__DIR__ . '/../resources', false);
    +```
    +
    +Create a crate from an existing metadata file:
    +
    +```php
    +$crate = new ROCrate(__DIR__ . '/../resources', true);
    +```
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~15-~15: There might be a mistake here.
    Context: ... new empty crate with the base path set to resources directory in the parent direc...

    (QB_NEW_EN)

    🪛 markdownlint-cli2 (0.17.2)

    18-18: Hard tabs
    Column: 1

    (MD010, no-hard-tabs)


    22-22: Hard tabs
    Column: 1

    (MD010, no-hard-tabs)

    🤖 Prompt for AI Agents
    In README.md around lines 15 to 22, fix grammar and formatting of the two
    examples: replace the inline/tabbed and HTML-escaped examples with proper fenced
    PHP code blocks, remove extraneous spaces and backslash-escaped underscores, and
    ensure the constructor examples use __DIR__ . '/../resources' with no extra
    spaces; wrap each example in ```php``` fences and provide a one-line descriptive
    sentence before each block.
    

    Comment on lines +23 to +34
    Add an entity to the crate:
    > // A person
    > <br> $author = $crate->createGenericEntity('#alice', ['Person']);
    > <br> $author->addProperty('name', 'Alice Smith');
    > <br> $author->addProperty('affiliation', 'Institution of Example');
    > <br> // Add the person to the crate
    > <br> \$crate->addEntity($author);
    >
    > // Adds the person as one of the creators of the root data entity, i.e. the dataset being described by the crate
    > <br> $root = $crate->getRootDataset();
    > <br> $root->addPropertyPair('creator', '#alice', true);

    🛠️ Refactor suggestion

    Rewrite “Add an entity” example using fenced code (drop `<br>` and escapes).

    Keeps the same API, just readable Markdown.

    -Add an entity to the crate:
    -> // A person
    -> <br> $author = $crate->createGenericEntity('#alice', ['Person']);
    -> <br> $author->addProperty('name', 'Alice Smith');
    -> <br> $author->addProperty('affiliation', 'Institution of Example');
    -> <br> // Add the person to the crate
    -> <br> $crate->addEntity($author);
    ->
    -> // Adds the person as one of the creators of the root data entity, i.e. the dataset being described by the crate
    -> <br> $root = $crate->getRootDataset();
    -> <br> $root->addPropertyPair('creator', '#alice', true);
    +Add an entity to the crate:
    +
    +```php
    +// A person
    +$author = $crate->createGenericEntity('#alice', ['Person']);
    +$author->addProperty('name', 'Alice Smith');
    +$author->addProperty('affiliation', 'Institution of Example');
    +
    +// Add the person to the crate
    +$crate->addEntity($author);
    +
    +// Add as a creator of the root dataset (the dataset described by the crate)
    +$root = $crate->getRootDataset();
    +$root->addPropertyPair('creator', '#alice', true);
    +```
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~23-~23: There might be a mistake here.
    Context: ...s', true); Add an entity to the crate: > // A person >
    $author = $crate->cre...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...dd an entity to the crate: > // A person >
    $author = $crate->createGenericEn...

    (QB_NEW_EN)


    [grammar] ~25-~25: There might be a mistake here.
    Context: ...eateGenericEntity('#alice', ['Person']); >
    $author->addProperty('name', 'Ali...

    (QB_NEW_EN)


    [grammar] ~26-~26: There might be a mistake here.
    Context: ...hor->addProperty('name', 'Alice Smith'); >
    $author->addProperty('affiliation...

    (QB_NEW_EN)


    [grammar] ~27-~27: There might be a mistake here.
    Context: ...affiliation', 'Institution of Example'); >
    // Add the person to the crate > ...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...); >
    // Add the person to the crate >
    $crate->addEntity($author); > > ...

    (QB_NEW_EN)


    [grammar] ~32-~32: There might be a mistake here.
    Context: ...>
    $root = $crate->getRootDataset(); >
    $root->addPropertyPair('creator',...

    (QB_NEW_EN)

    🤖 Prompt for AI Agents
    In README.md around lines 23-34 the "Add an entity" example uses HTML <br> tags
    and escaped dollar signs, making the Markdown unreadable; replace that segment
    with a fenced code block using ```php, remove all <br> tags and backslash
    escapes so variables appear normally (e.g., $author, $crate), keep the API calls
    identical and retain the final closing ``` fence — in short, wrap the existing
    example code in a proper PHP code fence and remove the HTML and escape
    artifacts.
    

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 6

    ♻️ Duplicate comments (2)
    README.md (2)

    8-8: Composer package name is correct.


    23-34: Rewrite “Add an entity” example using fenced PHP; remove `<br>` and escapes.

    -Add an entity to the crate:
    -> // A person
    -> <br> $author = $crate->createGenericEntity('#alice', ['Person']);
    -> <br> $author->addProperty('name', 'Alice Smith');
    -> <br> $author->addProperty('affiliation', 'Institution of Example');
    -> <br> // Add the person to the crate
    -> <br> \$crate->addEntity($author);
    ->
    -> // Adds the person as one of the creators of the root data entity, i.e. the dataset being described by the crate
    -> <br> $root = $crate->getRootDataset();
    -> <br> $root->addPropertyPair('creator', '#alice', true);
    +Add an entity to the crate:
    +
    +```php
    +// A person
    +$author = $crate->createGenericEntity('#alice', ['Person']);
    +$author->addProperty('name', 'Alice Smith');
    +$author->addProperty('affiliation', 'Institution of Example');
    +
    +// Add the person to the crate
    +$crate->addEntity($author);
    +
    +// Add as a creator of the root dataset (the dataset described by the crate)
    +$root = $crate->getRootDataset();
    +$root->addPropertyPair('creator', '#alice', true);
    +```
    🧹 Nitpick comments (3)
    README.md (3)

    7-8: Format install command as fenced code (and drop blockquote).

    -Install the tool using composer:
    ->composer require gigascience/ro-crate-php
    +Install the tool using Composer:
    +
    +```sh
    +composer require gigascience/ro-crate-php
    +```

    10-12: Strengthen Docs section (link to guide) and clarify status.

    -## Docs
    -Inline phpDoc comments are written.
    +## Docs
    +See the user guide and API phpDoc for details:
    +- [Guide](Guide.md)
    +- Inline phpDoc comments in source files

    51-53: Add Quickstart (Docker) and Testing sections to match PR description.

     ## Change Log
     The tool is currently under development.
    +
    +## Quickstart (Docker)
    +Build and run the example web app (serves `index.php` at http://localhost:8080/):
    +
    +```sh
    +docker-compose up -d --build
    +```
    +
    +## Testing
    +Run unit tests:
    +
    +```sh
    +docker-compose run --rm test
    +```
    📜 Review details

    Configuration used: CodeRabbit UI

    Review profile: CHILL

    Plan: Pro

    📥 Commits

    Reviewing files that changed from the base of the PR and between 86a562a and 749ff87.

    📒 Files selected for processing (1)
    • README.md (1 hunks)

    🔇 Additional comments (2)
    README.md (2)

    47-50: Add namespace import and fenced PHP block for HTML preview
    Use the `ROCrate\ROCratePreviewGenerator` class in a fenced PHP code block:

    -## HTML Rendering
    -Use the following code to generate a human-readable HTML preview from the RO-Crate Metadata File.
    -> ROCratePreviewGenerator::generatePreview(__DIR__ . '/../resources');
    +## HTML Rendering
    +Generate a human-readable HTML preview from the RO-Crate metadata file:
    +
    +```php
    +<?php
    +use ROCrate\ROCratePreviewGenerator;
    +
    +ROCratePreviewGenerator::generatePreview(__DIR__ . '/../resources');
    +```

    Namespace `ROCrate\ROCratePreviewGenerator` confirmed.


    15-22: Add autoload and namespace to usage examples

    -Create a new empty crate with the base path set to resources directory in the parent directory:
    -
    -> $crate = new  ROCrate(__DIR__  .  '/../resources', false);
    -	
    -The `ROCrate` constructor enables the creation of a crate using an existing metadata file:
    -
    -> $crate = new ROCrate(__DIR__ . '/../resources', true);
    +Create a new empty crate with the base path set to the resources directory in the parent directory:
    +
    +```php
    +<?php
    +require __DIR__ . '/vendor/autoload.php';
    +
    +use ROCrate\ROCrate;
    +
    +$crate = new ROCrate(__DIR__ . '/../resources', false);
    +```
    +
    +Create a crate from an existing metadata file:
    +
    +```php
    +<?php
    +require __DIR__ . '/vendor/autoload.php';
    +
    +use ROCrate\ROCrate;
    +
    +$crate = new ROCrate(__DIR__ . '/../resources', true);
    +```

    Namespaces align with `"ROCrate\\": "src/rocrate/"` in `composer.json`.

    README.md Outdated
    Comment on lines 1 to 55
    # ro-crate-php
    Tool in PHP for manipulating RO-crate objects

    This is a PHP tool to create and manipulate Research Object Crate.

    ## Install

    Install the tool using composer:
    >composer require gigascience/ro-crate-php

    ## Docs
    Inline phpDoc comments are written.

    ## Usage

    Create a new empty crate with the base path set to resources directory in the parent directory:

    > $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', false);

    The `ROCrate` constructor enables the creation of a crate using an existing metadata file:

    > $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', true);

    Add an entity to the crate:
    > // A person
    > <br> $author = $crate->createGenericEntity('#alice', ['Person']);
    > <br> $author->addProperty('name', 'Alice Smith');
    > <br> $author->addProperty('affiliation', 'Institution of Example');
    > <br> // Add the person to the crate
    > <br> \$crate->addEntity($author);
    >
    > // Adds the person as one of the creators of the root data entity, i.e. the dataset being described by the crate
    > <br> $root = $crate->getRootDataset();
    > <br> $root->addPropertyPair('creator', '#alice', true);

    Interact with the crate just like normal objects with methods:
    > \$crate->addEntity($author);
    > <br> \$crate->removeEntity($author->getId());

    RECOMMENDED: Chain up the methods to enhance the compactness of the code when adding/removing properties of an entity:
    > $root->addPropertyPair('creator', '#bob', true)
    > <br> &ensp; ->addPropertyPair('creator', '#cathy')
    > <br> &ensp; ->removePropertyPair('creator', '#alice')
    > <br> &ensp; ->addPropertyPair('creator', '#alice');

    The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.

    ## HTML Rendering
    Use the following code to generate a human-readable HTML preview from the RO-Crate Metadata File.
    > ROCratePreviewGenerator::generatePreview(\_\_DIR\_\_ . '/../resources');

    ## Change Log
    The tool is currently under development.

    ## GigaDB Example
    An example based on GigaDB dataset 102736 is generated using the code in index.php to illustarte how the tool can be utilized. The inforamtion to generate the metadata file is mainly from the *[Website](https://gigadb.org/dataset/102736)*. The example ro-crate-metadata.json and ro-crate-preview.html are stored in the assets directory. To make the file easier to read, only the first 2 files and a zipped file treated as a directory are included, where the remaining files are described using a collective directory construct. To ensure integribility and compatibility with other/downstream applications, the metadata file of this example is imported into **[RoHub](https://www.rohub.org/3543b082-9077-492e-a4c7-a3b7c8bb39e8?activetab=overview)** for testing, where the about property of the metadat descriptor and the id of the root data entity are replaced with ./ to be backward compatible to the RO-Crate 1.1 standard adopted by **[RoHub](https://www.rohub.org/about?what_is_rohub)**. No newline at end of file

    🛠️ Refactor suggestion

    Resolve markdownlint issues (tabs, blockquotes) by adopting fenced code.

    Adopting the fenced code blocks suggested above will also fix MD010 “no-hard-tabs” and remove the unnecessary `>`, `<br>`, `&ensp;`, and backslash escapes throughout the examples.

    🤖 Prompt for AI Agents
    In README.md around lines 1 to 55, the examples use blockquote markers (>) plus
    inline HTML (<br>, &ensp;) and backslash-escaped dollar signs, which triggers
    markdownlint issues (MD010 and others); replace each quoted example block with
    fenced code blocks (triple backticks) specifying the language (e.g., ```php for
    PHP snippets or ```sh for shell commands), remove all leading '>' markers, remove
    in-line HTML tags (&ensp;, <br>) and backslash escapes before $ (use raw $
    inside code fences), and convert any hard tabs to spaces to satisfy MD010.
    

    # ro-crate-php
    Tool in PHP for manipulating RO-crate objects

    This is a PHP tool to create and manipulate Research Object Crate.

    ⚠️ Potential issue

    Use the correct term “RO-Crate” and tighten wording.

    -This is a PHP tool to create and manipulate Research Object Crate.
    +This is a PHP tool to create and manipulate RO‑Crate metadata.
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    This is a PHP tool to create and manipulate Research Object Crate.
    This is a PHP tool to create and manipulate RO-Crate metadata.
    🤖 Prompt for AI Agents
    In README.md around line 3, the wording uses "Research Object Crate" instead of
    the standard term "RO-Crate" and could be tightened; update the sentence to use
    "RO-Crate" (with the hyphen and capitalization) and rephrase for clarity and
    conciseness (e.g., "A PHP tool to create and manipulate RO-Crate.").
    

    Comment on lines +35 to +38
    Interact with the crate just like normal objects with methods:
    > \$crate->addEntity($author);
    > <br> \$crate->removeEntity($author->getId());


    🛠️ Refactor suggestion

    Replace blockquoted lines with a proper code fence.

    -Interact with the crate just like normal objects with methods:
    -> \$crate->addEntity($author);
    -> <br> \$crate->removeEntity($author->getId());
    +Interact with the crate just like normal objects with methods:
    +
    +```php
    +$crate->addEntity($author);
    +$crate->removeEntity($author->getId());
    +```
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~35-~35: There might be a mistake here.
    Context: ...e just like normal objects with methods: > $crate->addEntity($author); >
    $cr...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ... methods: > $crate->addEntity($author); >
    $crate->removeEntity($author->ge...

    (QB_NEW_EN)

    🤖 Prompt for AI Agents
    In README.md around lines 35 to 38, replace the blockquoted example lines and
    embedded HTML (<br> and leading > markers) with a proper fenced code block using
    the php language tag; remove the > prefixes and the <br> tag, wrap the two PHP
    statements inside a ```php ... ``` fence and ensure the closing ``` is present.
    

    Comment on lines +39 to +44
    RECOMMENDED: Chain up the methods to enhance the compactness of the code when adding/removing properties of an entity:
    > $root->addPropertyPair('creator', '#bob', true)
    > <br> &ensp; ->addPropertyPair('creator', '#cathy')
    > <br> &ensp; ->removePropertyPair('creator', '#alice')
    > <br> &ensp; ->addPropertyPair('creator', '#alice');


    🛠️ Refactor suggestion

    Chainable example: use fenced PHP and remove HTML entities.

    -RECOMMENDED: Chain up the methods to enhance the compactness of the code when adding/removing properties of an entity:
    -> $root->addPropertyPair('creator', '#bob', true)
    -> <br> &ensp; ->addPropertyPair('creator', '#cathy')
    -> <br> &ensp; ->removePropertyPair('creator', '#alice')
    -> <br> &ensp; ->addPropertyPair('creator', '#alice');
    +RECOMMENDED: Chain methods to keep entity property updates compact:
    +
    +```php
    +$root->addPropertyPair('creator', '#bob', true)
    +     ->addPropertyPair('creator', '#cathy')
    +     ->removePropertyPair('creator', '#alice')
    +     ->addPropertyPair('creator', '#alice');
    +```
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~39-~39: There might be a mistake here.
    Context: ...adding/removing properties of an entity: > $root->addPropertyPair('creator', '#bob'...

    (QB_NEW_EN)


    [grammar] ~40-~40: There might be a mistake here.
    Context: ...addPropertyPair('creator', '#bob', true) >
      ->addPropertyPair('creator...

    (QB_NEW_EN)


    [grammar] ~41-~41: There might be a mistake here.
    Context: ...; ->addPropertyPair('creator', '#cathy') >
      ->removePropertyPair('crea...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...>removePropertyPair('creator', '#alice') >
      ->addPropertyPair('creator...

    (QB_NEW_EN)

    🤖 Prompt for AI Agents
    In README.md around lines 39 to 44, the chainable example uses HTML line breaks
    and entities; replace that block with a fenced PHP code block (```php ... ```)
    containing the chained method calls without any HTML (<br>, &ensp;) and
    formatted with each chained call on its own indented line, ending with the
    semicolon; ensure the fence markers and language tag are present and there is no
    inline HTML.
    

    > <br> &ensp; ->removePropertyPair('creator', '#alice')
    > <br> &ensp; ->addPropertyPair('creator', '#alice');

    The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.

    ⚠️ Potential issue

    Fix grammar and clarity.

    -The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.
    +The addPropertyPair and removePropertyPair methods abstract away file‑format details. Provide the property key and the value to add or remove. The optional boolean indicates whether the value is a plain literal (false) or an identifier referencing another entity (true). By default, the flag matches the previous value of this property, if any.
    📝 Committable suggestion

    ‼️ IMPORTANT
    Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

    Suggested change
    The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.
    The addPropertyPair and removePropertyPair methods abstract away file-format details. Provide the property key and the value to add or remove. The optional boolean indicates whether the value is a plain literal (false) or an identifier referencing another entity (true). By default, the flag matches the previous value of this property, if any.
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~45-~45: Ensure spelling is correct
    Context: ...h false or as an identifier referencing other entity in the crate. By default, the fl...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

    🤖 Prompt for AI Agents
    In README.md around line 45, the paragraph is grammatically unclear and
    confusing about the addPropertyPair/removePropertyPair behavior; rewrite it to
    be concise and clear: explain that both methods hide file-format details,
    require a property key and a value to add or remove, and accept an optional
    boolean that indicates whether the value is a plain literal (false) or an
    identifier referencing another entity (true); also clarify the default behavior:
    if not provided, the flag preserves the existing value type/formatting for that
    property when present.
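
The flag behaviour described in this comment can be illustrated with a small stand-in class. This is a minimal sketch, not the library's implementation: `SketchEntity` and `getProperties()` are hypothetical names, and the fallback to a plain literal when a property has no previous value is an assumption the README does not confirm.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the library's entity class. It only mimics the
// behaviour described in the README paragraph; it is not the real code.
final class SketchEntity
{
    /** @var array<string, array<int, mixed>> */
    private array $properties = [];

    public function addPropertyPair(string $key, string $value, ?bool $isId = null): self
    {
        // No flag given: follow the kind of the most recent value of this property.
        // Assumption: with no previous value, treat the new value as a literal.
        $isId = $isId ?? $this->previousValueIsReference($key);
        $this->properties[$key][] = $isId ? ['@id' => $value] : $value;

        return $this; // returning the entity is what makes chaining possible
    }

    public function removePropertyPair(string $key, string $value): self
    {
        $kept = array_filter(
            $this->properties[$key] ?? [],
            static fn ($item): bool => (is_array($item) ? $item['@id'] : $item) !== $value
        );
        $this->properties[$key] = array_values($kept);

        return $this;
    }

    public function getProperties(): array
    {
        return $this->properties;
    }

    private function previousValueIsReference(string $key): bool
    {
        $values = $this->properties[$key] ?? [];

        return $values !== [] && is_array($values[array_key_last($values)]);
    }
}

$root = (new SketchEntity())
    ->addPropertyPair('creator', '#bob', true)    // explicit reference
    ->addPropertyPair('creator', '#cathy')        // follows the previous value: reference
    ->addPropertyPair('name', 'Example', false);  // explicit literal
```

In this sketch a reference is stored as `['@id' => ...]` and a literal as a plain string, which mirrors how JSON-LD distinguishes the two.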
    

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 0

    ♻️ Duplicate comments (13)
    resources/ro-crate-preview-out.html (7)

    7-7: Remove empty stylesheet link

    Empty href triggers a useless request. Drop the tag or point to a real CSS file.

    -        <link rel="stylesheet" href="">
    +        <!-- stylesheet intentionally omitted (inline styles below) -->

    228-233: Fill empty headings and names — use filename as fallback

    Two entities render with an empty <h2> and an empty “name”. Use the basename of @id as a safe fallback.

    -                    <h2></h2>
    +                    <h2>results/ERR599171.merged.motus.tsv</h2>
    -                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.motus.tsv</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: </li></ul>
    +                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.motus.tsv</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: results/ERR599171.merged.motus.tsv</li></ul>
    -                    <h2></h2>
    +                    <h2>results/ERR599171.merged.unfiltered_fasta</h2>
    -                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.unfiltered_fasta</li><li><span class="property"> @ type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: </li></ul>
    +                    <ul><li><span class="property"> @id </span>: results/ERR599171.merged.unfiltered_fasta</li><li><span class="property"> @type </span>: File</li><li><span class="property">encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: text/plain</li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: results/ERR599171.merged.unfiltered_fasta</li></ul>

    Please also fix this in the generator so preview artifacts consistently fall back to basename when name is empty.
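
A generator-side fallback could look roughly like the following. This is a sketch under stated assumptions: `previewLabel` is a hypothetical helper name, and entities are assumed to be plain associative arrays with `@id` and `name` keys, which may not match the generator's internal representation.

```php
<?php
declare(strict_types=1);

/**
 * Returns the entity's name, or the last path segment of its @id
 * when the name is missing or blank. Hypothetical helper.
 */
function previewLabel(array $entity): string
{
    $name = is_string($entity['name'] ?? null) ? trim($entity['name']) : '';
    if ($name !== '') {
        return $name;
    }

    $id = (string) ($entity['@id'] ?? '');
    // For URL identifiers, look only at the path so a query string is ignored.
    $path = parse_url($id, PHP_URL_PATH);
    $label = basename(is_string($path) && $path !== '' ? $path : $id);

    return $label !== '' ? $label : $id;
}

echo previewLabel(['@id' => 'results/ERR599171.merged.motus.tsv', 'name' => '']), PHP_EOL;
```

Note that the diff above fills in the full relative path, while this sketch follows the comment's wording and uses the basename; either choice avoids the empty heading.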


    251-295: encodingFormat for .fa.gz/.fasta.gz must be application/gzip

    .gz files aren’t ZIP archives. Use application/gzip (optionally emit contentEncoding: gzip in JSON metadata).

    -...encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/zip
    +...encodingFormat <a href=http://schema.org/encodingFormat> [?] </a> </span>: application/gzip

    Apply to sections:

    • 251–253, 255–256, 258–259, 261–262, 264–265, 267–268, 270–271, 273–274, 276–277, 279–280, 282–283, 288–289, 291–292, 294–295.

    Ensure the same mapping is produced by the metadata generator for corresponding JSON entries.


    303-337: Fix MIME types and SSU/LSU wording in taxonomy-summary

    Correct typos and wrong encodingFormat values for TSV/TXT/HTML/BIOM entries.

    -                    <h2>LSU sequences used for indentification</h2>
    +                    <h2>SSU sequences used for identification</h2>
    -...encodingFormat ...: application/zip
    +...encodingFormat ...: application/gzip
    -...name ...: LSU sequences used for indentification
    +...name ...: SSU sequences used for identification
    -...ERR599171.merged_SSU.fasta.mseq.tsv...</li><li><span class="property">encodingFormat...</span>: application/json-ld
    +...ERR599171.merged_SSU.fasta.mseq.tsv...</li><li><span class="property">encodingFormat...</span>: text/tab-separated-values
    -...ERR599171.merged_SSU.fasta.mseq.txt...</li><li><span class="property">encodingFormat...</span>: application/json-ld
    +...ERR599171.merged_SSU.fasta.mseq.txt...</li><li><span class="property">encodingFormat...</span>: text/plain
    -.../SSU/krona.html...</li><li><span class="property">encodingFormat...</span>: application/html
    +.../SSU/krona.html...</li><li><span class="property">encodingFormat...</span>: text/html
    -...mseq_hdf5.biom...</li><li><span class="property">encodingFormat...</span>: application/json-ld
    +...mseq_hdf5.biom...</li><li><span class="property">encodingFormat...</span>: application/x-hdf5
    -...mseq_json.biom...</li><li><span class="property">encodingFormat...</span>: application/json-ld
    +...mseq_json.biom...</li><li><span class="property">encodingFormat...</span>: application/biom+json

    Mirror the same corrections for LSU entries at 321–337 (gzip for .gz; TSV/TXT/HTML MIME fixes; HDF5/BIOM types).

    Please propagate the same fixes to resources/ro-crate-metadata-out.json.
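
One way to keep the preview and the JSON metadata consistent is to derive encodingFormat from a single lookup keyed on the final file extension. The sketch below assumes a hypothetical helper named `guessEncodingFormat`; the table is illustrative and deliberately leaves out `.biom`, because the extension alone cannot tell the HDF5 variant from the JSON variant.

```php
<?php
declare(strict_types=1);

/**
 * Maps a filename to a MIME type by its final extension, or null if unknown.
 * Hypothetical helper; the table is an illustrative subset.
 */
function guessEncodingFormat(string $filename): ?string
{
    static $byExtension = [
        'gz'   => 'application/gzip',
        'zip'  => 'application/zip',
        'tsv'  => 'text/tab-separated-values',
        'csv'  => 'text/csv',
        'txt'  => 'text/plain',
        'html' => 'text/html',
    ];

    // Only the final extension is used: "x.tsv.gz" is a gzip stream, not a TSV.
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));

    return $byExtension[$extension] ?? null;
}
```

Returning null for unknown extensions leaves the decision to the caller instead of silently emitting a wrong type.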


    356-361: encodingFormat for .tsv.gz must be application/gzip

    Two compressed TSVs incorrectly use application/zip.

    -...ERR599171.merged.hmm.tsv.gz...</li><li><span class="property">encodingFormat...</span>: application/zip
    +...ERR599171.merged.hmm.tsv.gz...</li><li><span class="property">encodingFormat...</span>: application/gzip
    -...ERR599171.merged_CDS.I5.tsv.gz...</li><li><span class="property">encodingFormat...</span>: application/zip
    +...ERR599171.merged_CDS.I5.tsv.gz...</li><li><span class="property">encodingFormat...</span>: application/gzip

    372-379: Fix typos: “Geno Ontology” and “statistcs”

    Use “Gene Ontology” and “statistics”.

    -                    <h2>Geno Ontology summary statistics</h2>
    +                    <h2>Gene Ontology summary statistics</h2>
    -...name ... : Geno Ontology summary statistics
    +...name ... : Gene Ontology summary statistics
    -                    <h2>Pfam summary statistcs</h2>
    +                    <h2>Pfam summary statistics</h2>
    -...name ... : Pfam summary statistcs
    +...name ... : Pfam summary statistics

    395-397: Fix broken list markup for @type and normalize hasPart links

    Keep multiple types within one list item and ensure linked hasPart items are consistent anchors.

    -                    <ul><li><span class="property"> @id </span>: #workflow/metaGOflow</li><li><span class="property"> @type </span>: File</li>,  SoftwareSourceCode</li>,  ComputationalWorkflow</li><li><span class="property">affiliation <a href=http://schema.org/affiliation> [?] </a> </span>: University of Flatland</li><li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>: EMO BON</li><li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>,  TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</li><li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href=#https://www.apache.org/licenses/LICENSE-2.0> Apache License 2.0 </a></li><li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: metaGOflow</li><li><span class="property">url <a href=http://schema.org/url> [?] </a> </span>: <a href=https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0> https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0 </a></li></ul>
    +                    <ul>
    +                      <li><span class="property"> @id </span>: #workflow/metaGOflow</li>
    +                      <li><span class="property"> @type </span>: File, SoftwareSourceCode, ComputationalWorkflow</li>
    +                      <li><span class="property">affiliation <a href=http://schema.org/affiliation> [?] </a> </span>: University of Flatland</li>
    +                      <li><span class="property">author <a href=http://schema.org/author> [?] </a> </span>: EMO BON</li>
    +                      <li><span class="property">hasPart <a href=http://schema.org/hasPart> [?] </a> </span>: <a href="#config.yml">MetaGOflow configuration file</a>, <a href="#ERR599171.yml">TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</a></li>
    +                      <li><span class="property">license <a href=http://schema.org/license> [?] </a> </span>: <a href="#https://www.apache.org/licenses/LICENSE-2.0">Apache License 2.0</a></li>
    +                      <li><span class="property">name <a href=http://schema.org/name> [?] </a> </span>: metaGOflow</li>
    +                      <li><span class="property">url <a href=http://schema.org/url> [?] </a> </span>: <a href="https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0">https://github.com/emo-bon/MetaGOflow/releases/tag/v1.0.0</a></li>
    +                    </ul>
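
On the generator side, the stray `</li>` fragments can be avoided by escaping and joining all values before wrapping them in a single list item. A minimal sketch, assuming a hypothetical `renderPropertyItem` helper and plain string values (links are left out for brevity):

```php
<?php
declare(strict_types=1);

/**
 * Renders one property as a single <li>, joining multiple values with ", ".
 * Hypothetical helper; the real generator may structure this differently.
 *
 * @param string[] $values
 */
function renderPropertyItem(string $property, array $values): string
{
    $escaped = array_map(
        static fn (string $value): string => htmlspecialchars($value, ENT_QUOTES, 'UTF-8'),
        $values
    );

    return sprintf(
        '<li><span class="property"> %s </span>: %s</li>',
        htmlspecialchars($property, ENT_QUOTES, 'UTF-8'),
        implode(', ', $escaped)
    );
}

echo renderPropertyItem('@type', ['File', 'SoftwareSourceCode', 'ComputationalWorkflow']), PHP_EOL;
```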

    Guide.md (6)

    32-32: Fix typo: “downloable PNG” → “downloadable PNG”

    Minor wording correction.

    -... the corresponding downloable PNG.
    +... the corresponding downloadable PNG.

    53-54: Tighten publisher/sdPublisher wording and entity types

    Use “an Organization”, simplify contactPoint guidance, and capitalize mailto consistently.

    -- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**. Then, create a contactPoint entity with this respective ID, and set the contactType, email and identifier. For the case of the example ID, the email and identifier can share a plain string database@gigasciencejournal.com, while the contactType may be a plain string saying the contact of the publisher.
    +- **Publisher and sdPublisher Handling**: Create an Organization entity with the given @id, then set the organization’s name and description. Set contactPoint to a mailto: URL (e.g., mailto:database@gigasciencejournal.com). Create a ContactPoint entity with that @id and set contactType, email, and identifier.

    24-24: Clarify preview filename and detached package wording

    Align with RO‑Crate conventions; make @id/filename relationship explicit.

    -The metadata file always has ro-crate-metadata.json as the @id. The preview file has ro-crate-preview.html as the @id and filename. In detached package, i.e. the metadata file is not within the package, which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g. xxxx can be the dataset ID.
    +The metadata document always has ro-crate-metadata.json as its @id (and filename). The preview file has ro-crate-preview.html as its @id (and filename). In a detached package (i.e., when the metadata file is kept outside the package—common for GigaDB), rename ro-crate-metadata.json to <datasetId>-ro-crate-metadata.json and ensure the @id matches the filename.

    29-29: Split merged bullets and fix ISO 8601 wording in Step 1

    Two bullets are concatenated; dates phrasing is off.

    -- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished, i.e. the date of first publication, and sdDatePublished, i.e. the date on which the current structured data was generated or published. The dates are is ISO 8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI, e.g. **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
    +- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity @id.
    +- **Initialization of the Root Data Entity**: Set @id, name, description, datePublished (date of first publication), and sdDatePublished (date the structured data was generated/published). Dates must be in ISO 8601 format (e.g., YYYY-MM-DD). For GigaDB, the dataset @id must be an absolute URI (e.g., https://gigadb.org/dataset/102736).
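
A date check along these lines could back the ISO 8601 requirement. This sketch only covers the calendar-date form YYYY-MM-DD mentioned in the guide (ISO 8601 allows other forms as well), and `isIsoDate` is a hypothetical helper name.

```php
<?php
declare(strict_types=1);

/**
 * Returns true only for a valid calendar date written exactly as YYYY-MM-DD.
 * Hypothetical helper.
 */
function isIsoDate(string $value): bool
{
    // The leading "!" resets unparsed fields so the time part is always midnight.
    $date = DateTimeImmutable::createFromFormat('!Y-m-d', $value);

    // Round-tripping rejects overflowed dates (2025-02-30) and unpadded ones (2025-8-4).
    return $date !== false && $date->format('Y-m-d') === $value;
}
```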

    41-44: Fix PRONOM example formatting and localhost link

    Correct the JSON‑LD snippet and replace the localhost distribution URL with the public URL.

    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    +- **File**: Create a File entity with an absolute-URI @id (typically a direct download URL for GigaDB files). Then set name, contentSize, and encodingFormat. encodingFormat is a MIME type string (e.g., text/csv). If needed, you can pair it with a PRONOM identifier using a separate IRI:
    +  - encodingFormat: "application/pdf"
    +  - encodingFormatIRI: https://www.nationalarchives.gov.uk/PRONOM/fmt/19
    +  Additionally, include extra information (e.g., data types and file attributes) using the exifData construct.
     - **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +- **Directory/Dataset/zip file**: Create a Dataset entity with an absolute-URI @id that resolves to a listing of the content (e.g., **https://gigadb.org/dataset/view/id/102736/Files_page/4**). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., **https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip**). releaseDate must be ISO 8601. You may also include extra information using the exifData construct.

    56-57: Fix DOI example and capitalize “URL”

    Make value match the DOI @id and use “URL”.

    -- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.
    +- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective @id, then set propertyID, value, and URL. For example, propertyID can be https://registry.identifiers.org/registry/doi when the identifier is a DOI. If the identifier @id is https://doi.org/10.4225/59/59672c09f4a4b, set value to `doi:10.4225/59/59672c09f4a4b`. The URL is typically identical to the identifier’s @id.
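
To keep value and @id from drifting apart as they did in the original example, the value can be derived from the @id rather than typed by hand. A sketch, assuming a hypothetical `doiValueFromId` helper and identifiers of the form `https://doi.org/<prefix>/<suffix>`:

```php
<?php
declare(strict_types=1);

/**
 * Derives the "doi:..." value from a doi.org URL, or returns null
 * when the identifier is not a doi.org URL. Hypothetical helper.
 */
function doiValueFromId(string $id): ?string
{
    // Accepts http or https doi.org URLs with a "10.<registrant>/<suffix>" DOI.
    if (preg_match('#^https?://doi\.org/(10\.\d{4,9}/\S+)$#i', $id, $matches) !== 1) {
        return null;
    }

    return 'doi:' . $matches[1];
}

echo doiValueFromId('https://doi.org/10.4225/59/59672c09f4a4b'), PHP_EOL;
```
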
    🧹 Nitpick comments (1)
    resources/ro-crate-preview-out.html (1)

    185-396: Prefer HTTPS for schema.org references

    All schema.org links use http; switch to https to avoid mixed-content/tooling warnings. Ideally fix at generator level.

    -<a href=http://schema.org/name>
    +<a href=https://schema.org/name>

    If this HTML is generated, update the emitter to use https for all schema.org and related vocab IRIs.
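
If the emitter cannot be changed right away, a post-processing pass over the rendered HTML is one possible stopgap. This is a sketch with a hypothetical helper name; it is meant for the links in the preview page only, not for identifiers inside the JSON-LD metadata.

```php
<?php
declare(strict_types=1);

/**
 * Rewrites http://schema.org/ links in rendered preview HTML to https.
 * Hypothetical helper, intended for the HTML preview only.
 */
function upgradeSchemaOrgLinks(string $html): string
{
    return str_replace('http://schema.org/', 'https://schema.org/', $html);
}
```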

    @coderabbitai coderabbitai bot left a comment

    Actionable comments posted: 2

    ♻️ Duplicate comments (4)
    Guide.md (4)

    53-53: Tighten “Publisher and sdPublisher” phrasing; fix entity names

    Grammar/clarity pass; align with Organization and ContactPoint usage.

    -- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective ID, then set the name and description of the organization, where the name and the description may have to be searched or recorded online. Also, set the contactPoint with usually the email following *mailto:*, e.g. **[mailto:database@gigasciencejournal.com](mailto:database@gigasciencejournal.com)**. Then, create a contactPoint entity with this respective ID, and set the contactType, email and identifier. For the case of the example ID, the email and identifier can share a plain string database@gigasciencejournal.com, while the contactType may be a plain string saying the contact of the publisher.
    +- **Publisher and sdPublisher Handling**: Create an Organization entity with the respective @id, then set the organization’s name and description (these may need to be sourced). Set contactPoint to a mailto: URL, e.g., **mailto:database@gigasciencejournal.com**. Then create a ContactPoint entity with that @id and set contactType, email, and identifier. In the example, email and identifier can both be `database@gigasciencejournal.com`, while contactType can describe the publisher’s contact.

    28-31: Split merged bullets; fix ISO 8601 wording and CC0 label

    The first two bullets are merged; also adjust date wording and license label.

    -## Step 1
    -- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity ID.- **Initialization of the Root Data Entity**: Set ID, name, description, datePublished, i.e. the date of first publication, and sdDatePublished, i.e. the date on which the current structured data was generated or published. The dates are is ISO 8601 standard, e.g. YYYY-MM-DD. For GigaDB, the dataset is most likely to be web-based, the ID has to be an absolute URI, e.g. **[https://gigadb.org/dataset/102736](https://gigadb.org/dataset/102736)**.
    +## Step 1
    +- **Initialization of the Crate**: Create the empty crate, then set the profile to specify the context version and the root data entity @id.
    +- **Initialization of the Root Data Entity**: Set @id, name, description, datePublished (date of first publication), and sdDatePublished (date the structured data was generated/published). Dates must be in ISO 8601 format (e.g., YYYY‑MM‑DD). For GigaDB, the dataset is typically web‑based; the @id must be an absolute URI (e.g., https://gigadb.org/dataset/102736).
     - **Specification of the Components**: Specify the ID of the files, dataset such as zip file using hasPart, possibly using the \# directory construct to collectively describe many files. Refer to **[<b>Step 2</b>](#step-2)** for handling the data entities of these files and datasets and potentially any entities derived from them. Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.
    --  **Specification of the License**: Specify the ID of the license using license, e.g. *[https://creativecommons.org/publicdomain/zero/1.0/](https://creativecommons.org/publicdomain/zero/1.0/)* for the CC0 v1.0 license. Refer to **[<b>Step 3</b>](#step-3)** for handling the contextual entity of the license.
    +-  **Specification of the License**: Specify the ID of the license using license, e.g. *[https://creativecommons.org/publicdomain/zero/1.0/](https://creativecommons.org/publicdomain/zero/1.0/)* for the CC0 1.0 license. Refer to **[<b>Step 3</b>](#step-3)** for handling the contextual entity of the license.
    -Note that metadata file and the preview file, if it exists, are specially treated and not included in hasPart.
    +Note that the metadata file and the preview file (if present) are specially treated and not included in hasPart.

    41-44: Fix PRONOM example formatting and broken distribution link

    JSON/Markdown is malformed; distribution URL points to localhost; remove double period.

    -- **File**: Create a File entity with the respective ID, which has to be an absolute URI for a web-based entity. For a GigaDB dataset, it is most likely web-based, the ID is often selected to be the url that directly downloads the file. Then, we set the name, contentSize and encodingFormat. Note that the contentSize is either in kB or MB. Also, note that the encodingFormat is a plain string xxx/yyy, for instance, text/csv. In some cases that a more informative encodingFormat of the form xxx/yyy followed by a **[PRONOM](https://www.nationalarchives.gov.uk/PRONOM/Default.aspx)** identifier, for example, ["application/pdf", {"@id": **["https://www.nationalarchives.gov.uk/PRONOM/fmt/19"]("https://www.nationalarchives.gov.uk/PRONOM/fmt/19")**}]. Additionally, we can include some extra information including data types and file attributes using the exifData construct.
    +- **File**: Create a File entity with an absolute‑URI @id (for web‑based files, typically a direct download URL). Then set name, contentSize, and encodingFormat. encodingFormat is a MIME type string (e.g., text/csv). If needed, pair it with a PRONOM identifier, for example: `["application/pdf", {"@id": "https://www.nationalarchives.gov.uk/PRONOM/fmt/19"}]`. Additionally, you can include extra information (e.g., data types and file attributes) using the exifData construct.
     - **Directory/Dataset/zip file**: Create a Dataset entity with the respective ID, which has to be an absolute URI. Such URI should resolve to a listing of the content of the directory/dataset/zip file. For a GigaDB dataset, it is most likely a web-based zip file, the ID is often selected to be the url that shows its description, for example, **[https://gigadb.org/dataset/view/id/102736/Files_page/4](https://gigadb.org/dataset/view/id/102736/Files_page/4)**. Then, we set the name, description, distribution and releaseDate. Note that the distribution is the url that downloads the content, for example, **[https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip](http://127.0.0.1:5501/assets/ro-crate-preview.html#https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip)**. Also, note that the releaseDate should be in the ISO 8601 format. Furthermore, we can include some extra information including data types and file attributes using the exifData construct..
    +- **Directory/Dataset/zip file**: Create a Dataset entity with an absolute‑URI @id that resolves to a content listing. For GigaDB, the @id is often the dataset page (e.g., **https://gigadb.org/dataset/view/id/102736/Files_page/4**). Then set name, description, distribution, and releaseDate. distribution should be the direct download URL (e.g., **https://s3.ap-northeast-1.wasabisys.com/gigadb-datasets/live/pub/10.5524/102001_103000/102736/BoostNano-master.zip**). releaseDate must be ISO 8601. You may include extra information using the exifData construct.
     - **Collective Construct with \#**: In case that we prefer describing some files or/and directories collectively, we create a Dataset entity with a local identifier as the ID, for example, \#other-files. Then, we set the name and description.
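For reference, the File guidance above can be sketched as the JSON-LD shape it is meant to produce. This is a minimal sketch using plain PHP arrays rather than the ro-crate-php API; the helper name `buildFileEntity`, the example.org URL and the file name are assumptions made for illustration only.

```php
<?php
// Illustrative only: plain arrays, not the ro-crate-php API.
// buildFileEntity() is a hypothetical helper; the URL and name are placeholders.
function buildFileEntity(string $id, string $name, string $mime, ?string $pronomUri = null): array
{
    // encodingFormat is a plain MIME string, optionally paired with a PRONOM reference.
    $encodingFormat = $pronomUri === null
        ? $mime
        : [$mime, ['@id' => $pronomUri]];

    return [
        '@id' => $id,
        '@type' => 'File',
        'name' => $name,
        'encodingFormat' => $encodingFormat,
    ];
}

$file = buildFileEntity(
    'https://example.org/downloads/manual.pdf',
    'User manual',
    'application/pdf',
    'https://www.nationalarchives.gov.uk/PRONOM/fmt/19'
);

echo json_encode($file, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Without the fourth argument, the helper falls back to the plain MIME string form (for example `text/csv`).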

    56-57: Fix DOI example inconsistency; capitalize “URL”

    Align the DOI value with the DOI URL; improve property list commas.

    -- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective ID, then set the propertyID, value and url. For example, the propertyID is **[https://registry.identifiers.org/registry/doi](https://registry.identifiers.org/registry/doi)** given the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.org/10.4225/59/59672c09f4a4b](https://doi.org/10.4225/59/59672c09f4a4b)**, the value is set to be a plain string of doi:10.5524/102736. The url is often chosen to be identical to the ID of the identifier.
    +- **Identifier Handling**: Create a contextual entity of type PropertyValue with the respective @id, then set propertyID, value, and URL. For example, propertyID can be **https://registry.identifiers.org/registry/doi** when the identifier is a DOI. If the identifier @id is **https://doi.org/10.4225/59/59672c09f4a4b**, set value to the plain string `doi:10.4225/59/59672c09f4a4b`. The URL is typically identical to the identifier’s @id.
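One way to keep the DOI URL and the `value` string from drifting apart is to derive the value from the URL. The following is a minimal sketch with plain PHP arrays, not the ro-crate-php API; `buildDoiIdentifier` is a hypothetical helper name.

```php
<?php
// Illustrative only: buildDoiIdentifier() is a hypothetical helper.
function buildDoiIdentifier(string $doiUrl): array
{
    $prefix = 'https://doi.org/';
    if (strncmp($doiUrl, $prefix, strlen($prefix)) !== 0) {
        throw new InvalidArgumentException("Not a DOI URL: $doiUrl");
    }

    // Derive the DOI from the URL so that @id, url and value always agree.
    $doi = substr($doiUrl, strlen($prefix));

    return [
        '@id' => $doiUrl,
        '@type' => 'PropertyValue',
        'propertyID' => 'https://registry.identifiers.org/registry/doi',
        'value' => 'doi:' . $doi,
        'url' => $doiUrl,
    ];
}

$identifier = buildDoiIdentifier('https://doi.org/10.4225/59/59672c09f4a4b');
echo $identifier['value'], PHP_EOL;
```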
    🧹 Nitpick comments (4)
    Guide.md (4)

    15-23: Polish “Note” section for precision

    Streamline phrasing; keep the add/removePropertyPair flags explicit.

    -## Note
    -The general rule is that we use the @id construct (<b>true</b> flag if using the add/removePropertyPair methods) when referring to another entity, we otherwise use a plain literal (<b>false</b> flag if using the add/removePropertyPair methods). There are exceptions for specific constructs not following the rules.
    -
    -Another reminder is to add the entity to the crate after the creation of the entity.
    -
    -Also, only one entity with the same ID has to be created.
    -
    -In addition, name of an entity should be human-readable if it exists.
    +## Note
    +Use @id when referencing another entity (set the add/removePropertyPair flag to <b>true</b>); otherwise use a plain literal (<b>false</b>). Some constructs are exceptions.
    +
    +Add each entity to the crate after creating it.
    +
    +Do not create multiple entities with the same @id.
    +
    +The name property should be human‑readable when present.
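The reference-versus-literal rule in the Note can be illustrated by the JSON-LD each flag value is expected to yield. This is a sketch under the assumption that `true` produces an `{"@id": ...}` object and `false` a plain string; `encodePropertyValue` is a hypothetical helper, not the library's implementation.

```php
<?php
// Illustrative only: mirrors the true/false flag described for add/removePropertyPair.
function encodePropertyValue(string $value, bool $isReference)
{
    // A reference to another entity is wrapped in an @id object; a literal stays a string.
    return $isReference ? ['@id' => $value] : $value;
}

$person = [
    '@id' => '#alice',
    '@type' => 'Person',
    'name' => encodePropertyValue('Alice Smith', false),
];

$root = [
    '@id' => './',
    '@type' => 'Dataset',
    'creator' => encodePropertyValue('#alice', true),
];

echo json_encode([$person, $root], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```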

    24-24: Clarify detached package behavior; ensure @id matches filename

    Add explicit requirement to align @id with the renamed filename.

    -The metadata file always has ro-crate-metadata.json as the @id. The preview file has ro-crate-preview.html as the @id and filename. In detached package, i.e. the metadata file is not within the package, which is most likely for GigaDB, the filename ro-crate-metadata.json is renamed to xxxx-ro-crate-metadata.json, e.g. xxxx can be the dataset ID.
    +The metadata document always has ro-crate-metadata.json as its @id (and filename). The preview file has ro-crate-preview.html as its @id (and filename). In a detached package (i.e., the metadata file is not within the package—common for GigaDB), rename the metadata filename to `<datasetId>-ro-crate-metadata.json` and ensure the @id matches the filename.

    63-65: Grammar/precision in citation section

    Remove redundancy; fix “the another crate”; clarify multiplicity of authors.

    -- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conformsTo to be the version-less generic RO-Crate profile **[https://w3id.org/ro/crate](https://w3id.org/ro/crate)**. Note that we do not set hasPart and usually other properties for the entity representing the another crate, since its content and further metadata is available from its own RO-Crate Metadata Document.
    --- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective ID. then set the name. Also, set the author, identifier, issn, journal, datePublished and creditText, if any. Note that author can has more than one value and datePublished should be in ISO 8601 format.
    +- **Another Dataset/Crate**: Create a Publication entity of type CreativeWork with the respective @id and add `additionalType: "Dataset"`. Then set conformsTo to the version‑less RO‑Crate profile **https://w3id.org/ro/crate**. Do not set hasPart (and usually other properties) for the referenced crate, since its content and metadata are available from its own RO‑Crate Metadata Document.
    +- **A Paper**: Create a Publication entity of type ScholarlyArticle with the respective @id. Set name, author, identifier, issn, journal, datePublished, and creditText (if any). author may have multiple values; datePublished must be ISO 8601.

    67-68: Minor grammar and capitalization (ROR)

    Tighten phrasing and use “ROR”.

    -- **Author Handling**: Create a Person entity with the respective ID, then set the affiliation and the name. The affiliation should refer to an Organization entity. In case that such entity does not exist yet, we create an Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online. For a GigaDB dataset, a ror is often picked as the ID for the organization, for instance, **[https://ror.org/01ej9dk98](https://ror.org/01ej9dk98)**.
    +- **Author Handling**: Create a Person entity with the respective @id, then set affiliation and name. affiliation should reference an Organization entity. If that Organization does not exist yet, create it and set its name (which may need to be sourced). For a GigaDB dataset, a ROR is often used as the organization @id (e.g., **https://ror.org/01ej9dk98**).
    📜 Review details

    Configuration used: CodeRabbit UI

    Review profile: CHILL

    Plan: Pro


    📥 Commits

    Reviewing files that changed from the base of the PR and between 0b0c7e6 and 464119b.

    📒 Files selected for processing (1)
    • Guide.md (1 hunks)
    🧰 Additional context used
    🪛 LanguageTool
    Guide.md

    [grammar] ~2-~2: There might be a mistake here.
    Context: # Usage Guide for ro-crate-php Below are some notes to pay attention to...

    (QB_NEW_EN)


    [grammar] ~3-~3: There might be a mistake here.
    Context: ...e is 1.2-oriented, and some constraints in standard of RO-Crate 1.1 are no longer ...

    (QB_NEW_EN)


    [grammar] ~7-~7: There might be a mistake here.
    Context: ...n the standard of 1.1. --- ## Overview This is a PHP tool to create and manipul...

    (QB_NEW_EN)


    [grammar] ~8-~8: There might be a mistake here.
    Context: ... not be perfect but ought to be able to provide sufficient description of the dataset. ...

    (QB_NEW_EN)


    [grammar] ~10-~10: There might be a mistake here.
    Context: ...he repository above. Version: [1.0] Last Updated: [2025-08-18] --- ## No...

    (QB_NEW_EN)


    [grammar] ~22-~22: There might be a mistake here.
    Context: ... same ID has to be created. In addition, name of an entity should be human-reada...

    (QB_NEW_EN)


    [grammar] ~24-~24: There might be a mistake here.
    Context: ...e-preview.html as the @id and filename. In detached package, i.e. the metadata fil...

    (QB_NEW_EN)


    [grammar] ~28-~28: There might be a mistake here.
    Context: ...x can be the dataset ID. --- ## Step 1 - Initialization of the Crate: Create th...

    (QB_NEW_EN)


    [grammar] ~29-~29: There might be a mistake here.
    Context: ...6](https://gigadb.org/dataset/102736)**. - Specification of the Components: Speci...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ecify the ID of the files, dataset such as zip file using hasPart, possibly using ...

    (QB_NEW_EN)


    [style] ~30-~30: Try moving the adverb to make the sentence clearer.
    Context: ...asPart, possibly using the # directory construct to collectively describe many files. Refer to Step 2 ...

    (SPLIT_INFINITIVE)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...ly any entities derived from them. Note that metadata file and the preview file, if ...

    (QB_NEW_EN)


    [grammar] ~30-~30: There might be a mistake here.
    Context: ...lly treated and not included in hasPart. - Specification of the License: Specify ...

    (QB_NEW_EN)


    [grammar] ~35-~35: There might be a mistake here.
    Context: ...ation used with different file entities. - Specification of Citation: In case the...

    (QB_NEW_EN)


    [grammar] ~36-~36: There might be a mistake here.
    Context: ...as to be a URL (for example a DOI URL). In case of citing another dataset/crate, t...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...n string that concatenates all keywords with comma as the delimiter. As a special co...

    (QB_NEW_EN)


    [grammar] ~39-~39: There might be a mistake here.
    Context: ...pecial construct, together with the use of keywords property, we have to specify t...

    (QB_NEW_EN)


    [grammar] ~42-~42: There might be a mistake here.
    Context: ...attributes using the exifData construct. - Directory/Dataset/zip file: Create a D...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...ve ID, which has to be an absolute URI. Such URI should resolve to a listing of the ...

    (QB_NEW_EN)


    [grammar] ~43-~43: There might be a mistake here.
    Context: ...attributes using the exifData construct.. - Collective Construct with #: In case ...

    (QB_NEW_EN)


    [grammar] ~49-~49: There might be a mistake here.
    Context: ... searched or recorded online. ## Step 4 - Thumbnail Handling: When the thumbnail...

    (QB_NEW_EN)


    [grammar] ~52-~52: There might be a mistake here.
    Context: ...ntity with the respective ID. ## Step 5 - Publisher and sdPublisher Handling: Cr...

    (QB_NEW_EN)


    [grammar] ~55-~55: There might be a mistake here.
    Context: ...the contact of the publisher. ## Step 6 - Identifier Handling: Create a contextu...

    (QB_NEW_EN)


    [grammar] ~56-~56: There might be a mistake here.
    Context: ...n the ID of the identifier being a doi. In case of a doi's ID of **[https://doi.or...

    (QB_NEW_EN)


    [grammar] ~58-~58: There might be a mistake here.
    Context: ... to the ID of the identifier. ## Step 7 - exifData Handling: Create a contextual...

    (QB_NEW_EN)


    [style] ~63-~63: This phrase might be redundant. Consider either removing or replacing the adjective ‘additional’.
    Context: ...pe CreativeWork with the respective ID, add an additional type of Dataset. Then, set the property conf...

    (ADD_AN_ADDITIONAL)


    [grammar] ~63-~63: Ensure spelling is correct
    Context: ...perties for the entity representing the another crate, since its content and further me...

    (QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


    [grammar] ~63-~63: There might be a mistake here.
    Context: ...from its own RO-Crate Metadata Document. -- A Paper: Create a Publication ent...

    (QB_NEW_EN)


    [grammar] ~67-~67: There might be a mistake here.
    Context: ...to an Organization entity. In case that such entity does not exist yet, we create an...

    (QB_NEW_EN)


    [grammar] ~69-~69: There might be a mistake here.
    Context: ...ttps://ror.org/01ej9dk98)**. ## Step 10 - Funder Handling: Create an Organizatio...

    (QB_NEW_EN)


    [grammar] ~72-~72: There might be a mistake here.
    Context: ...he Awardee and the Award ID. ## Step 11 - About Handling: If the respective ID i...

    (QB_NEW_EN)

    🪛 markdownlint-cli2 (0.17.2)
    Guide.md

    53-53: Bare URL used

    (MD034, no-bare-urls)

    Comment on lines +7 to +9
    ## Overview
    This is a PHP tool to create and manipulate Research Object Crate. Please refer to the repository's *[README.md](https://github.com/gigascience/ro-crate-php/tree/main)* for more details. Below are the high-level steps instructing the creation of the metadata file for a GigaDB dataset from scratch. The created file may not be perfect but ought to be able to provide sufficient description of the dataset. An example created following the flow is in the assets directory of the repository above.


    🛠️ Refactor suggestion

    Tighten overview; fix pluralization

    Improve clarity and grammar; “Research Object Crate” → “Research Object Crates”.

    -## Overview
    -This is a PHP tool to create and manipulate Research Object Crate. Please refer to the repository's *[README.md](https://github.com/gigascience/ro-crate-php/tree/main)* for more details. Below are the high-level steps instructing the creation of the metadata file for a GigaDB dataset from scratch. The created file may not be perfect but ought to be able to provide sufficient description of the dataset. An example created following the flow is in the assets directory of the repository above.
    +## Overview
    +This is a PHP tool to create and manipulate Research Object Crates. See the repository *[README.md](https://github.com/gigascience/ro-crate-php/tree/main)* for details. Below are high‑level steps for creating a GigaDB dataset’s metadata from scratch. The result may not be perfect but should sufficiently describe the dataset. An example following this flow is in the repository’s assets directory.
    🧰 Tools
    🪛 LanguageTool

    [grammar] ~7-~7: There might be a mistake here.
    Context: ...n the standard of 1.1. --- ## Overview This is a PHP tool to create and manipul...

    (QB_NEW_EN)


    [grammar] ~8-~8: There might be a mistake here.
    Context: ... not be perfect but ought to be able to provide sufficient description of the dataset. ...

    (QB_NEW_EN)

    🤖 Prompt for AI Agents
    In Guide.md around lines 7 to 9, the overview sentence is awkward and uses the
    singular “Research Object Crate”; update it to be tighter and pluralized.
    Replace the sentence "This is a PHP tool to create and manipulate Research
    Object Crate." with a concise version such as "This PHP tool creates and
    manipulates Research Object Crates." and ensure surrounding sentences flow
    grammatically (adjust articles/verbs if needed).
    

    - **Author Handling**: Create a Person entity with the respective ID, then set the affiliation and the name. The affiliation should refer to an Organization entity. In case that such entity does not exist yet, we create an Organization entity with the respective ID, then set the name, where the name may have to be searched or recorded online. For a GigaDB dataset, a ror is often picked as the ID for the organization, for instance, **[https://ror.org/01ej9dk98](https://ror.org/01ej9dk98)**.

    ## Step 10
    - **Funder Handling**: Create an Organization entity with the respective ID, then set the identifier, name and description. The identifier is always to be the same as the ID, and the description is Funding Body in this case. Additionally, we can use the exifData construct to include the information regarding the Awardee and the Award ID.

    💡 Verification agent

    🧩 Analysis chain

    Verify “identifier is always the same as ID” guidance

    Schema.org/Organization.identifier is typically a PropertyValue or URL; equating it to @id may be misleading. Please confirm intended modeling.


    🌐 Web query:

    Is it correct per schema.org to set Organization.identifier equal to the entity’s @id URL, or should identifier be a PropertyValue (or URL) that may differ from @id? Provide authoritative references.
    

    💡 Result:

    Short answer: Use @id for the canonical URI of the Thing; use identifier when you need to supply an explicit identifier value (text, URL, or a PropertyValue with propertyID/value) that may be different from @id (e.g., company registration numbers, DOIs, SKUs). Schema.org recommends using the built‑in URI mechanisms (itemid/@id) for canonical URIs but supports identifier/PropertyValue for additional or non‑URI identifiers. See schema.org docs. [1][2][3]

    References

    • Schema.org Data model — notes on identifier vs built‑in URI (::prefer built‑in @id/itemid for canonical URIs; use identifier/PropertyValue for other identifier schemes). [1]
    • schema.org property "identifier" — types allowed and examples showing PropertyValue (propertyID + value) and URL/text usage. [2]
    • schema.org type "PropertyValue" — fields (propertyID, value) and guidance for representing identifier types. [3]

    Sources
    [1] https://schema.org/docs/datamodel.html
    [2] https://schema.org/identifier
    [3] https://schema.org/PropertyValue


    Clarify Organization.identifier usage in Guide.md line 70
    Replace “The identifier is always to be the same as the ID” with guidance that @id (itemid) serves as the canonical URI, while the identifier property is intended for additional or alternate identifiers (e.g., a PropertyValue, URL, or text) and may differ from @id (see schema.org Data model and identifier docs).
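To make the distinction concrete, here is a minimal sketch of a funder whose `identifier` points at a separate PropertyValue entity instead of repeating `@id`. It uses plain PHP arrays, not the ro-crate-php API; `buildFunder`, the example.org URLs, the organization name and the registry value are all hypothetical placeholders.

```php
<?php
// Illustrative only: every name, URL and value below is a placeholder.
function buildFunder(string $id, string $name, ?string $identifierRef = null): array
{
    $funder = [
        '@id' => $id,                 // canonical URI of the organization
        '@type' => 'Organization',
        'name' => $name,
        'description' => 'Funding Body',
    ];
    if ($identifierRef !== null) {
        // identifier carries an additional or alternate ID and may differ from @id.
        $funder['identifier'] = ['@id' => $identifierRef];
    }
    return $funder;
}

// The alternate identifier lives in its own contextual entity.
$registryId = [
    '@id' => '#funder-registry-id',
    '@type' => 'PropertyValue',
    'propertyID' => 'https://example.org/registry/funder-id',
    'value' => 'EX-12345',
];

$funder = buildFunder('https://example.org/org/funder-1', 'Example Funding Agency', $registryId['@id']);

echo json_encode([$funder, $registryId], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

When no alternate identifier exists, omitting the third argument leaves `identifier` unset and `@id` alone identifies the organization.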

    🤖 Prompt for AI Agents
    In Guide.md around line 70, the sentence "The identifier is always to be the
    same as the ID" is incorrect; update the guidance to state that the entity's @id
    (itemid) is the canonical URI, while the identifier property is for additional
    or alternate identifiers (e.g., PropertyValue, URL, or plain text) and may
    differ from @id; replace the sentence with a concise note referencing
    schema.org's data model/identifier semantics and give an example usage: use @id
    as the canonical URI and populate identifier when you need external IDs or
    alternate identifiers.
    

    Collaborator

    @kencho51 kencho51 left a comment


    Hi @Alex-1919,

    issue: I got this error when I tried to execute the unit test:

     % docker-compose run test
    WARN[0000] /Volumes/kencho/ro-crate-php/docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion 
    [+] Creating 1/0
     ✔ Container ro-crate-php-php-app-1  Running                                                                                                                                                                                  0.0s 
    PHPUnit 11.5.32 by Sebastian Bergmann and contributors.
    
    Runtime:       PHP 8.2.29
    
    ........................................Successful Creation of Preview file..F......                 49 / 49 (100%)
    
    Time: 00:00.187, Memory: 12.00 MB
    
    There was 1 failure:
    
    1) Tests\ROCrateTest::testCreationFromExisting
    Failed asserting that two strings are equal.
    --- Expected
    +++ Actual
    @@ @@
    -'./'
    +'https://gigadb.org/dataset/102736'
    
    /var/www/tests/ROCrateTest.php:70
    
    FAILURES!
    Tests: 49, Assertions: 159, Failures: 1.
    

    issue: There is no clear documentation on how to spin up the php-app. Here is what I did:

    % docker-compose build php-app
    % docker-compose exec -T php-app composer install
    % docker-compose run test
    

    question: After the app is up, visiting http://localhost:8080/ returns the crate for dataset 102736. What should we do if we want to create a crate for another DOI?

    suggestion: A CHANGELOG.md should be created, following https://keepachangelog.com/en/1.0.0/

    @Alex-1919
    Collaborator Author

    Hi @kencho51,

    Regarding the 1st issue (the error when executing the unit tests): this is not an actual error. You can comment the test out to avoid it. I pulled the repository to my local machine and ran the unit tests using the command below:

    docker-compose run test

    and all the unit tests passed. I then ran the code in index.php for the example dataset 102736 by starting the php-app. Running the tests again after that produces the failure you pointed out: starting the php-app overwrites the ro-crate-metadata.json file in the resources directory, and that particular test relies on the original ro-crate-metadata.json file.

    Regarding the 2nd issue (how to start the php-app): after pulling the repository to the local device, I do the following:

    1. Run the command below to create the autoload files in the vendor directory on my local machine

    composer update

    2. Run the command below to start the php-app

    docker-compose up -d --build

    3. Refresh the page http://localhost:8080/ to ensure that the code in index.php is run
    4. To stop the php-app, run the command below

    docker-compose stop

    Regarding the question about creating a crate for other GigaDB datasets with the tool: follow the steps in Guide.md, using the information on the GigaDB website for that dataset. The example code in index.php for the dataset 102736 follows the steps in Guide.md as well.

    Regarding the suggestion to create a CHANGELOG.md, I will add that later.
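The test failure discussed above comes from a test reading a file that another run overwrites. One possible way to avoid this, sketched here under the assumption that the test can work on a copy, is to copy the fixture into a fresh temporary directory first. `copyFixtureToTempDir` is a hypothetical helper, and the stand-in fixture content below is made up for the demonstration.

```php
<?php
// Illustrative only: copyFixtureToTempDir() is a hypothetical helper.
function copyFixtureToTempDir(string $fixturePath): string
{
    // A unique directory per call keeps runs from interfering with each other.
    $dir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'ro-crate-test-' . bin2hex(random_bytes(4));
    if (!mkdir($dir, 0777, true)) {
        throw new RuntimeException("Cannot create $dir");
    }
    $target = $dir . DIRECTORY_SEPARATOR . basename($fixturePath);
    if (!copy($fixturePath, $target)) {
        throw new RuntimeException("Cannot copy $fixturePath");
    }
    return $target;
}

// Demonstration with a stand-in fixture file.
$fixture = tempnam(sys_get_temp_dir(), 'fixture');
file_put_contents($fixture, '{"@graph": [{"@id": "./"}]}');

// Writes go to the working copy only.
$workingCopy = copyFixtureToTempDir($fixture);
file_put_contents($workingCopy, '{"@graph": [{"@id": "https://gigadb.org/dataset/102736"}]}');

echo file_get_contents($fixture), PHP_EOL; // the original fixture is unchanged
```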


    @coderabbitai coderabbitai bot left a comment


    Actionable comments posted: 2

    ♻️ Duplicate comments (14)
    README.md (4)

    3-3: Use the standard term “RO‑Crate”.

    Replace “Research Object Crate” with “RO‑Crate” for accuracy and consistency.

    -This is a PHP tool to create and manipulate Research Object Crate.
    +This is a PHP tool to create and manipulate RO‑Crate metadata.

    45-45: Tighten wording for addPropertyPair/removePropertyPair paragraph.

    Improve clarity and fix small grammar issues.

    -The methods of addPropertyPair and removePropertyPair abstract away the details about the file formatting. The user only has to provide the key for the property of the entity and the value to be added to or removed from the property. An optional boolean argument is to tell whether the value should be treated as a plain literal with false or as an identifier referencing other entity in the crate. By default, the flag is set to be comply with the previous value of this property if any.
    +The addPropertyPair and removePropertyPair methods hide file‑format details. Provide the property key and the value to add or remove. The optional boolean indicates whether the value is a plain literal (false) or an identifier referencing another entity (true). By default, the flag matches the previous value type for that property, if any.

    54-55: Fix typos and readability in the GigaDB example paragraph.

    Multiple misspellings (“illustarte”, “inforamtion”, “metadat”, “integribility”) and run‑ons.

    -An example based on GigaDB dataset 102736 is generated using the code in index.php to illustarte how the tool can be utilized with the general steps in **Guide.md** in the repository. The inforamtion to generate the metadata file is mainly from the *[Website](https://gigadb.org/dataset/102736)*. The example ro-crate-metadata.json and ro-crate-preview.html are stored in the assets directory. To make the file easier to read, only the first 2 files and a zipped file treated as a directory are included, where the remaining files are described using a collective directory construct. To ensure integribility and compatibility with other/downstream applications, the metadata file of this example is imported into **[RoHub](https://www.rohub.org/3543b082-9077-492e-a4c7-a3b7c8bb39e8?activetab=overview)** for testing, where the about property of the metadat descriptor and the id of the root data entity are replaced with ./ to be backward compatible to the RO-Crate 1.1 standard adopted by **[RoHub](https://www.rohub.org/about?what_is_rohub)**.
    +An example based on GigaDB dataset 102736 is generated using the code in `index.php` to illustrate how the tool can be used (see Guide.md). The information is mainly from the dataset website. The example `ro-crate-metadata.json` and `ro-crate-preview.html` are stored in the `assets/` directory. For readability, only the first two files and a ZIP file (treated as a directory) are included; remaining files are represented using a collective directory construct. For compatibility with downstream applications and RO‑Crate 1.1 (as used by RoHub), the `about` of the metadata descriptor and the ID of the root data entity are set to `./`.

    15-34: Convert examples to fenced PHP blocks; remove HTML escapes/tabs and add namespaces.

    Switch from blockquotes/`<br>`/escaped dollars to fenced code. Also show correct PSR‑4 imports to avoid copy‑paste errors.

    -Create a new empty crate with the base path set to resources directory in the parent directory:
    -
    -> $crate = new  ROCrate(\_\_DIR\_\_  .  '/../resources', false);
    +Create a new empty crate with the base path set to the resources directory in the parent directory:
    +
    +```php
    +use ROCrate\ROCrate;
    +
    +$crate = new ROCrate(__DIR__ . '/../resources', false);
    +```
    @@
    -The `ROCrate` constructor enables the creation of a crate using an existing metadata file:
    -
    -> $crate = new ROCrate(\_\_DIR\_\_ . '/../resources', true);
    +Create a crate from an existing metadata file:
    +
    +```php
    +use ROCrate\ROCrate;
    +
    +$crate = new ROCrate(__DIR__ . '/../resources', true);
    +```
    @@
    -Add an entity to the crate:
    -> // A person
    -> <br> $author = $crate->createGenericEntity('#alice', ['Person']);
    -> <br> $author->addProperty('name', 'Alice Smith');
    -> <br> $author->addProperty('affiliation', 'Institution of Example');
    -> <br> // Add the person to the crate
    -> <br> \$crate->addEntity($author);
    ->
    -> // Adds the person as one of the creators of the root data entity, i.e. the dataset being described by the crate
    -> <br> $root = $crate->getRootDataset();
    -> <br> $root->addPropertyPair('creator', '#alice', true);
    +Add an entity to the crate:
    +
    +```php
    +$author = $crate->createGenericEntity('#alice', ['Person']);
    +$author->addProperty('name', 'Alice Smith');
    +$author->addProperty('affiliation', 'Institution of Example');
    +$crate->addEntity($author);
    +
    +$root = $crate->getRootDataset();
    +$root->addPropertyPair('creator', '#alice', true);
    +```

    Also applies to: 35-44

    resources/ro-crate-preview.html (4)

    7-7: Remove empty stylesheet link.

    Avoid a spurious request to the same page as CSS.

    -        <link rel="stylesheet" href="">
    +        <!-- no external stylesheet; styles are embedded -->

    186-187: Fix invalid list markup for hasPart and broken fragment hrefs.

    Anchors render outside the list item and several hrefs use “##id”. Keep all hasPart links within the same `<li>` and use href="#id".

    -... <span class="property">hasPart ...</span>: <a href=#https://...> readme_102736.txt </a></li>,  <a href=##...> other files </a><li><span class="property">author ...
    +... <span class="property">hasPart ...</span>:
    +  <a href="#https://...">readme_102736.txt</a>,
    +  <a href="#https://gigadb.org/dataset/view/id/102736/Files_page/4">BoostNano-master</a>,
    +  <a href="#other-files">other files</a>
    +  </li><li><span class="property">author ...

    Also consider rendering hasPart values as a nested list to improve accessibility.


      197-197: Correct MIME type for text file.

      Use text/plain, not text/txt.

      -                    <li><span class="property">encodingFormat ... </span>: text/txt</li>
      +                    <li><span class="property">encodingFormat ... </span>: text/plain</li>

      207-213: Sanitize section ids: remove leading ‘#’ in id attributes.

      HTML id values should not include ‘#’; keep ‘#...’ only in displayed @id text.

      -                            <section id="#oneExtra">
      +                            <section id="oneExtra">
      @@
      -                    <ul><li><span class="property"> @id </span>: #oneExtra</li>
      +                    <ul><li><span class="property"> @id </span>: #oneExtra</li>

      Apply similarly for: twoExtra, zipExtra, other-files, Xuan_Yang, Benjamin_Reames, awardee, awardId, datasetTypes, additionalInfo1–8, etc. Update all internal links to use href="#oneExtra" style.
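A generator-side fix could normalize ids in one place. The sketch below assumes the section id is the entity `@id` with any leading `#` characters stripped, and that internal links prepend a single `#`. `sectionId` and `sectionLink` are hypothetical helper names, not functions of ROCratePreviewGenerator.

```php
<?php
// Illustrative only: hypothetical helpers for building ids and internal links.
function sectionId(string $entityId): string
{
    // Strip every leading '#', which also repairs values such as '##other-files'.
    return ltrim($entityId, '#');
}

function sectionLink(string $entityId, string $label): string
{
    $href = '#' . sectionId($entityId);
    return '<a href="' . htmlspecialchars($href, ENT_QUOTES) . '">'
        . htmlspecialchars($label, ENT_QUOTES) . '</a>';
}

echo '<section id="' . htmlspecialchars(sectionId('#oneExtra'), ENT_QUOTES) . '">', PHP_EOL;
echo sectionLink('#other-files', 'other files'), PHP_EOL;
```

The displayed `@id` text can still show the original `#oneExtra` value; only the `id` attribute and the `href` are normalized.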

      resources/ro-crate-preview-out.html (6)

    7-7: Remove empty stylesheet link.

    Prevents a redundant failing CSS request.

    -        <link rel="stylesheet" href="">
    +        <!-- styles are embedded -->

    371-379: Fix typos in labels.

    “Geno” → “Gene”; “Kegg” → “KEGG”; “statistcs” → “statistics”.

    -<h2>Geno Ontology summary statistics</h2>
    +<h2>Gene Ontology summary statistics</h2>
    @@
    -<h2>Kegg Ontology summary statistics</h2>
    +<h2>KEGG ontology summary statistics</h2>
    @@
    -<h2>Pfam summary statistcs</h2>
    +<h2>Pfam summary statistics</h2>

    Also update the corresponding “name” values.


    227-233: Headings render empty; add name/@id fallback in generator.

      If name is empty, fall back to @id (basename) for the heading and “name”.

    Apply in ROCratePreviewGenerator: use a helper like displayName($entity) that returns trim(name) or basename(@id); then render that in titles and name fields.
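The suggested helper could look roughly like the following. This is a sketch that assumes an entity is available as an associative array with string `name` and `@id` values; the real generator's entity representation may differ.

```php
<?php
// Illustrative only: assumes entities are associative arrays with string values.
function displayName(array $entity): string
{
    $name = trim((string) ($entity['name'] ?? ''));
    if ($name !== '') {
        return $name;
    }

    // Fall back to the last path segment of @id, or to @id itself if that is empty.
    $id = (string) ($entity['@id'] ?? '');
    $base = basename(rtrim($id, '/'));
    return $base !== '' ? $base : $id;
}

echo displayName(['@id' => '#workflow/metaGOflow', 'name' => '']), PHP_EOL;
echo displayName(['@id' => 'config.yml', 'name' => 'MetaGOflow configuration file']), PHP_EOL;
```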


    303-337: Fix incorrect MIME types for TSV/TXT/HTML/BIOM.

    Use standard media types.

    -<li><span class="property">encodingFormat ...</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat ...</span>: text/tab-separated-values</li>   <!-- *.tsv -->
    @@
    -<li><span class="property">encodingFormat ...</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat ...</span>: text/plain</li>                  <!-- *.txt -->
    @@
    -<li><span class="property">encodingFormat ...</span>: application/html</li>
    +<li><span class="property">encodingFormat ...</span>: text/html</li>                   <!-- *.html -->
    @@
    -<li><span class="property">encodingFormat ...</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat ...</span>: application/x-hdf5</li>          <!-- *.mseq_hdf5.biom -->
    @@
    -<li><span class="property">encodingFormat ...</span>: application/json-ld</li>
    +<li><span class="property">encodingFormat ...</span>: application/json</li>            <!-- *.mseq_json.biom -->

    Apply to SSU and LSU blocks.


    251-271: .gz files are gzip, not zip.

    Change application/zip to application/gzip for all .gz entries.

    -<li><span class="property">encodingFormat ...</span>: application/zip</li>
    +<li><span class="property">encodingFormat ...</span>: application/gzip</li>

    Optionally also expose uncompressed type via a separate contentEncoding: gzip in the source metadata.

    Also applies to: 356-361
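
The media types suggested in the two comments above could be kept in a single lookup table so that they stay consistent across entries. The sketch below is illustrative only: `ENCODING_FORMATS` and `guessEncodingFormat` are hypothetical names, and the table covers only the extensions mentioned in this review. BIOM files are left out on purpose, because the review assigns them different types (HDF5 or JSON) that the final extension alone cannot distinguish.

```php
<?php
// Hypothetical lookup table keyed by lower-case file extension.
const ENCODING_FORMATS = [
    'tsv'  => 'text/tab-separated-values',
    'txt'  => 'text/plain',
    'html' => 'text/html',
    'gz'   => 'application/gzip',
    'zip'  => 'application/zip',
];

// Returns the media type for a file name, or null when the extension is unknown.
function guessEncodingFormat(string $fileName): ?string
{
    // pathinfo() returns only the last extension, so "reads.fastq.gz" yields "gz".
    $ext = strtolower(pathinfo($fileName, PATHINFO_EXTENSION));
    return ENCODING_FORMATS[$ext] ?? null;
}

echo guessEncodingFormat('readme_102736.txt'), PHP_EOL; // text/plain
echo guessEncodingFormat('reads.fastq.gz'), PHP_EOL;    // application/gzip
```

Returning null for unknown extensions leaves the caller free to keep whatever value the source metadata already holds.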


    395-397: Fix broken @type list and bad hasPart link; sanitize id.

    Keep @type values in one list item; link both hasPart items; remove leading ‘#’ from section id.

    -<section id="#workflow/metaGOflow">
    +<section id="workflow/metaGOflow">
    @@
    -<li><span class="property"> @type </span>: File</li>,  SoftwareSourceCode</li>,  ComputationalWorkflow</li>
    +<li><span class="property"> @type </span>: File, SoftwareSourceCode, ComputationalWorkflow</li>
    @@
    -<li><span class="property">hasPart ... </span>: <a href=#config.yml> MetaGOflow configuration file </a></li>,  TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</li>
    +<li><span class="property">hasPart ... </span>: <a href="#config.yml">MetaGOflow configuration file</a>, <a href="#ERR599171.yml">TARA_OCEANS_SAMPLE_3steps/ERR599171.yml</a></li>

    Update all internal links to use the sanitized id (href="#workflow/metaGOflow").
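
The stray `</li>` tags in the output above suggest that each value is being closed as its own list item. One way to avoid that is to join all values first and wrap the result once. This is a hedged sketch, not the generator's actual code: `renderPropertyItem` is a hypothetical function, and the `label` and `target` keys are an assumed shape for the values.

```php
<?php
// Hypothetical renderer: emits exactly one <li> per property, with all values inside it.
// Each value is ['label' => string] plus an optional 'target' holding a local @id.
function renderPropertyItem(string $property, array $values): string
{
    $parts = [];
    foreach ($values as $value) {
        $label = htmlspecialchars($value['label'], ENT_QUOTES);
        if (isset($value['target'])) {
            // Quote the href and strip the leading '#' before re-adding it once.
            $href = '#' . htmlspecialchars(ltrim($value['target'], '#'), ENT_QUOTES);
            $parts[] = '<a href="' . $href . '">' . $label . '</a>';
        } else {
            $parts[] = $label;
        }
    }

    return '<li><span class="property">' . htmlspecialchars($property, ENT_QUOTES)
        . '</span>: ' . implode(', ', $parts) . '</li>';
}

echo renderPropertyItem('@type', [
    ['label' => 'File'],
    ['label' => 'SoftwareSourceCode'],
    ['label' => 'ComputationalWorkflow'],
]), PHP_EOL;
```

Because the closing `</li>` is appended only after `implode()`, the separator can never land outside the list item.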

    🧹 Nitpick comments (1)
    resources/ro-crate-metadata.json (1)

    350-355: Question: type “URL” used as @type.

    schema.org URL is a DataType; consider using WebPage/WebSite or omit the node and reference the URL directly.

    Comment on lines +47 to +50
    ## HTML Rendering
    Use the following code to generate a human-readable HTML preview from the RO-Crate Metadata File.
    > ROCratePreviewGenerator::generatePreview(\_\_DIR\_\_ . '/../resources');


    🛠️ Refactor suggestion

    Render preview example as fenced code with namespace.

    Avoid blockquotes; show the fully qualified call.

    -Use the following code to generate a human-readable HTML preview from the RO-Crate Metadata File.
    -> ROCratePreviewGenerator::generatePreview(\_\_DIR\_\_ . '/../resources');
    +Use the following code to generate a human‑readable HTML preview from the RO‑Crate metadata file:
    +
    +```php
    +use ROCrate\ROCratePreviewGenerator;
    +
    +ROCratePreviewGenerator::generatePreview(__DIR__ . '/../resources');
    +```
    🤖 Prompt for AI Agents
    In README.md around lines 47 to 50, replace the blockquote example with a fenced
    PHP code block that shows the namespace import and the fully qualified call;
    specifically, present a short PHP snippet starting with the appropriate use
    statement for ROCrate\ROCratePreviewGenerator followed by the generatePreview
    call using __DIR__ . '/../resources', ensuring it is inside triple backticks
    with "php" language tag.
    

    ],
    "name": "readme_102736.txt",
    "contentSize": "9.30 kB",
    "encodingFormat": "text/txt",

    ⚠️ Potential issue

    Fix encodingFormat for README text file.

    Use the registered MIME type text/plain.

    -            "encodingFormat": "text/txt",
    +            "encodingFormat": "text/plain",
    🤖 Prompt for AI Agents
    In resources/ro-crate-metadata.json around line 146, the README file's
    encodingFormat is set to the nonstandard "text/txt"; replace that value with the
    registered MIME type "text/plain" so the metadata uses the correct
    encodingFormat.
    

    @Alex-1919 Alex-1919 requested a review from kencho51 September 15, 2025 06:00

    Projects

    Status: Changes Required

    3 participants