Contributing to DataLab-Web

Thanks for your interest in contributing! This document focuses on the project-specific rules every contributor must follow. For build, test and release commands, see the README.

Generative AI policy

DataLab-Web is funded under an NLnet grant and complies with the NLnet policy on the use of Generative AI. This section states how Generative AI (GenAI) is used in the project and what contributors are expected to do.

Principles

  • Humans remain solely responsible for high-level code review, architectural decisions, scientific validation of the Sigima algorithms, and any structural choice that shapes the long-term direction of the project. GenAI does not make these decisions and does not relieve the human author of making them.
  • GenAI may assist with auxiliary tasks where the cost of human review remains low and the output is easy to verify: code exploration, boilerplate, first drafts of tests and documentation, comparing implementation variants, porting Qt-era patterns to React / TypeScript, and log analysis.
  • Mandatory human review. No AI-generated content is committed without being read, understood and validated by a human author. The author of a commit is fully accountable for its contents, regardless of whether a model contributed.
  • No purely AI-generated work. As required by the NLnet policy, contributions consisting solely of AI-generated output without substantial human intellectual contribution are not accepted.
  • License compatibility. Any GenAI output integrated into the project must be compatible with the project's BSD 3-Clause license. Contributors must check the terms of use of the model they invoke and ensure outputs do not reproduce copyrighted or incompatible material.

Commit convention for AI-assisted contributions

Whenever a commit contains code (or other material) that was wholly or partially generated by a model, append an Assisted-by: trailer to the commit message identifying the model and its version:

fix(plot): clamp colormap range to finite values

Avoid NaN/Inf entries breaking Plotly's heatmap autoscale by filtering
the input array before computing the percentile bounds.

Assisted-by: Claude Opus 4.7

Multiple models can be listed with one trailer per line:

Assisted-by: Claude Opus 4.7
Assisted-by: GPT-5

The trailer is the only mandatory provenance marker. A more detailed log (prompts, raw outputs) is welcome for substantial contributions but is not required: the goal is to keep traceability lightweight enough that it is actually applied. Contributors must, however, be ready to answer questions about the GenAI use behind any commit they authored.

Commits without the trailer are deemed to contain no AI-assisted content.
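For tooling that needs to read the provenance marker, the convention above can be sketched as follows. This is a minimal illustration, not a script shipped with DataLab-Web: the function name parseAssistedBy is hypothetical, and it assumes trailers sit in the last paragraph of the message, as Git trailers usually do.

```typescript
// Hypothetical helper (not part of DataLab-Web): lists the models named in
// "Assisted-by:" trailers of a commit message.
function parseAssistedBy(message: string): string[] {
  // Assumption: trailers form the last blank-line-separated paragraph.
  const paragraphs = message.trim().split(/\n\s*\n/);
  const last = paragraphs[paragraphs.length - 1] ?? "";
  const models: string[] = [];
  for (const line of last.split("\n")) {
    // One trailer per line: "Assisted-by: <Model> <Version>".
    const model = /^Assisted-by:\s*(\S.*)$/.exec(line.trim())?.[1];
    if (model !== undefined) {
      models.push(model.trim());
    }
  }
  return models;
}
```

An empty result corresponds to the rule above: a commit without the trailer is deemed to contain no AI-assisted content.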

Scope

This policy covers everything in the project repository: source code, tests, configuration, documentation and assets. It does not cover the runtime features of DataLab-Web that interact with LLMs (e.g. the in-app AI assistant under src/aiassistant) — those are product features whose behaviour is documented separately.

Commit messages

  • Use a Conventional Commits subject (feat:, fix:, refactor:, docs:, chore:, …) of at most 72 characters.
  • Keep messages short. Most commits in this repo have no body. When a body is needed, limit it to 5 lines and use it only to convey the why, a measurable impact (numbers, perf gains, suite duration) or a non-obvious design constraint. Never use it to enumerate files or restate what the diff already shows.
  • Append the Assisted-by: <Model> <Version> trailer for AI-assisted commits (see above).
  • Group related changes in a single commit; split unrelated changes.
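The subject and body rules above lend themselves to a mechanical check. The sketch below is an illustration only: checkCommitMessage and its constants are hypothetical names, any lowercase type prefix is accepted because the list of types in the rule is open-ended, and trailer detection is deliberately simplified.

```typescript
// Hypothetical checker (not a DataLab-Web script) for the rules listed above.
const MAX_SUBJECT = 72;
const MAX_BODY_LINES = 5;
// "type: description" or "type(scope): description"; any lowercase type passes.
const SUBJECT_PATTERN = /^[a-z]+(\([^)]+\))?!?: \S.*$/;
// Simplification: any "Token: value" line is treated as a trailer, so a body
// line that happens to look like one is not counted.
const TRAILER_PATTERN = /^[A-Za-z][A-Za-z-]*: \S/;

function checkCommitMessage(message: string): string[] {
  const problems: string[] = [];
  const lines = message.trim().split(/\r?\n/);
  const subject = lines[0] ?? "";
  if (!SUBJECT_PATTERN.test(subject)) {
    problems.push("subject is not in Conventional Commits form");
  }
  if (subject.length > MAX_SUBJECT) {
    problems.push(`subject exceeds ${MAX_SUBJECT} characters`);
  }
  // Body lines: non-empty lines after the subject that are not trailers.
  const bodyLines = lines
    .slice(1)
    .filter((line) => line.trim() !== "" && !TRAILER_PATTERN.test(line));
  if (bodyLines.length > MAX_BODY_LINES) {
    problems.push(`body exceeds ${MAX_BODY_LINES} lines`);
  }
  return problems;
}
```

The function returns a list of problems rather than throwing, so a caller could print every violation at once.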

Code formatting

TypeScript, JavaScript, JSON, CSS, HTML and Markdown files are formatted with Prettier (defaults of Prettier 3, plus endOfLine: "auto"). Python files (under src/runtime/, tests/python/, scripts/) are formatted and linted with Ruff, at the same version as the sibling repos (DataLab, Sigima, …).

The recommended workflow:

  • Install the workspace's recommended VS Code extensions (Prettier, ESLint, Ruff) — VS Code will prompt you on first open thanks to .vscode/extensions.json. Format-on-save is then wired per language in .vscode/settings.json.

  • Install the Git pre-commit hooks once after cloning. They run Ruff (Python) and Prettier (everything else) on the staged files, matching what CI checks:

    pip install -r requirements-dev.txt   # provides the `pre-commit` CLI
    pre-commit install                    # wires .pre-commit-config.yaml into .git/hooks

    The hook uses the local node_modules/.bin/prettier, so make sure npm install has been run first. Skipping the hook with git commit --no-verify is tolerated for emergency fixes but CI will still reject unformatted files.

  • Reformat the whole repo at any time with npm run format (TS/JS/CSS/HTML/MD/JSON) or pre-commit run --all-files (Python + everything else).

  • CI runs npm run format:check and npm run lint as blocking steps (see tests.yml) and npm run release:pack runs them before lint/test/build, so any drift is caught early.

Internationalisation

DataLab-Web is fully internationalised: English is the source language and French is the first translated locale. The framework, the user-facing behaviour and the full contributor workflow are documented in the Internationalisation section of README.md. In short:

  • Wrap every new user-facing string in t("…") (from src/i18n/translate); use t("…{x}…", { x }) for interpolation. Never translate brand names or AI-assistant system prompts.
  • Run npm run i18n:extract to merge new keys into src/locales/fr.json, then fill in the translations.
  • Run npm run i18n:check to verify that no keys are missing or empty; running it in CI as well is recommended.
  • Keys referenced only through a variable must be listed in src/locales/_dynamic-keys.json.
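To illustrate the interpolation form t("…{x}…", { x }), here is a self-contained stand-in. It is not the project's t() from src/i18n/translate, whose implementation may differ. The makeTranslator factory, the sample catalog entry, the use of the English source string as the key, and the fallback to that string when no translation exists are all assumptions made for the example.

```typescript
// Stand-in for illustration only; the real t() lives in src/i18n/translate.
type Params = Record<string, string | number>;

function makeTranslator(catalog: Record<string, string>) {
  return function t(key: string, params: Params = {}): string {
    // Assumption: the English source string is the key and also the fallback
    // when the catalog has no (or an empty) translation.
    const template = catalog[key] || key;
    // Replace each {name} placeholder; unknown names are left untouched.
    return template.replace(/\{(\w+)\}/g, (placeholder: string, name: string) =>
      Object.prototype.hasOwnProperty.call(params, name)
        ? String(params[name])
        : placeholder,
    );
  };
}

// Hypothetical catalog entry, in the spirit of src/locales/fr.json.
const t = makeTranslator({
  "Loaded {count} signals": "{count} signaux chargés",
});

const count = 3;
const label = t("Loaded {count} signals", { count });
```

Keeping the placeholder inside the translated string, rather than concatenating fragments, lets each locale place the value wherever its word order requires.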

Temporary shims

DataLab-Web sometimes backports a feature or patches a bug that is fixed upstream (guidata, sigima, …) but not yet in a released wheel. These temporary shims are tracked centrally so they can be audited and removed once upstream catches up. Every backport shim is declared once in src/runtime/shims/registry.ts, carries # TEMPORARY SHIM / @shim-registry: <id> markers in its source, and is kept in sync by a network-free anti-drift test that runs in npm test. Run npm run audit:shims (or the 🔍 Audit shims (versions) task) to see which shims are now removable. Full workflow — adding, registering and removing a shim — is in doc/shim-registry.md.
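One way to picture the anti-drift idea is to compare the ids declared in the registry with the @shim-registry: <id> markers found in source files. The sketch below is purely illustrative: the function names, the data shapes, and the characters allowed in an id are assumptions, and the project's actual test and the layout of src/runtime/shims/registry.ts may look quite different.

```typescript
// Hypothetical sketch; names and data shapes are not taken from DataLab-Web.

// Collects the ids of all "@shim-registry: <id>" markers in one source text.
// Assumption: ids consist of letters, digits, "_", "." and "-".
function extractShimIds(source: string): string[] {
  const pattern = /@shim-registry:\s*([A-Za-z0-9_.-]+)/g;
  const ids: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const id = match[1];
    if (id) {
      ids.push(id);
    }
  }
  return ids;
}

// Reports markers without a registry entry and registry entries without a marker.
function findShimDrift(
  registeredIds: string[],
  sources: string[],
): { unregistered: string[]; unmarked: string[] } {
  const marked: string[] = [];
  for (const source of sources) {
    marked.push(...extractShimIds(source));
  }
  return {
    unregistered: marked.filter((id) => registeredIds.indexOf(id) === -1),
    unmarked: registeredIds.filter((id) => marked.indexOf(id) === -1),
  };
}
```

Because the check only reads local text, it needs no network access, in keeping with the network-free test described above.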

Pull requests

  • Open a pull request against main.
  • Make sure the project still builds (npm run build), passes lint (npm run lint), formatting (npm run format:check) and tests (npm test, plus Playwright when UI behaviour changes — see doc/testing-strategy.md).
  • When you add or restructure a top-level subsystem (a new worker, a new bridge, a new Python helper module …), update doc/architecture.md in the same commit so the layer / component diagrams stay accurate.
  • User-visible changes: add a bullet under the [Unreleased] section of CHANGELOG.md, in the appropriate Added / Changed / Fixed / Removed group. The release script promotes that section to a versioned heading at tag time (and refuses to release if it is empty, unless --allow-empty-changelog is passed for tag-only / infrastructure releases). The in-app Help > Release notes dialog renders the file directly.
  • Keep commits focused; split unrelated changes.
  • Apply the GenAI commit convention above where relevant.
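Before opening a pull request with user-visible changes, it can help to confirm that the [Unreleased] section actually contains a bullet. The following is a rough sketch, not the project's release script: unreleasedBullets is a hypothetical name, and it assumes a Keep a Changelog style layout in which [Unreleased] and each version are second-level "## " headings and the Added / Changed / Fixed / Removed groups are "### " headings.

```typescript
// Hypothetical helper (not the release script): returns the bullets found
// under the [Unreleased] heading of a changelog text.
function unreleasedBullets(changelog: string): string[] {
  const lines = changelog.split(/\r?\n/);
  // Assumption: the section starts at a "## [Unreleased]" heading.
  const start = lines.findIndex((line) => /^##\s*\[Unreleased\]/i.test(line));
  if (start === -1) {
    return [];
  }
  const bullets: string[] = [];
  for (const line of lines.slice(start + 1)) {
    // The next "## " heading (a released version) ends the section;
    // "### " group headings do not match and are skipped.
    if (/^##\s/.test(line)) {
      break;
    }
    if (/^\s*[-*]\s+\S/.test(line)) {
      bullets.push(line.trim());
    }
  }
  return bullets;
}
```

An empty result corresponds to the case in which the release script would refuse to release unless --allow-empty-changelog is passed.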