Collect, aggregate, and visualize a data ecosystem's metadata
Official Repository of "LLM × DATA" Survey Paper
The official repository for the AiiDA code
Relational Workflows: where database schemas define executable data pipelines.
MetacatUI: A client-side web interface for DataONE data repositories
A curated list of awesome stuff around the FAIR principles for (scientific) data, i.e., that data is findable, accessible, interoperable, and reusable.
ICDE 2025 Paper, Grounding Natural Language to SQL Translation with Data-Based Self-Explanations
Turn Claude Code into an interactive scientific workspace: workflow, file system, and knowledge graph.
Graphical user interface for dtool and dserver written in Python and GTK3.
A React component to explore AiiDA provenance
An experimental framework for data provenance in the IoT based on smart contracts.
A self-hosted platform for orchestrating bioinformatics pipelines, managing experimental metadata, and running reproducible compute workloads.
Scientific Workflow Management Tool
Pachyderm pipeline example that automatically updates with GitHub Actions
Run Observation & Artifact Registration
A traceable data collection and delivery tool for research, competition, and paper-writing scenarios.
This project has moved. The latest version is tracked on GitHub at https://github.com/bioAF/bioAF
(ACL 2026 Main) LLMSurgeon recovers the pretraining data mixture of any LLM from only its generated text — no weights, no training data. A calibrated domain classifier plus label-shift correction de-blurs biased predictions. Ships with LLMScan, a benchmark on 8 open-source LLMs.