Drift correction#58

Merged
enryH merged 61 commits into main from drift-correction
May 26, 2026

Conversation

@feliciaschulz feliciaschulz commented Feb 9, 2026

This pull request introduces a new API usage example for metabolomics data filtering and makes minor updates to the documentation and existing notebooks. The main addition is a comprehensive Jupyter notebook (and paired Python script) demonstrating how to filter metabolomics data using the acore package. There are also small metadata and index updates.

New API Example: Metabolomics Data Filtering

  • Added a new example notebook filter_metabolomics.ipynb and paired script filter_metabolomics.py demonstrating:
    • Loading and inspecting metabolomics data
    • Converting categorical columns to numeric types for analysis
    • Filtering features based on retention time (RT) and m/z value criteria
    • Summarizing and inspecting removed features after filtering
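
The steps listed above can be sketched with plain pandas. This is a minimal illustration under stated assumptions, not the code from filter_metabolomics.py and not the acore API: the column names (`feature_id`, `rt_min`, `mz`), the toy values, and the RT and m/z thresholds are all hypothetical.

```python
import pandas as pd

# Hypothetical feature table; column names and values are assumptions.
features = pd.DataFrame(
    {
        "feature_id": ["F1", "F2", "F3", "F4"],
        "rt_min": ["0.4", "2.5", "7.9", "12.3"],
        "mz": ["89.02", "180.06", "455.29", "1250.80"],
    }
).astype({"rt_min": "category", "mz": "category"})

# Convert categorical columns to numeric; unparseable entries become NaN.
for col in ["rt_min", "mz"]:
    features[col] = pd.to_numeric(features[col].astype(str), errors="coerce")

# Illustrative acceptance windows (assumed, not taken from the PR).
rt_range = (0.5, 10.0)      # minutes
mz_range = (100.0, 1000.0)  # m/z

rt_ok = features["rt_min"].between(*rt_range)
mz_ok = features["mz"].between(*mz_range)

filtered = features[rt_ok & mz_ok]
removed = features[~(rt_ok & mz_ok)]

# Count how many features fail each criterion.
summary = pd.DataFrame(
    {"rt_out_of_range": ~rt_ok, "mz_out_of_range": ~mz_ok}
).sum()

print(filtered)
print(removed)
print(summary)
```

Keeping the `removed` table alongside `filtered` makes it easy to check whether the thresholds discard features that should have been retained.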

Documentation and Index Updates

  • Updated index.md to include the new filtering example and another drift correction example in the API usage section.
  • Minor metadata changes in normalization_analysis.py (Jupytext version update).
  • Updated cell IDs in batch_correction.ipynb for consistency; no content changes.

These updates expand the documentation with a practical filtering workflow for metabolomics data and ensure the API examples are easily discoverable.

Summary

List of changes proposed in this PR (pull-request)

  • Added API examples
  • LOESS- and PCA-based drift correction
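
For readers unfamiliar with LOESS-based drift correction, the idea can be sketched as follows. This is an illustrative sketch under assumptions, not the implementation merged in this PR: the helper `loess_fit`, the synthetic data, and the QC-based scheme (fit a smooth trend to pooled QC injections over injection order, then divide it out) are chosen here for demonstration. The smoother is a plain local linear regression with tricube weights and no robustness iterations. The PCA-based variant is not shown.

```python
import numpy as np


def loess_fit(x, y, x_eval, frac=0.75):
    """Local linear regression with tricube weights (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    # Number of neighbours used in each local fit.
    k = min(len(x), max(3, int(np.ceil(frac * len(x)))))
    fitted = np.empty(len(x_eval))
    for i, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        h = np.sort(dist)[k - 1]  # bandwidth: distance to the k-th nearest point
        h = h if h > 0 else 1.0
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3  # tricube weights
        # Weighted least squares for intercept and slope, centred at x0.
        design = np.column_stack([np.ones_like(x), x - x0])
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)
        fitted[i] = beta[0]  # the intercept is the fitted value at x0
    return fitted


# Synthetic run: every injection has the same true intensity (100),
# so any trend over injection order is pure drift.
order = np.arange(21)
is_qc = order % 5 == 0            # assume every fifth injection is a pooled QC
drift = 1.0 + 0.02 * order        # linear signal drift
intensity = 100.0 * drift

# Fit the trend on QC injections only, evaluate it for all injections.
trend = loess_fit(order[is_qc], intensity[is_qc], order, frac=1.0)

# Divide out the trend and rescale to the median QC intensity.
corrected = intensity / trend * np.median(intensity[is_qc])

print(np.round(corrected, 2))
```

On real data the trend is fitted per feature, and the span (`frac`) controls how closely the curve follows the QC injections; too small a span can overfit noise in the QCs.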

Checks

feliciaschulz and others added 21 commits September 29, 2025 14:59
…29)

* 🚧 start collecting dataframe schemas with enrichment analysis result

* ✨ ensure valid numeric omics data functionality

* 🚧 add schema validation to examples

- can be added to function call using `pa.check_types`

* 🐛 add missing dependency

* 🐛 bool and boolean with NANs are not the same for pandera.

* 🐛 fix tests (convert dtypes expect bool)

* 🚧 Add and align differential analysis schema

- ANOVA descriptive statistics seems to have errors
- should missing values for p-values be allowed?

* 🐛 leave strings as they are

* 🚧 build schema on the fly for numeric data

- could be used to give specific feedback on user provided columns

* 🎨 add pandera to mapped types of other libraries

* 🚧 add Schemas to functions

- anova has two return types depending on number of groups...

* 🚧 add two schemas to anova function

Schemas should be unified eventually

* 📝 all commands for local execution of docs

... as integration test

* 🐛 remove testing shell script

to find errors in action runs (for automation)

* ✨ construct types for exploratory analysis

- create a separate PR to refactor exploratory analysis module

* 🎨 add Sebastian's hint on BaseModel visualization

* 🎨 clean-up types

- formatting was not done properly in docs
- not used as a composite type as of now
* ✨ custom PR template for new module

* 🎨 highlight only folder name

* 📝 propose to use built-in virtual environment

* 📝 some more design hints

* 📝 any docstring type for functions is fine

* 🎨 remove intermediate heading

* 📝 add hint on using example data for api examples

* Update .github/workflows/PULL_REQUEST_TEMPLATE/module.md

---------

Co-authored-by: feliciaschulz <112621625+feliciaschulz@users.noreply.github.com>
* 🔧 remove constraint on numpy as latest inmoose does not require it anymore

* 🔧 align github actions to python package template

... as for other libraries

* 🎨 format toml with even better toml

* 🔧 isort and ruff configuration

- turn ruff configuration on in a separate PR

* 🎨 format src folder

* 🎨 apply ruff check autofixes

* 🔥 remove tox artefacts

* 🚚 rename actions yaml file

* 🎨 isort tests
* 🚚 separate batch correction from normalization

- fix Move batch correction outside Normalisation 💄
Fixes #22

* 🐛 sync normalization nb and update test imports

* 🎨 format and remove unused import

* 🐛 sync nb

* 🐛 add batch correction to website index

* 🎨 link module api
* 🎨 move from rst to md

* 🚧 Close clean up conf.py
Fixes #21

- remove duplication
- sort code as in other packages
- double check if additional entries were needed
* 🎨 consistent scaling and naming

- apply standardization before PCA (leads to minor change in axis)
- talk about batch correction, not normalization

* 🎨 shorten titles

* 🎨 order more meaningfully
* 📝 add Contributing.md as reference to PR template

* 🐛 make it a URL
* 🚧 process metabolomics example data

* 🎨 format

* ✨ tested data with ANOVA, finish processing of data
@feliciaschulz feliciaschulz requested a review from enryH February 9, 2026 11:35
@feliciaschulz feliciaschulz marked this pull request as ready for review May 4, 2026 12:45
feliciaschulz and others added 19 commits May 12, 2026 10:33
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…ving temp name column before returning df, raising ValueError with wrong QC threshold
- hide long source cells
- few typos found by copilot
- split long lines using rewrap tool
… Minor other changes in documentation of notebook.
@feliciaschulz feliciaschulz requested a review from enryH May 22, 2026 14:37
@enryH enryH mentioned this pull request May 26, 2026
4 tasks

@enryH enryH left a comment

I added a new issue #70 with some of the remaining topics for a new PR

@enryH enryH merged commit e334fb1 into main May 26, 2026
11 checks passed
@enryH enryH deleted the drift-correction branch May 26, 2026 10:38

3 participants