add cellex #118

Open
leoxnard wants to merge 1 commit into theislab:main from leoxnard:cellex
@leoxnard

No description provided.

Copilot AI review requested due to automatic review settings May 10, 2026 17:53
@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Copilot AI left a comment

Pull request overview

This PR adds CELLEX/CELLECT support to cellink.tl.external by introducing a new CELLEX runner for generating ESμ specificity matrices from AnnData, and a CELLECT-style linear model prioritization step that combines ESμ with MAGMA gene results. It also includes a tutorial notebook demonstrating the end-to-end workflow and a small MAGMA path-handling tweak.

Changes:

  • Add run_cellex and run_cellect_prioritization in a new src/cellink/tl/external/_cellex.py module.
  • Export the new APIs from cellink.tl.external.
  • Add a CELLEX/CELLECT tutorial notebook and resolve the MAGMA gene results output path.
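
To make the CELLECT-style step in the overview concrete, here is a minimal sketch of regressing a gene-level association score on ESμ for a single cell type. It is not the PR's implementation: the helper name `fit_simple_ols`, the toy gene names, and the assumption that the response is a MAGMA gene Z-score are all illustrative. A single-predictor model with an intercept is assumed because the reviewed code reads the second coefficient (`params[1]`).

```python
import math


def fit_simple_ols(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (slope, standard error of the slope, t-statistic).
    Hypothetical helper for illustration only.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, se, t_stat


# Toy inputs (assumed values): ESmu for one cell type and gene-level Z-scores.
esmu = {"GENE_A": 0.0, "GENE_B": 0.2, "GENE_C": 0.5, "GENE_D": 0.8, "GENE_E": 1.0}
magma_z = {"GENE_A": -0.3, "GENE_B": 0.1, "GENE_C": 0.9, "GENE_D": 1.4, "GENE_F": 2.0}

# Only genes present in both tables can enter the regression.
shared = sorted(set(esmu) & set(magma_z))
slope, se, t_stat = fit_simple_ols(
    [esmu[g] for g in shared],
    [magma_z[g] for g in shared],
)
print(f"genes used: {shared}")
print(f"slope={slope:.3f} se={se:.3f} t={t_stat:.3f}")
```

A positive slope here would mean that genes more specific to the cell type tend to carry stronger association signal, which is the direction a prioritization step of this kind tests for.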

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| src/cellink/tl/external/_magma.py | Removes an unused top-level import and resolves MAGMA .genes.out output path. |
| src/cellink/tl/external/_cellex.py | Implements CELLEX ESμ generation and CELLECT-style regression prioritization. |
| src/cellink/tl/external/__init__.py | Re-exports the new CELLEX/CELLECT functions. |
| docs/tutorials/cellex.ipynb | Adds a tutorial walking through MAGMA → CELLEX → CELLECT prioritization. |

Comment on lines +1 to +13
import logging
import subprocess
from pathlib import Path
from typing import Literal, Union, Tuple

import numpy as np
import pandas as pd
from anndata import AnnData
import scanpy as sc
import statsmodels.api as sm
import statsmodels.tools.tools as sm_tools
import os

Comment on lines +55 to +63
prefix : str, optional
    Prefix for output files. Default is "cellex".

Returns
-------
pd.DataFrame
    ESmu DataFrame with genes as rows and cell types as columns.
    Values in [0, 1] representing expression specificity.

Comment on lines +103 to +107
logger.info("Filtering cells and genes")
sc.pp.filter_cells(adata, min_genes=min_genes)
sc.pp.filter_genes(adata, min_cells=min_cells)

adata = adata[~adata.obs[cell_type_col].isna()].copy()
Comment on lines +112 to +121
    expr_matrix = adata.X.toarray()
else:
    expr_matrix = adata.X

data = pd.DataFrame(
    expr_matrix.T,
    index=adata.var_names,
    columns=adata.obs_names,
)
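
The lines above turn a cells × genes matrix into a genes × cells DataFrame. A minimal standalone sketch of the same idea follows, assuming the input may be either a SciPy sparse matrix or a dense array. The helper name `to_gene_by_cell_frame` and the toy data are hypothetical and do not come from the PR.

```python
import numpy as np
import pandas as pd
from scipy import sparse


def to_gene_by_cell_frame(X, gene_names, cell_names):
    """Return a genes x cells DataFrame from a cells x genes matrix.

    Hypothetical helper: densifies sparse input, then transposes.
    """
    dense = X.toarray() if sparse.issparse(X) else np.asarray(X)
    if dense.shape != (len(cell_names), len(gene_names)):
        raise ValueError("matrix shape does not match the supplied names")
    return pd.DataFrame(dense.T, index=gene_names, columns=cell_names)


# Toy data: 2 cells x 3 genes, stored sparsely.
counts = sparse.csr_matrix(np.array([[1, 0, 3], [0, 2, 0]]))
frame = to_gene_by_cell_frame(counts, ["g1", "g2", "g3"], ["c1", "c2"])
print(frame)
```

Densifying allocates the full cells × genes array in memory, so on large datasets it is worth doing this only after cell and gene filtering.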

logger.info("Mapping mouse Ensembl IDs to human Ensembl IDs")
cellex.utils.mapping.mouse_ens_to_human_ens(esmu, drop_unmapped=True, verbose=True)

out_file = f"{prefix}.esmu.csv"
Comment on lines +157 to +164

pval = ols_result.pvalues[1] / 2
pval = 1 - pval if ols_result.params[1] < 0 else pval

results.append({
    "Name": f"{specificity_id}__{annotation}",
    "Coefficient": ols_result.params[1],
    "Coefficient_std_error": ols_result.bse[1],
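
The two `pval` lines in this range convert a two-sided p-value into a one-sided one for the alternative "coefficient > 0". The same arithmetic is isolated below as a small sketch. The function name `one_sided_pvalue` is hypothetical, and the conversion assumes a test statistic that is symmetric around zero, as the t-statistic of an OLS coefficient is.

```python
def one_sided_pvalue(two_sided_p: float, coefficient: float) -> float:
    """Convert a two-sided p-value to a one-sided p-value for H1: coefficient > 0.

    Hypothetical helper mirroring the quoted logic: halve the two-sided value,
    then take the complement when the estimate points in the negative direction.
    """
    if not 0.0 <= two_sided_p <= 1.0:
        raise ValueError("two_sided_p must lie in [0, 1]")
    half = two_sided_p / 2.0
    return 1.0 - half if coefficient < 0 else half


# A positive estimate keeps the small tail; a negative one gets the large tail.
print(one_sided_pvalue(0.04, 1.5))
print(one_sided_pvalue(0.04, -1.5))
```

With this convention a strongly negative coefficient yields a p-value close to 1, so only positive associations can reach significance.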
return out_file


def fit_LM(specificity_id: str, es_mu: pd.DataFrame, df_magma: pd.DataFrame) -> pd.DataFrame:
Comment on lines +188 to +190
output_prefix : str
    Prefix for output files. The final results will be saved as {output_prefix}_cellect_results.txt.
specificity_id : str, default="cellex"
Comment on lines 18 to 20
from ._magma import run_magma_pipeline
from ._cellex import run_cellex, run_cellect_prioritization
