PCA behavior #367

@chasemc

Would it be okay to switch:

    if n_components > pca_dimensions and pca_dimensions != 0:
        logger.debug(
            f"Performing decomposition with PCA (seed {seed}): {n_components} to {pca_dimensions} dims"
        )
        X = PCA(n_components=pca_dimensions, random_state=random_state).fit_transform(X)
        # X = PCA(n_components='mle').fit_transform(X)
        n_samples, n_components = X.shape

to the following, so that it adapts to a lower PCA dimension when there aren't enough contigs/k-mers?

    if n_components > pca_dimensions and pca_dimensions != 0:
        if n_samples < pca_dimensions:
            # Too few samples for the requested number of PCA dimensions
            max_dimensions = min(n_samples, n_components)
            logger.warning(
                f"n_samples ({n_samples}) is less than pca_dimensions ({pca_dimensions}), "
                f"lowering pca_dimensions to {max_dimensions}."
            )
            pca_dimensions = max_dimensions
        logger.debug(
            f"Performing decomposition with PCA (seed {seed}): {n_components} to {pca_dimensions} dims"
        )
        X = PCA(n_components=pca_dimensions, random_state=random_state).fit_transform(X)
        n_samples, n_components = X.shape
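
For discussion, the clamping logic could also be pulled into a small helper so it can be checked without running PCA. This is only a sketch: `effective_pca_dimensions` is a hypothetical name, not something in the codebase. It assumes, as in the snippet above, that `n_components` means the number of feature columns in `X`, and that scikit-learn's default PCA solver cannot return more components than `min(n_samples, n_features)`.

```python
import logging

logger = logging.getLogger(__name__)


def effective_pca_dimensions(n_samples: int, n_features: int, pca_dimensions: int) -> int:
    """Return the number of PCA components to request, or 0 to skip PCA.

    Hypothetical helper that mirrors the proposed condition:
    PCA only runs when n_features > pca_dimensions and pca_dimensions != 0.
    """
    if pca_dimensions == 0 or n_features <= pca_dimensions:
        # Same as the original guard failing: no decomposition is performed.
        return 0
    # Assumed upper bound on the number of components PCA can return.
    limit = min(n_samples, n_features)
    if pca_dimensions > limit:
        logger.warning(
            f"n_samples ({n_samples}) is less than pca_dimensions ({pca_dimensions}), "
            f"lowering pca_dimensions to {limit}."
        )
        return limit
    return pca_dimensions
```

With this, the call site would only need to run PCA when the returned value is non-zero, e.g. 10 contigs with 100 k-mer features and `pca_dimensions=50` would request 10 components instead of failing.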
