Segmentation can be a little too aggressive

Consider a single array from Agilent Cytogenomics. I preprocessed it with two different methods:
1. With `rCGH`
2. With `limma`, following the instructions in the first part of the `cghMCR` vignette

The two procedures produce vastly different numbers of segments after normalization and preprocessing: 1. yields 89 segments, while 2. yields approximately 300.

The problem shows up when you have specific regions or genes to check. Even with GC correction and after making sure the right peak is used for EM normalization, the log2 ratios from 1. are higher than those from 2. by at least 1 on the log2 scale, i.e. at least a twofold difference in the ratio. This can lead to bogus copy number estimates: validation showed a much lower copy number (between 3 and 4, closer to the estimate from 2., whereas 1. gave almost 10).
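
To illustrate why a shift of 1 on the log2 scale matters, here is a minimal sketch of the usual log2-ratio-to-copy-number conversion. It assumes a pure sample measured against a diploid reference; the function name `copy_number` and the input values 0.8 and 1.8 are hypothetical and are not taken from the array discussed here.

```python
def copy_number(log2_ratio: float, ploidy: int = 2) -> float:
    """Estimate absolute copy number from a log2 ratio.

    Assumes a pure (uncontaminated) sample and a reference with the
    given ploidy, so that ratio = copy_number / ploidy.
    """
    return ploidy * 2 ** log2_ratio


# Hypothetical log2 ratios that differ by exactly 1 on the log2 scale.
for log2_ratio in (0.8, 1.8):
    print(f"log2 ratio {log2_ratio:.1f} -> copy number {copy_number(log2_ratio):.2f}")
```

Under these assumptions, adding 1 to the log2 ratio doubles the estimated copy number, so a gain that validates at 3 to 4 copies can easily be reported as 7 or more.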

The main reason is that there are far fewer segments in 1. than in 2., which skews the calculations. Setting the distance for joining segments to 0 (from the 10 kbp default) doesn't improve the situation.

