Skip to content

annotate needs a num_chromosomes argument #410

@petrelharp

Description

@petrelharp

As pointed out on SLiM-discuss, right now using annotate for a multichromosome set-up with more than 8 chromosomes fails when reading into SLiM with the error

ERROR (Species::__TabulateSubpopulationsFromTreeSequence): unexpected node metadata length; this file cannot be read.

The reason is that the node metadata length increases with every 8 chromosomes, because of how vacancy is recorded.

The problem is that annotate_tables calls set_metadata_schema, which accepts the num_chromosomes arguments... but annotate_tables itself does not.

Here is a workaround, that demonstrates setting the top-level metadata
and node table metadata appropriately for a multichromosome simulation
(warning: this is NOT tested beyond verifying it runs!):

import msprime, pyslim, tskit

num_chroms = 10

big_ts = msprime.sim_ancestry(1, sequence_length=num_chroms)

tslist = [big_ts.keep_intervals([[k, k+1]]).trim() for k in range(num_chroms)]

top_md = pyslim.default_slim_metadata('tree_sequence')['SLiM']
chrom_md = top_md['this_chromosome']
chrom_md_list = []
for k in range(num_chroms):
    chrom_md.update({'id': k+1, 'index': k})
    chrom_md_list.append(chrom_md)

for k in range(num_chroms):
    ts = tslist[k]
    t = tslist[0].dump_tables()
    pyslim.annotate_tables(t, model_type='nonWF', tick=1)

    md = t.metadata
    md['SLiM']['chromosomes'] = chrom_md_list
    md['SLiM']['this_chromosome'] = chrom_md_list[k]
    t.metadata = md

    t.nodes.clear()
    t.nodes.metadata_schema = pyslim.slim_node_metadata_schema(num_chroms)
    for n in ts.nodes():
        md = pyslim.default_slim_metadata('node', num_chromosomes=num_chroms)
        t.nodes.append(n.replace(metadata=md))

    tslist[k] = t.tree_sequence()

# next, write out the tree sequences

This was actually kinda intentional: we have thus far provided no support for writing out multichromosome trees archives from pyslim, but the lack of the argument to annotate makes it extra hard.

Note: The "metadata" page's description for nodes in the docs needs to be updated as well.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions