Tree Similarity
Simone Ciccolella edited this page Jun 25, 2020
NOTE: To replicate this code, clone the repository; the trees used here are located in mp3treesim/examples/trees.
$ mp3treesim examples/trees/tree10.gv examples/trees/tree3.gv
> 0.02347746030469402
A Jupyter Notebook of this file is available at mp3treesim/examples/similarity.ipynb.
import mp3treesim as mp3
gv = '''
strict digraph G {
graph [name=G];
0 [label=root];
1 [label="C,M"];
0 -> 1;
2 [label="I,Q"];
0 -> 2;
3 [label=A];
0 -> 3;
1 -> 4;
2 -> 5;
6 [label=G];
3 -> 6;
7 [label="E,R"];
3 -> 7;
8 [label=F];
3 -> 8;
9 [label="L,N"];
4 -> 9;
10 [label="H,K"];
4 -> 10;
11 [label=K];
5 -> 11;
}
'''
tree1 = mp3.read_dotfile('trees/tree1.gv')
tree2 = mp3.read_dotstring(gv)
mp3.draw_tree(tree1)
mp3.draw_tree(tree2)
print(mp3.similarity(tree1, tree2))
0.05009996058703979
Let's consider the tree described in the gv string as a partially-labeled tree:
tree2_p = mp3.read_dotstring(gv, labeled_only=True)
mp3.draw_tree(tree2_p)
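To see which nodes make this tree partially labeled, it can help to list the nodes that appear in an edge but never receive a `label` attribute. The following is a minimal sketch that uses only the standard library's `re` module on the same DOT string. It is not how mp3treesim parses the input, and it assumes that `labeled_only=True` leaves exactly these nodes without labels.

```python
import re

# Same DOT string as in the example above.
gv = '''
strict digraph G {
graph [name=G];
0 [label=root];
1 [label="C,M"];
0 -> 1;
2 [label="I,Q"];
0 -> 2;
3 [label=A];
0 -> 3;
1 -> 4;
2 -> 5;
6 [label=G];
3 -> 6;
7 [label="E,R"];
3 -> 7;
8 [label=F];
3 -> 8;
9 [label="L,N"];
4 -> 9;
10 [label="H,K"];
4 -> 10;
11 [label=K];
5 -> 11;
}
'''

# Node declarations look like `1 [label="C,M"];`.
labeled = set(re.findall(r'^\s*(\d+)\s*\[label=', gv, flags=re.MULTILINE))

# Edges look like `0 -> 1;`; collect every node id that occurs in one.
in_edges = set()
for src, dst in re.findall(r'^\s*(\d+)\s*->\s*(\d+)', gv, flags=re.MULTILINE):
    in_edges.update((src, dst))

# Nodes used in edges but never declared with a label.
unlabeled = sorted(in_edges - labeled, key=int)
print(unlabeled)  # ['4', '5']
```

Nodes 4 and 5 are the only ones without a label, so under the assumption above they are the unlabeled internal nodes of tree2_p.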
Let's compare the trees
print('Tree 1 vs Tree 2', mp3.similarity(tree1, tree2))
print('Tree 1 vs Tree 2p', mp3.similarity(tree1, tree2_p))
print('Tree 2 vs Tree 2p', mp3.similarity(tree2, tree2_p))
Tree 1 vs Tree 2 0.05009996058703979
Tree 1 vs Tree 2p 0.061922490968955
Tree 2 vs Tree 2p 1.0
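When more trees are involved, writing one print statement per pair becomes tedious. Below is a small sketch of a helper that scores every unordered pair of named trees with whatever similarity function it is given. The names `pairwise_scores`, `jaccard`, and `toy` are hypothetical and not part of mp3treesim; the Jaccard function is only a stand-in so the block runs without the package, and it is not the MP3 measure.

```python
from itertools import combinations


def pairwise_scores(trees, similarity):
    """Score every unordered pair of named trees with the given function."""
    return {
        (a, b): similarity(trees[a], trees[b])
        for a, b in combinations(trees, 2)
    }


# Stand-in data and measure (assumptions for illustration only):
# each "tree" is just a set of labels, scored with Jaccard similarity.
def jaccard(x, y):
    return len(x & y) / len(x | y)


toy = {'T1': {'A', 'B', 'C'}, 'T2': {'A', 'B'}, 'T3': {'A', 'B'}}
for (a, b), score in pairwise_scores(toy, jaccard).items():
    print(f'{a} vs {b}: {score:.3f}')
```

With the objects from this page, calling `pairwise_scores({'Tree 1': tree1, 'Tree 2': tree2, 'Tree 2p': tree2_p}, mp3.similarity)` should produce the same three comparisons as the print statements above.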
Now let's exclude some mutations from the computation without having to modify the input. In particular, let's remove mutations A, G, F, and H from tree1:
tree1_r = mp3.read_dotfile('trees/tree1.gv', exclude='A,G,F,H')
mp3.draw_tree(tree1_r)
print('Tree 1r vs Tree 2', mp3.similarity(tree1_r, tree2))
print('Tree 1r vs Tree 2p', mp3.similarity(tree1_r, tree2_p))
print('Tree 1 vs Tree 1r', mp3.similarity(tree1, tree1_r))
Tree 1r vs Tree 2 0.018792140154100388
Tree 1r vs Tree 2p 0.02475323155433422
Tree 1 vs Tree 1r 0.88
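To picture what excluding mutations does to individual node labels, here is a minimal sketch that drops excluded mutations from a comma-separated label. The helper `drop_excluded` is hypothetical and is not mp3treesim's implementation; it assumes that labels are comma-separated mutation names, as in the DOT string above, and the sample labels are taken from that string purely as input data.

```python
def drop_excluded(label, exclude):
    """Remove excluded mutations from a comma-separated node label."""
    banned = set(exclude.split(','))
    kept = [m for m in label.split(',') if m not in banned]
    return ','.join(kept)


# 'A,G,F,H' is the exclude value used in the example above.
for label in ['A', 'H,K', 'E,R']:
    print(repr(label), '->', repr(drop_excluded(label, 'A,G,F,H')))
# 'A' -> ''
# 'H,K' -> 'K'
# 'E,R' -> 'E,R'
```

A node whose mutations are all excluded ends up with an empty label in this sketch; how the library itself treats such a node is not covered here.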
MP3 tree similarity -- Version 1.0.6