Coverage of PB-VN mapping is not a strict subset of VerbNet-derived mapping #7

@aaronstevenwhite

When comparing pb-vn2.json to a roleset-to-class mapping derived from VerbNet 3.4 itself, I find that the PropBank rolesets in the domains of the two mappings are not in a subset relation in either direction, as might be expected.

To derive the mapping from VerbNet3.4, I use:

from collections import defaultdict
from verbnet import VerbNetParser

verbnet = VerbNetParser(version="3.4")
     
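# Map each PropBank roleset to the set of VerbNet class IDs it appears under.
pb_vn34_map = defaultdict(set)

for cid, clsinfo in verbnet.verb_classes_numerical_dict.items():
    for m in clsinfo.members:
        # m.grouping holds the PropBank rolesets attached to this member.
        for pbroleset in m.grouping:
            pb_vn34_map[pbroleset].add(cid)

# Freeze into a plain dict so later lookups cannot create new keys.
pb_vn34_map = dict(pb_vn34_map)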

When compared to pb-vn2.json...

import json

# Load the SemLink PropBank-to-VerbNet mapping; its keys are compared below.
with open('semlink/instances/pb-vn2.json') as f:
    semlink_map = json.load(f)
    
pbset_from_verbnet = set(pb_vn34_map)
pbset_from_semlink = set(semlink_map)
    
print('In both SemLink and VerbNet:\t', len(pbset_from_semlink & pbset_from_verbnet))
print('In VerbNet but not SemLink:\t', len(pbset_from_verbnet - pbset_from_semlink))
print('In SemLink but not VerbNet:\t', len(pbset_from_semlink - pbset_from_verbnet))

I observe the following counts:

In both SemLink and VerbNet:	 1854
In VerbNet but not SemLink:	 1360
In SemLink but not VerbNet:	 2323
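
Since both difference counts are non-zero, neither domain is a subset of the other. To make the mismatches easier to inspect, the comparison could be wrapped in a small helper that returns the actual rolesets rather than only their counts. This is a minimal sketch: `compare_domains` and the toy dictionaries are hypothetical names and data introduced for illustration, not part of SemLink or VerbNet.

```python
def compare_domains(left, right):
    """Summarize how the key sets (domains) of two mappings relate."""
    left_keys, right_keys = set(left), set(right)
    return {
        "both": left_keys & right_keys,
        "left_only": left_keys - right_keys,
        "right_only": right_keys - left_keys,
        "left_subset_of_right": left_keys <= right_keys,
        "right_subset_of_left": right_keys <= left_keys,
    }


# Hypothetical toy data standing in for pb_vn34_map and semlink_map.
toy_verbnet = {"give.01": {"13.1"}, "run.02": {"51.3.2"}}
toy_semlink = {"give.01": ["13.1"], "break.01": ["45.1"]}

report = compare_domains(toy_verbnet, toy_semlink)

# Print a few rolesets from each side of the mismatch for manual inspection.
for name in ("left_only", "right_only"):
    print(name, sorted(report[name])[:10])
```

Calling `compare_domains(pb_vn34_map, semlink_map)` on the real mappings and sorting the two difference sets may show whether the mismatches follow a pattern, for example differences in roleset naming or versioning.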
