model does not build new vocabularies after add_keyword_corpus() #29

@hannahxchen


The model does not build vocabulary entries for the new input words passed to add_keyword_corpus(). build_vocab() is called only during initialization (inside the mixin class) and during training, and in training only if train_embed() is given input sentences or keywords. The following script reproduces the problem:

from embedding import SecWord2Vec

keywords = ['hello', 'cat']
sentences = ['hello world test', 'this is a cat']

model = SecWord2Vec(keywords, sentences, min_count=1, size=10)
print('After initialized:')
print('vocabs: {}'.format(model.wv.vocab.keys()))

model.add_keyword_corpus('dog', ['that is a dog', 'cats are adorable'])
print('\nAfter adding new keyword corpus:')
print('vocabs: {}'.format(model.wv.vocab.keys()))

model.train_embed()
print('\nAfter training:')
print('vocabs: {}'.format(model.wv.vocab.keys()))

[Screenshot: screen shot 2018-12-01 at 7 47 14 pm]
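
One possible direction for a fix, sketched under assumptions: have add_keyword_corpus() extend the vocabulary itself instead of waiting for training. The ToyKeywordModel class below is a hypothetical pure-Python stand-in, not the real SecWord2Vec; it only tracks word counts. Its update flag is modeled on the update parameter of gensim's Word2Vec.build_vocab(), on the assumption that SecWord2Vec wraps a gensim model.

```python
from collections import Counter


class ToyKeywordModel:
    """Hypothetical stand-in for SecWord2Vec; it only tracks the vocabulary."""

    def __init__(self, keywords, sentences, min_count=1):
        self.min_count = min_count
        self.keywords = list(keywords)
        self.counts = Counter()
        self.vocab = set()
        self.build_vocab(sentences)

    def build_vocab(self, sentences, update=False):
        # update=False rebuilds the vocabulary from scratch;
        # update=True merges the new counts into the existing ones.
        if not update:
            self.counts = Counter()
        for sentence in sentences:
            self.counts.update(sentence.split())
        self.vocab = {w for w, c in self.counts.items() if c >= self.min_count}

    def add_keyword_corpus(self, keyword, sentences):
        self.keywords.append(keyword)
        # The step the issue reports as missing: extend the vocabulary
        # as soon as the new corpus is added.
        self.build_vocab(sentences, update=True)


model = ToyKeywordModel(['hello', 'cat'], ['hello world test', 'this is a cat'])
model.add_keyword_corpus('dog', ['that is a dog', 'cats are adorable'])
print(sorted(model.vocab))
```

With this pattern, 'dog' and the other new words are in the vocabulary right after add_keyword_corpus() returns, without a call to train_embed(). Whether the real class should update eagerly like this or defer the update to train_embed() is a design choice for the maintainers.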
