train_embed() cannot update model with corpus_file #30

@hannahxchen

Test code:

from embedding import SecWord2Vec

keywords = ['hello', 'sentence', 'meow']
model = SecWord2Vec(keywords, corpus_file='data/test.txt', min_count=1, size=10, iter=1)
model.train_embed()

print('corpus_file_1:')
with open('data/test.txt') as f:
    for line in f:
        print(line.strip())

print('\nBefore update')
print('vocabs: {}'.format(model.wv.vocab.keys()))
print('keyword corpus: {}'.format(model.kc))
print('sentences:{}'.format(model.sentences))

print('\ncorpus_file_2:')
with open('data/test_2.txt') as f:
    for line in f:
        print(line.strip())

model.train_embed(corpus_file='data/test_2.txt')
print('\nAfter update')
print('vocabs: {}'.format(model.wv.vocab.keys()))
print('keyword corpus: {}'.format(model.kc))
print('sentences:{}'.format(model.sentences))

Output:
[Screenshot: screen shot 2018-12-01 at 10 53 19 pm]
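
The before/after comparison in the test code could also be checked programmatically rather than by eye. The sketch below is only an illustration: `corpus_tokens` and `missing_from_vocab` are hypothetical helper names, not part of `SecWord2Vec`, and it assumes the corpus files hold one sentence per line with whitespace-separated tokens and that the model applies no extra preprocessing such as lowercasing.

```python
def corpus_tokens(path):
    """Return the set of whitespace-separated tokens found in a text file."""
    tokens = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            # str.split() with no arguments drops the newline and empty strings.
            tokens.update(line.split())
    return tokens


def missing_from_vocab(vocab, path):
    """Return the corpus tokens that are absent from `vocab`, sorted."""
    return sorted(corpus_tokens(path) - set(vocab))
```

With `min_count=1`, calling `missing_from_vocab(model.wv.vocab.keys(), 'data/test_2.txt')` after the second `train_embed()` call would be expected to return an empty list if the update had worked; a non-empty list names the words that never reached the vocabulary.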
