Feature/corpus analysis by flavioamieiro · Pull Request #183 · NAMD/pypln.backend

flavioamieiro · 2016-07-19T03:24:24Z

Adds worker to calculate the FreqDist of a corpus

This is the first draft of a worker that takes a corpus and creates an analysis for it. This first attempt is a freqdist worker that takes the freqdist of each document in the corpus and condenses them into a new analysis: the freqdist for the entire corpus. This is a work in progress because I was mainly concerned with the groundwork needed for this to work (especially the celery task). I did not pay any attention to how the worker itself operates (it's probably doing more work than it needs to), and it probably needs more tests as well.
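For readers skimming the thread, the aggregation described above can be sketched roughly as follows. This is only an illustration and not the code in this PR: it assumes each document's freqdist is stored as a list of `(token, count)` pairs, and `merge_freqdists` is a hypothetical helper name.

```python
from collections import Counter


def merge_freqdists(document_freqdists):
    """Combine per-document (token, count) lists into one corpus-wide list.

    Hypothetical helper: the storage format is an assumption, not taken
    from the PR.
    """
    corpus_counts = Counter()
    for freqdist in document_freqdists:
        for token, count in freqdist:
            corpus_counts[token] += count
    # Sort by descending count, then by token, so the order is deterministic.
    return sorted(corpus_counts.items(), key=lambda item: (-item[1], item[0]))


# Two made-up documents, each with its own freqdist.
docs = [
    [("the", 3), ("cat", 1)],
    [("the", 2), ("dog", 4)],
]
corpus_freqdist = merge_freqdists(docs)
print(corpus_freqdist)
```

Summing into a single `Counter` visits each per-document entry once, which may be one way to avoid the extra work mentioned above.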

geron · 2016-08-08T19:57:53Z

tests/test_worker_corpus_freqdist.py

+from utils import TaskTest
+
+
+class TestCorpusFreqDistWorker(TaskTest):


Wouldn't it be better to test PyPLNCorpusTask separately from CorpusFreqDist? Then later if another subclass of PyPLNCorpusTask is created only the returned dict would need to be checked.

Also, is this hitting an actual mongo instance? If so, would you consider mocking the db methods?

You are right. I was testing both in the same test case (and not testing correctly). I separated the tests and I think it's better now.

It is really hitting an actual mongo instance. This is inherited from the old days when MongoDict was still part of our codebase. It's also one of the reasons our tests are slow. I would be very glad to mock everything and have better, more isolated and quicker tests. I would probably need your help, though @geron :)
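As a rough idea of what the mocked, isolated version could look like, here is a minimal sketch using `unittest.mock`. Everything in it is an assumption made for illustration: `CorpusTaskSketch` and `EchoTask` are hypothetical stand-ins (not the real `PyPLNCorpusTask` or `CorpusFreqDist`), and the collection names and query shapes are invented.

```python
from unittest import mock


class CorpusTaskSketch:
    """Hypothetical stand-in for a generic corpus task (not PyPLNCorpusTask)."""

    def __init__(self, analysis_collection, corpora_collection):
        # Collections are injected so a test can pass mocks instead of Mongo.
        self.analysis_collection = analysis_collection
        self.corpora_collection = corpora_collection

    def process(self, documents):
        raise NotImplementedError

    def run(self, corpus_id, document_ids):
        # Generic plumbing: load documents, delegate, store the result.
        query = {"_id": {"$in": document_ids}}
        documents = list(self.analysis_collection.find(query))
        result = self.process(documents)
        self.corpora_collection.update_one(
            {"corpus_id": corpus_id}, {"$set": result}, upsert=True
        )
        return corpus_id


class EchoTask(CorpusTaskSketch):
    """Trivial subclass used only to exercise the base class plumbing."""

    def process(self, documents):
        return {"n_documents": len(documents)}


def test_run_saves_whatever_process_returns():
    analysis = mock.MagicMock()
    corpora = mock.MagicMock()
    analysis.find.return_value = [{"_id": 1}, {"_id": 2}]

    task = EchoTask(analysis, corpora)
    returned = task.run(corpus_id=42, document_ids=[1, 2])

    # No database is touched; we only check how the collections were used.
    analysis.find.assert_called_once_with({"_id": {"$in": [1, 2]}})
    corpora.update_one.assert_called_once_with(
        {"corpus_id": 42}, {"$set": {"n_documents": 2}}, upsert=True
    )
    return returned


result = test_run_saves_whatever_process_returns()
```

With the generic plumbing covered this way, a test for a concrete subclass would only need to call `process` directly and check the returned dict, which is the split suggested in the review.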


flavioamieiro added 2 commits June 16, 2016 01:11

Fixes Mongo corpora analysis collection configuration

71ca77d

flavioamieiro mentioned this pull request Jul 19, 2016

Feature/corpus analysis NAMD/pypln.web#140

Open

geron reviewed Aug 8, 2016

Separates the generic PyPLNCorpusTask tests from the CorpusFreqdist t…

2be0f7e

…ests Thanks @geron for pointing out that I was testing everything together

flavioamieiro wants to merge 3 commits into NAMD:develop from flavioamieiro:feature/corpus-analysis


flavioamieiro Aug 11, 2016


2 participants
