Dev #1

Open
ANugmanova wants to merge 3 commits into master from dev
Conversation

@ANugmanova
Owner

No description provided.

Comment thread test_tf_bag.py

train = pd.read_csv(os.path.join(os.path.dirname(__file__), 'data', 't.csv'), header=0, delimiter="\t", quoting=3) # load the training dataset

train = train[:5000]
Collaborator

Why is the data cut down to just 5000 tweets, both here and when training DeepMoji?

Comment thread test_tf_bag.py
model = LinearSVC(penalty='l2', loss='squared_hinge', dual=True, tol=0.0001, C=1.0, multi_class='ovr',
fit_intercept=True, intercept_scaling=1, class_weight=None, verbose=0, random_state=None, max_iter=1000)
model.fit(X, y)
print ("20 Fold CV Score. Bag of words: ", np.mean(cross_validation.cross_val_score(model, X, y, cv=20, scoring='roc_auc')))
Collaborator

The other model is evaluated without cross-validation, isn't it?

Owner Author

Yes, but this is just old code; I didn't change it.

Comment thread start_train.py

def train_model(nb_classes, DATASET_PATH, DATASET_PATH_PRETRAINED = '',
PRETRAINED_PATH='', delete_non_raws = False, save_model = False):
vocab = {
Collaborator

By the way, what are these words?

Owner Author

These are the tags that the original authors add during preprocessing.

Collaborator

Are they needed for Russian too?

Owner Author

Well, CUSTOM_URL and CUSTOM_NUMBER don't depend on the language, but in principle this will need to be checked.
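
One way to check the language-independence claim is to run the tag substitution on a Russian sentence. The sketch below is only an illustration: the regular expressions and the helper name `add_special_tags` are assumptions made for this example, not the original authors' preprocessing code. Only the tag names CUSTOM_URL and CUSTOM_NUMBER come from the discussion above.

```python
import re

# Simplified, assumed patterns; the authors' real tokenizer may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)?")


def add_special_tags(text):
    """Replace URLs and numbers with placeholder tags (hypothetical helper)."""
    # URLs go first so that digits inside a URL are not tagged separately.
    text = URL_RE.sub("CUSTOM_URL", text)
    return NUMBER_RE.sub("CUSTOM_NUMBER", text)


# The same patterns are applied to an English and a Russian sentence.
print(add_special_tags("Visit https://example.com at 10"))
print(add_special_tags("Смотри https://example.com в 10"))
```

Because these patterns match only URL syntax and digits, the surrounding Cyrillic text passes through unchanged. Tags tied to other token types would need a separate check.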

Comment thread test_tf_bag.py

def review_to_wordlist( review, remove_stopwords=False ):
# review_text = BeautifulSoup(review).get_text()
review_text = review
Collaborator

)))

Comment thread data_tweets/convert.py
df['sent'].append(emoji_dict[emoji_name])
return df

df = {'text':[], 'id':[], 'sent':[]}
Collaborator

if __name__ == '__main__'

Owner Author

You're such a nitpicker :)

Collaborator

Not at all. I imported the dictionary from this module and got an error saying that some file was missing.

Owner Author

Oh, well, I just created it for a different purpose :)
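
The import error described in this thread is what a main guard is meant to prevent. The sketch below shows one possible layout for a module like data_tweets/convert.py. The contents of `emoji_dict`, the `convert` and `main` functions, and the file name `tweets.tsv` are all hypothetical, since the PR shows only a fragment of the script.

```python
import csv
import os

# Placeholder mapping (assumed); importing this module only defines names.
emoji_dict = {"smile": 1, "sad": 0}


def convert(rows):
    """Build text/id/sent columns from (id, text, emoji_name) rows."""
    df = {"text": [], "id": [], "sent": []}
    for row in rows:
        if len(row) != 3:
            continue  # skip malformed rows
        tweet_id, text, emoji_name = row
        if emoji_name in emoji_dict:
            df["text"].append(text)
            df["id"].append(tweet_id)
            df["sent"].append(emoji_dict[emoji_name])
    return df


def main(path="tweets.tsv"):  # hypothetical input file
    if not os.path.exists(path):
        print("input file not found:", path)
        return
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    print(convert(rows))


if __name__ == "__main__":
    # File access happens only when run as a script, never on import.
    main()
```

With this layout, `from convert import emoji_dict` does not touch the file system, so a missing data file cannot break the import.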


2 participants