Description
I am currently unable to reproduce the results reported in the paper.
Below is the code I am using to compute the F1 score for the Weather dataset (located in `dataset/weather`):
```python
from sklearn.metrics import f1_score
import pickle as pkl
import numpy as np
import os

city = 'hs'
city_full_name = {
    'ny': 'New York City',
    'hs': 'Houston',
    'sf': 'San Francisco'
}

with open('indices.pkl', 'rb') as f:
    indices = pkl.load(f)
with open(f'rain_{city}.pkl', 'rb') as f:
    labels = pkl.load(f)

# 60/20/20 train/test/validation split
data_size = len(indices)
num_train = int(data_size * 0.6)
num_test = int(data_size * 0.2)
num_vali = data_size - num_train - num_test

seq_len_day = 1
idx_train = np.arange(num_train - seq_len_day)
idx_valid = np.arange(num_train - seq_len_day, num_train + num_vali - seq_len_day)
idx_test = np.arange(num_train + num_vali - seq_len_day, num_train + num_vali + num_test - seq_len_day)

res = []
for _i in idx_test:
    i = indices[_i]
    label = labels[_i]
    with open(f'gpt_predict_text/{city}_{i}.txt', 'r') as f:
        text = f.read()
    # check 'not rain' first so it is not also matched by the plain 'rain' test
    if 'not rain' in text.lower():
        pred = False
    elif 'rain' in text.lower():
        pred = True
    else:
        print(f"Invalid prediction: {text}")
        continue
    res.append((pred, label))

y_true = [label for _, label in res]
y_pred = [pred for pred, _ in res]
print(f1_score(y_true, y_pred, average='micro'))
```
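One thing I am unsure about is the `average` argument: for binary labels, `f1_score(..., average='micro')` is equal to accuracy, while `average='binary'` scores only the positive (rain) class. If the paper reports binary F1, that choice alone could change the number. A minimal sketch with made-up labels (not from the dataset) illustrating the difference:

```python
from sklearn.metrics import accuracy_score, f1_score

# illustrative labels only, not taken from the Weather dataset
y_true = [True, True, False, False, True]
y_pred = [True, False, False, True, True]

micro = f1_score(y_true, y_pred, average='micro')    # equals accuracy for binary labels
binary = f1_score(y_true, y_pred, average='binary')  # F1 of the positive class only

print(micro, accuracy_score(y_true, y_pred))  # → 0.6 0.6
print(binary)                                 # → 0.6666666666666666
```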
The scores I obtain do not match the values reported in Table 2 of the paper. Could you please share the evaluation code used to produce the reported results?
Thanks for sharing the code and your work 🙏