Incorrect pretraining data format for Factual Adapter #2

I followed the [code](https://github.com/windweller/DisExtract/tree/master/preprocessing) here and generated all 3 tsv files under `DisExtract/data/books/ALL18_2019jan02_[valid, train, test].tsv`. However, their format does not match the **json** file required to run pretraining for the Factual Adapter.

After running `python producer.py`, each line of the generated tsv files has the following format:

```
[Sentence 1]\t[Sentence 2]\t[Marker]
...
```

The json file required for pretraining has the following format:

```
{ "sent" : "Sentence 1", "tokens": "sentence 2", "pairs" : [ ... ] }
...
```

Is there a conversion script that converts the generated tsv format to json?
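
To make the question concrete, here is a minimal sketch of the kind of conversion I have in mind. It is not taken from either repository. The function name `tsv_to_jsonl` is hypothetical, and the field mapping is only my guess from the two formats above: Sentence 1 goes to `sent`, Sentence 2 goes to `tokens`, and the marker is placed in `pairs` as a placeholder, since I do not know what `pairs` is actually supposed to contain. It also assumes one json object per line.

```python
import csv
import json


def tsv_to_jsonl(tsv_path, json_path):
    """Convert [Sentence 1]\t[Sentence 2]\t[Marker] rows to one JSON object per line.

    Returns the number of records written.
    """
    written = 0
    with open(tsv_path, newline="", encoding="utf-8") as src, \
            open(json_path, "w", encoding="utf-8") as dst:
        # QUOTE_NONE keeps quotation marks inside sentences as literal text.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 3:
                continue  # skip rows that do not have exactly three columns
            sent1, sent2, marker = row
            record = {
                "sent": sent1,
                "tokens": sent2,
                # ASSUMPTION: the real content of "pairs" is unknown to me;
                # the marker is stored here only as a placeholder.
                "pairs": [marker],
            }
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

If the expected `pairs` structure is different, only the `record` dictionary would need to change.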
