transfer to msmarco document dataset #22

@Berlin-98

Hi~
I am using this repo to run experiments on the MS MARCO document dataset, but I am a little confused about the differences between the Condenser, tevatron, and coCondenser repos. I am following the "coCondenser MS-MARCO Passage Retrieval" guide and trying to switch the data to the MS MARCO document dataset and the checkpoint to Condenser.

If I only want to reproduce the results of the coCondenser paper, I just need to encode and then run Index Search, is that right?

If I want to switch the data to MS MARCO document and use the Condenser checkpoint, do I need to follow fine-tuning stages one and two? That is: first fine-tune a checkpoint and save it to retriever_model_s1/, then use that trained checkpoint to mine hard negatives, then use the hard negatives to fine-tune the model further and save it to retriever_model_s2, and finally run search on the dev set. Is that right?
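To make the hard-negative mining step concrete, here is a rough sketch of what I understand it to do. The function name `mine_hard_negatives`, the dictionary layout, and the `depth` / `n_neg` parameters are my own assumptions for illustration; this is not the repo's actual script or API, just the general idea of taking highly ranked documents that are not labeled relevant.

```python
import random
from typing import Dict, List, Set


def mine_hard_negatives(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    depth: int = 200,
    n_neg: int = 30,
    seed: int = 42,
) -> Dict[str, List[str]]:
    """Pick hard negatives from a stage-one retrieval run (hypothetical helper).

    rankings: query id -> doc ids ranked best-first by the stage-one retriever.
    qrels:    query id -> set of doc ids judged relevant.
    depth:    how far down each ranking to look for candidates.
    n_neg:    maximum number of negatives kept per query.
    """
    rng = random.Random(seed)
    negatives: Dict[str, List[str]] = {}
    for qid, ranked in rankings.items():
        positives = qrels.get(qid, set())
        # Highly ranked but not labeled relevant: these are the "hard" negatives.
        candidates = [doc for doc in ranked[:depth] if doc not in positives]
        # Subsample only when there are more candidates than requested.
        if len(candidates) > n_neg:
            candidates = rng.sample(candidates, n_neg)
        negatives[qid] = candidates
    return negatives


# Toy example with made-up ids: d1 is the labeled positive for q1.
rankings = {"q1": ["d3", "d1", "d7", "d9"]}
qrels = {"q1": {"d1"}}
hard_negs = mine_hard_negatives(rankings, qrels, depth=3, n_neg=2)
```

If this matches the intended procedure, the stage-two training data would pair each query with its positives and these mined negatives.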
