Computational Linguistics: Replicating "A BERT Baseline for the Natural Questions"

This repository implements a question answering (QA) system built on the DistilBERT model from the Hugging Face transformers library. Given a passage of context, the system answers questions about it, following the approach of the paper "A BERT Baseline for the Natural Questions." The goal is to replicate the paper's short-answer and no-answer results and, in doing so, to better understand how BERT-style models apply to real-world NLP tasks.

Paper Link: read paper here

Overview

This project applies machine learning to a natural language processing task: understanding a passage of context and giving concise answers to questions about it. The system tests how reliably DistilBERT identifies the relevant information and recognizes when the provided text contains no adequate answer.
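
As a rough illustration of the no-answer decision described above, the sketch below picks the best-scoring span from per-token start and end logits and compares it with the null score at position 0 (the [CLS] token), a common convention in extractive QA. The function name `pick_answer`, the `max_len` and `threshold` parameters, and the toy logits are assumptions for illustration, not code from this repository.

```python
from typing import List, Optional, Tuple


def pick_answer(
    start_logits: List[float],
    end_logits: List[float],
    max_len: int = 30,
    threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None for no answer.

    Assumes index 0 is the [CLS] token, whose score stands for "no answer".
    """
    null_score = start_logits[0] + end_logits[0]
    best_score = float("-inf")
    best_span = None
    n = len(start_logits)
    # Score every span that starts after [CLS] and is at most max_len tokens long.
    for start in range(1, n):
        for end in range(start, min(start + max_len, n)):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)
    # Predict a span only if it beats the null score by more than the threshold.
    if best_span is None or best_score - null_score <= threshold:
        return None
    return best_span


# Toy logits (made up): the first call finds a span, the second prefers "no answer".
print(pick_answer([0.1, 0.2, 3.0, 0.5], [0.1, 0.3, 0.4, 2.5]))  # (2, 3)
print(pick_answer([5.0, 0.2, 1.0, 0.5], [5.0, 0.3, 0.4, 1.5]))  # None
```

Raising `threshold` makes the sketch more willing to answer "no answer"; in practice such a value would be tuned on held-out data.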

Data Description

The dataset is a reduced version curated to replicate selected scenarios from the referenced paper. It covers only two of the paper's four annotated answer types:

No answer: Instances where the questions cannot be answered with the provided context.
Short answer: Questions that have concise answers directly extractable or inferable from the context.
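
One way to turn these two answer types into training targets is sketched below: a short answer becomes a (start, end) token span in the model input, and a no-answer example points both targets at position 0. The `Example` fields and the `to_targets` helper are hypothetical names chosen for illustration and do not reflect the repository's actual data schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Example:
    question: str
    context_tokens: List[str]
    # Inclusive token span within the context, or None when there is no answer.
    short_answer: Optional[Tuple[int, int]] = None


def to_targets(example: Example, offset: int) -> Tuple[int, int]:
    """Map an example to (start, end) positions in the model input.

    `offset` is the number of input tokens that precede the context
    (for instance [CLS] + question tokens + [SEP]).
    No-answer examples map to (0, 0), i.e. the [CLS] position.
    """
    if example.short_answer is None:
        return (0, 0)
    start, end = example.short_answer
    if not (0 <= start <= end < len(example.context_tokens)):
        raise ValueError("short answer span falls outside the context")
    return (start + offset, end + offset)


# Made-up examples: one with a short answer, one without.
with_answer = Example("who wrote it", ["it", "was", "written", "by", "Ada"], (4, 4))
no_answer = Example("when was it written", ["it", "was", "written", "by", "Ada"])
print(to_targets(with_answer, offset=5))  # (9, 9)
print(to_targets(no_answer, offset=5))    # (0, 0)
```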

Data Link: see data here

Video

Here is the link for the video:

Installation

To set up and run the project locally, follow these steps:

Clone the repository

git clone https://github.com/YanmiYu/CSCI_1460_final.git

Navigate to the project directory

cd CSCI_1460_final
