YanmiYu/Computational-Linguistic

Computational Linguistic: replicate "A BERT Baseline for the Natural Questions"

This repository implements a Question Answering (QA) system using the DistilBERT model from the Hugging Face transformers library. Given a passage of text and a related question, the system either extracts an answer or reports that none is available, following the approach of the paper "A BERT Baseline for the Natural Questions." The goal is to replicate the short answer and no answer results discussed in the paper, and in doing so to better understand how BERT-style models apply to real-world NLP tasks.

Paper Link: read paper here

Overview

This project applies machine learning to a specific natural language processing task: reading a context passage and giving a concise answer to a question about it. The system tests how robustly DistilBERT identifies the pertinent information and how well it recognizes when the provided text contains no adequate answer.
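To make the answer versus no-answer decision concrete, here is a minimal sketch of one common way extractive QA models choose between a span and "no answer". It is an illustration under stated assumptions, not necessarily the logic used in this repository: it assumes the model returns one start logit and one end logit per token, that index 0 is the [CLS] token whose score stands in for "no answer", and the names select_answer_span, max_answer_len, and null_threshold are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def select_answer_span(
    start_logits: Sequence[float],
    end_logits: Sequence[float],
    max_answer_len: int = 30,
    null_threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Return (start, end) token indices of the best span, or None for no answer.

    Assumption: index 0 is the [CLS] token, and its start + end score
    is treated as the score of the "no answer" prediction.
    """
    null_score = start_logits[0] + end_logits[0]

    best_score = float("-inf")
    best_span = None
    # Search every valid span that starts after [CLS] and is short enough.
    for start in range(1, len(start_logits)):
        last_end = min(start + max_answer_len, len(end_logits))
        for end in range(start, last_end):
            score = start_logits[start] + end_logits[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)

    # Predict "no answer" unless the best span beats the null score
    # by more than the threshold.
    if best_span is None or best_score - null_score <= null_threshold:
        return None
    return best_span
```

In practice the threshold would be tuned on held-out data; a larger null_threshold makes the system more willing to answer "no answer".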

Data Description

The dataset used for this project is a reduced version of the data, curated to match the scenarios from the referenced paper that we replicate. It covers only two of the answer types annotated in the paper:

  • No answer: Instances where the questions cannot be answered with the provided context.
  • Short answer: Questions that have concise answers directly extractable or inferable from the context.
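As a rough sketch of how such a reduced dataset could be produced, the snippet below keeps only examples of the two answer types listed above. The record layout, the "answer_type" field, and the label strings "short" and "no_answer" are assumptions made for illustration; the actual data files may be structured differently.

```python
from typing import Dict, Iterable, List

# Hypothetical label strings for the two answer types kept in this project.
KEPT_ANSWER_TYPES = {"short", "no_answer"}


def filter_examples(examples: Iterable[Dict]) -> List[Dict]:
    """Keep only examples whose (assumed) 'answer_type' field is short or no_answer."""
    return [ex for ex in examples if ex.get("answer_type") in KEPT_ANSWER_TYPES]


if __name__ == "__main__":
    sample = [
        {"question": "q1", "answer_type": "short"},
        {"question": "q2", "answer_type": "long"},
        {"question": "q3", "answer_type": "no_answer"},
    ]
    for ex in filter_examples(sample):
        print(ex["question"], ex["answer_type"])
```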

Data Link: see data here

Video

Here is the link for the video:

Installation

To set up and run the project locally, follow these steps:

Clone the repository

git clone https://github.com/YanmiYu/CSCI_1460_final.git

Navigate to the project directory

cd CSCI_1460_final

About

This is for the final Project 2: BERT Question-Answering System.
