Refiner

Overview

Our code includes performance test scripts for evaluating only-LLM methods, as well as scripts for refiner using LLMs.

test_only_llama.py
test_only_GPT.py
test_refiner.py

`test_only_llama.py`

Code Overview

This code uses Llama-3-8B-Instruct to perform document-level relation extraction. The provided code demo is shown to process datasets, predict relations using the ATLOP SLM logits, and evaluate the results.

Requirements

Python 3.7+
PyTorch
Transformers
scikit-learn
tqdm
pandas
docre (custom module)
c2net (custom module)

Installation

Install the required Python packages using pip:

pip install torch transformers scikit-learn tqdm pandas openpyxl

Ensure the custom module docre is available in your Python path. Note that c2net is not necessary to use, you just need to replace all the paths with the c2net involved with your local paths.

Usage

Prepare the Environment: Initialize the data context and set paths for datasets and pretrained models.
Load Data and Models: Use the provided functions to load datasets, relation templates, and pre-trained model logits.
Generate Prompts: Construct prompts and inputs for the model based on the loaded data.
Run the Model: Use the LLaMA3-8B model to generate predictions.
Evaluate Results: Save the model's predictions and evaluate them against the ground truth using the provided evaluation function.

Running the Script

Execute the script with Python:

python test_only_llama.py

The script will process the data, generate prompts, run the model, and evaluate the results. Output predictions will be saved to dev_result_llama3_instruct_atlop.json.

Example Output

The script prints example inputs and completions, showing the format of the processed data and the model's predictions.

INSTRUCTION: Read the DOCUMENT and answer the QUESTION. Write the answers in ANSWER.
DOCUMENT: ...
QUESTION: Which of the following is right?
...
ANSWER:

Evaluation

After running the script, the results are evaluated using the evaluate function, which compares the model's predictions with the ground truth and outputs performance metrics.

Notes

Ensure the dataset_path, pretrain_model_path, and other paths are correctly set according to your environment.
Modify the top-k variable to change the number of top predictions considered.
The script is set to ignore warnings for cleaner output.

`test_only_GPT.py`

Overview

This project utilizes GPT-3 for document-level relation extraction. The provided code processes datasets, predicts relations using the ATLOP model, and evaluates the results.

Requirements

Python 3.7+
PyTorch
Transformers
pandas
tqdm
OpenAI API key
docre (custom module)
c2net (custom module)

Installation

Install the required Python packages using pip:

pip install torch transformers pandas tqdm openai

Ensure the custom module docre is available in your Python path. Note that c2net is not necessary to use, you just need to replace all the paths with the c2net involved with your local paths.

Setup

Set up OpenAI API Key: Set your OpenAI API key in the get_completion function.
Prepare the Environment: Ensure the dataset and required files are available in the specified paths.

Usage

Running the Script

Execute the script with Python:

python test_only_GPT.py

Parameters

dataset_path: Path to the dataset directory.
rel_templates_path: Path to the relationship templates directory.
logits_path: Path to the logits directory.

Script Workflow

Load Data: Loads relation information, document data, and relation templates.
Generate Prompts: Constructs prompts and inputs for the model based on the loaded data.
Run the Model: Uses GPT-3 to generate predictions.
Evaluate Results: Saves the model's predictions and evaluates them against the ground truth using the provided evaluation function.

Example Output

The script prints example inputs and completions, showing the format of the processed data and the model's predictions.

##INSTRUCTION: Read the ##DOCUMENT and answer the ##QUESTION. Write the answers in ##ANSWER.
##DOCUMENT: ...
##QUESTION: Which of the following is right?
...
##ANSWER:

Evaluation

After running the script, the results are evaluated using the evaluate function, which compares the model's predictions with the ground truth and outputs performance metrics.

Notes

Ensure the dataset_path, rel_templates_path, and logits_path are correctly set according to your environment.
Modify the TOP_K variable to change the number of top predictions considered.
The script is set to ignore warnings for cleaner output.

`test_refiner.py`

This project refines document-level relation extraction using llama3-8b-instruct. The script processes documents, refines relation extraction results, and evaluates the performance using a SLM original performance and refinement after LLM.

Requirements

Python 3.7+
PyTorch
Transformers
TQDM
Pandas

Installation

Install the required Python packages:

pip install torch transformers tqdm pandas

Clone this repository and navigate to the project directory.

Usage

Prepare Data

Ensure your dataset is prepared and paths are correctly set in the script:
- Dataset Path: c2net_context.dataset_path + "/dataset"
- Relation Templates Path: c2net_context.dataset_path + "/rel_templates"
- DocRED Logits Path: c2net_context.dataset_path + "/docred-logits"
Set the pre-trained model path:
- Meta-Llama-3-8B-Instruct Path: c2net_context.pretrain_model_path + "/Meta-Llama-3-8B-Instruct"
Set the output path:
- c2net_context.output_path
Ensure the custom module docre is available in your Python path. Note that c2net is not necessary to use, you just need to replace all the paths with the c2net involved with your local paths.

Run the Script

Execute the script to process and refine the documents:

python test_refiner.py

Example Output

The script prints example inputs and completions, showing the format of the processed data and the model's predictions.

##INSTRUCTION: Read the ##DOCUMENT and answer the ##QUESTION. Write the answers in ##ANSWER.
##DOCUMENT: ...
##QUESTION: Which of the following is right?
...
##ANSWER:

Evaluation

After running the script, the results are evaluated using the evaluate function, which compares the model's predictions with the ground truth and outputs performance metrics.

Notes

Ensure the dataset_path, rel_templates_path, and logits_path are correctly set according to your environment.
Modify the TOP_K variable to change the number of top predictions considered.
The script is set to ignore warnings for cleaner output.

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
docre		docre
README.md		README.md
prompt templates.pdf		prompt templates.pdf
test_only_GPT.py		test_only_GPT.py
test_only_llama.py		test_only_llama.py
test_refiner.py		test_refiner.py

Drasick/Drell

Folders and files

Latest commit

History

Repository files navigation

Refiner

Overview

test_only_llama.py

Code Overview

Requirements

Installation

Usage

Running the Script

Example Output

Evaluation

Notes

test_only_GPT.py

Overview

Requirements

Installation

Setup

Usage

Running the Script

Parameters

Script Workflow

Example Output

Evaluation

Notes

test_refiner.py

Requirements

Installation

Usage

Prepare Data

Run the Script

Example Output

Evaluation

Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`test_only_llama.py`

`test_only_GPT.py`

`test_refiner.py`

Packages