Our code includes performance-test scripts for evaluating LLM-only methods, as well as a script for the LLM-based refiner.
- `test_only_llama.py`
- `test_only_GPT.py`
- `test_refiner.py`
This code uses Llama-3-8B-Instruct to perform document-level relation extraction. The provided demo processes the datasets, predicts relations using the ATLOP SLM logits, and evaluates the results.
- Python 3.7+
- PyTorch
- Transformers
- scikit-learn
- tqdm
- pandas
- docre (custom module)
- c2net (custom module)
Install the required Python packages using pip:
```
pip install torch transformers scikit-learn tqdm pandas openpyxl
```

Ensure the custom module `docre` is available in your Python path. Note that `c2net` is not required; simply replace all `c2net`-related paths with your local paths.
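If `docre` is a sibling directory rather than an installed package, one way to make it importable is to prepend its location to `sys.path` before the other imports. The directory name below is an assumption; point it at your actual checkout:

```python
import sys
from pathlib import Path

# Hypothetical location of the local `docre` module; adjust to your layout.
docre_dir = Path.cwd() / "docre"

# Prepend so the local copy wins over any installed package of the same name.
if str(docre_dir) not in sys.path:
    sys.path.insert(0, str(docre_dir))
```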
- Prepare the Environment: Initialize the data context and set paths for datasets and pretrained models.
- Load Data and Models: Use the provided functions to load datasets, relation templates, and pre-trained model logits.
- Generate Prompts: Construct prompts and inputs for the model based on the loaded data.
- Run the Model: Use the LLaMA3-8B model to generate predictions.
- Evaluate Results: Save the model's predictions and evaluate them against the ground truth using the provided evaluation function.
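The steps above can be sketched as a small driver. The function names and the injected `generate` callable are assumptions, not the script's actual API; in the real script, `generate` would wrap a Hugging Face `transformers` text-generation pipeline over Meta-Llama-3-8B-Instruct:

```python
def build_input(document: str, question: str) -> str:
    """Construct a prompt in the instruction format this README describes."""
    return (
        "INSTRUCTION: Read the DOCUMENT and answer the QUESTION. "
        "Write the answers in ANSWER.\n"
        f"DOCUMENT: {document}\n"
        f"QUESTION: {question}\n"
        "ANSWER:"
    )


def run_example(generate, document: str, question: str) -> str:
    """Run one example and return the raw completion.

    `generate` is any prompt -> text callable; injecting it keeps the
    flow testable without loading an 8B-parameter model.
    """
    return generate(build_input(document, question)).strip()
```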
Execute the script with Python:
```
python test_only_llama.py
```

The script will process the data, generate prompts, run the model, and evaluate the results. Output predictions will be saved to `dev_result_llama3_instruct_atlop.json`.
The script prints example inputs and completions, showing the format of the processed data and the model's predictions.
```
INSTRUCTION: Read the DOCUMENT and answer the QUESTION. Write the answers in ANSWER.
DOCUMENT: ...
QUESTION: Which of the following is right?
...
ANSWER:
```
After running the script, the results are evaluated using the evaluate function, which compares the model's predictions with the ground truth and outputs performance metrics.
- Ensure `dataset_path`, `pretrain_model_path`, and other paths are correctly set according to your environment.
- Modify the top-k variable to change the number of top predictions considered.
- The script is set to ignore warnings for cleaner output.
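What the top-k variable controls can be illustrated with a small helper. This is a sketch of the idea only: in the real script the ATLOP logits are per entity pair and the relation inventory comes from the dataset's relation info files:

```python
import heapq


def topk_relations(logits, rel_names, k=4):
    """Pick the k highest-scoring candidate relations from SLM logits.

    `logits` and `rel_names` are parallel sequences; the returned names
    become the multiple-choice options offered to the LLM.
    """
    pairs = zip(logits, rel_names)
    return [name for _, name in heapq.nlargest(k, pairs, key=lambda p: p[0])]
```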
This project utilizes GPT-3 for document-level relation extraction. The provided code processes datasets, predicts relations using the ATLOP model, and evaluates the results.
- Python 3.7+
- PyTorch
- Transformers
- pandas
- tqdm
- OpenAI API key
- docre (custom module)
- c2net (custom module)
Install the required Python packages using pip:
```
pip install torch transformers pandas tqdm openai
```

Ensure the custom module `docre` is available in your Python path. Note that `c2net` is not required; simply replace all `c2net`-related paths with your local paths.
- Set up OpenAI API Key: Set your OpenAI API key in the `get_completion` function.
- Prepare the Environment: Ensure the dataset and required files are available in the specified paths.
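A minimal sketch of what `get_completion` might look like, assuming the legacy (pre-1.0) `openai` package interface that GPT-3-era scripts used; the model name and parameters are illustrative, and the key is read from the environment rather than hard-coded:

```python
import os


def get_completion(prompt, model="text-davinci-003", create=None):
    """Send one prompt to the API and return the completion text.

    `create` is injectable for testing; by default it calls the legacy
    OpenAI completions endpoint (openai<1.0).
    """
    if create is None:
        import openai  # pip install "openai<1.0"

        openai.api_key = os.environ["OPENAI_API_KEY"]
        create = openai.Completion.create
    resp = create(model=model, prompt=prompt, max_tokens=256, temperature=0)
    return resp["choices"][0]["text"].strip()
```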
Execute the script with Python:
```
python test_only_GPT.py
```

- `dataset_path`: Path to the dataset directory.
- `rel_templates_path`: Path to the relationship templates directory.
- `logits_path`: Path to the logits directory.
- Load Data: Loads relation information, document data, and relation templates.
- Generate Prompts: Constructs prompts and inputs for the model based on the loaded data.
- Run the Model: Uses GPT-3 to generate predictions.
- Evaluate Results: Saves the model's predictions and evaluates them against the ground truth using the provided evaluation function.
The script prints example inputs and completions, showing the format of the processed data and the model's predictions.
```
##INSTRUCTION: Read the ##DOCUMENT and answer the ##QUESTION. Write the answers in ##ANSWER.
##DOCUMENT: ...
##QUESTION: Which of the following is right?
...
##ANSWER:
```
After running the script, the results are evaluated using the evaluate function, which compares the model's predictions with the ground truth and outputs performance metrics.
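The evaluation step can be sketched as micro precision/recall/F1 over predicted relation facts. This illustrates the idea only; the repo's `evaluate` function and its fact format may differ:

```python
def evaluate(predictions, ground_truth):
    """Score predicted facts against gold facts.

    Each fact is a hashable tuple, e.g. (title, head, tail, relation);
    returns micro precision, recall, and F1.
    """
    pred, gold = set(predictions), set(ground_truth)
    tp = len(pred & gold)  # true positives: facts found in both sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}
```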
- Ensure `dataset_path`, `rel_templates_path`, and `logits_path` are correctly set according to your environment.
- Modify the `TOP_K` variable to change the number of top predictions considered.
- The script is set to ignore warnings for cleaner output.
This project refines document-level relation extraction using Llama-3-8B-Instruct. The script processes documents, refines the relation extraction results, and evaluates performance by comparing the SLM's original results with the results after LLM refinement.
- Python 3.7+
- PyTorch
- Transformers
- tqdm
- pandas
- Install the required Python packages:

  ```
  pip install torch transformers tqdm pandas
  ```

- Clone this repository and navigate to the project directory.
- Ensure your dataset is prepared and paths are correctly set in the script:
  - Dataset Path: `c2net_context.dataset_path + "/dataset"`
  - Relation Templates Path: `c2net_context.dataset_path + "/rel_templates"`
  - DocRED Logits Path: `c2net_context.dataset_path + "/docred-logits"`
- Set the pre-trained model path:
  - Meta-Llama-3-8B-Instruct Path: `c2net_context.pretrain_model_path + "/Meta-Llama-3-8B-Instruct"`
- Set the output path: `c2net_context.output_path`
- Ensure the custom module `docre` is available in your Python path. Note that `c2net` is not required; simply replace all `c2net`-related paths with your local paths.
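If you are not running on c2net, one way to keep the script's `c2net_context.*` path expressions working is a tiny local stand-in. The paths below are illustrative placeholders; point them at your local copies:

```python
from types import SimpleNamespace

# Illustrative stand-in for the c2net context object; replace the values
# with your local dataset, model, and output directories.
c2net_context = SimpleNamespace(
    dataset_path="./data",
    pretrain_model_path="./models",
    output_path="./output",
)

# The same path expressions the script builds from the context object.
dataset_path = c2net_context.dataset_path + "/dataset"
model_path = c2net_context.pretrain_model_path + "/Meta-Llama-3-8B-Instruct"
output_path = c2net_context.output_path
```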
Execute the script to process and refine the documents:
```
python test_refiner.py
```

The script prints example inputs and completions, showing the format of the processed data and the model's predictions.
```
##INSTRUCTION: Read the ##DOCUMENT and answer the ##QUESTION. Write the answers in ##ANSWER.
##DOCUMENT: ...
##QUESTION: Which of the following is right?
...
##ANSWER:
```
After running the script, the results are evaluated using the evaluate function, which compares the model's predictions with the ground truth and outputs performance metrics.
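The refinement idea can be sketched as follows: keep the SLM's top-k candidates, and let the LLM's multiple-choice answer confirm or override the final relation. The function name and the answer format (`'A'`, `'B'`, ...) are assumptions for illustration:

```python
def refine(slm_topk, llm_answer):
    """Combine SLM candidates with the LLM's multiple-choice answer.

    `slm_topk` lists candidate relations in descending SLM score; the
    LLM's chosen letter selects among them, and anything unparseable
    falls back to the SLM's own top-1 prediction.
    """
    letters = "ABCDEFGHIJ"
    choice = llm_answer.strip()[:1].upper()
    if choice in letters[: len(slm_topk)]:
        return slm_topk[letters.index(choice)]
    return slm_topk[0]  # fall back to the SLM's top-1 prediction
```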
- Ensure `dataset_path`, `rel_templates_path`, and `logits_path` are correctly set according to your environment.
- Modify the `TOP_K` variable to change the number of top predictions considered.
- The script is set to ignore warnings for cleaner output.