phamminh0811/Logic-LLM

Baselines

To replicate the Standard-LM (Direct) and the Chain-of-Thought (CoT) baselines, please run the following commands:

cd ./baselines
python gpt3_baseline.py \
    --api_key "Your OpenAI API Key" \
    --model_name "Model Name [text-davinci-003 | gpt-4]" \
    --dataset_name "Dataset Name [ProntoQA | ProofWriter | FOLIO | LogicalDeduction | AR-LSAT]" \
    --split dev \
    --mode "Baseline [Direct | CoT]" \
    --max_new_tokens "16 for Direct; 1024 for CoT"

The results will be saved in ./baselines/results. To evaluate the results, please run the following commands:

python evaluate.py \
    --dataset_name "Dataset Name [ProntoQA | ProofWriter | FOLIO | LogicalDeduction | AR-LSAT]" \
    --model_name "Model Name [text-davinci-003 | gpt-4]" \
    --split dev \
    --mode "Baseline [Direct | CoT]"
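
The baseline command above pairs each mode with a token budget (16 for Direct, 1024 for CoT). As a minimal sketch, that pairing could be encoded once so the two values never drift apart. The helper name `baseline_command` is hypothetical and not part of the repository; only the flags shown in the command above are used.

```python
import shlex

# Token budgets stated above: 16 for Direct, 1024 for CoT.
MAX_NEW_TOKENS = {"Direct": 16, "CoT": 1024}


def baseline_command(api_key, model_name, dataset_name, mode, split="dev"):
    """Build the argv list for gpt3_baseline.py (hypothetical helper)."""
    if mode not in MAX_NEW_TOKENS:
        raise ValueError(f"mode must be one of {sorted(MAX_NEW_TOKENS)}")
    return [
        "python", "gpt3_baseline.py",
        "--api_key", api_key,
        "--model_name", model_name,
        "--dataset_name", dataset_name,
        "--split", split,
        "--mode", mode,
        "--max_new_tokens", str(MAX_NEW_TOKENS[mode]),
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line; run it from ./baselines.
    print(shlex.join(baseline_command("YOUR_KEY", "gpt-4", "FOLIO", "CoT")))
```

The list can be passed to `subprocess.run(..., cwd="./baselines")` if you prefer to launch the script from Python rather than copy the printed command.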

Logic Program Generation

To generate logic programs for the logical reasoning problems in each dataset, run the following command from the root directory:

python models/logic_program.py \
    --api_key "Your OpenAI API Key" \
    --dataset_name "Dataset Name [ProntoQA | ProofWriter | FOLIO | LogicalDeduction | AR-LSAT]" \
    --split dev \
    --model_name "Model Name [text-davinci-003 | gpt-4]" \
    --max_new_tokens 1024

The generated logic programs will be saved in ./outputs/logic_programs. The logic programs we generated are also provided in that directory and can be reused.

Logic Inference with Symbolic Solver

After generating logic programs, we can perform inference with symbolic solvers. At the root directory, run the following commands:

DATASET="Dataset Name [ProntoQA | ProofWriter | FOLIO | LogicalDeduction | AR-LSAT]"
SPLIT="Dataset Split [dev | test]"
MODEL="Model that generated the logic programs [text-davinci-003 | gpt-4]"
BACKUP="Backup strategy: random backup answer (random) or CoT-Logic collaboration mode (LLM)"

python models/logic_inference.py \
    --model_name "${MODEL}" \
    --dataset_name "${DATASET}" \
    --split "${SPLIT}" \
    --backup_strategy "${BACKUP}" \
    --backup_LLM_result_path "./baselines/results/CoT_${DATASET}_${SPLIT}_${MODEL}.json"

The logic reasoning results will be saved in outputs/logic_inferences.
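
The `--backup_LLM_result_path` argument above follows the pattern `CoT_<dataset>_<split>_<model>.json` under ./baselines/results. A small sketch, assuming only that naming pattern, can check that the file exists before inference is started in LLM mode. The function names `cot_backup_path` and `require_cot_backup` are hypothetical.

```python
from pathlib import Path


def cot_backup_path(dataset, split, model, results_dir="./baselines/results"):
    """Return the CoT baseline result path used as the LLM backup."""
    return Path(results_dir) / f"CoT_{dataset}_{split}_{model}.json"


def require_cot_backup(dataset, split, model, results_dir="./baselines/results"):
    """Raise early if the CoT baseline results have not been generated yet."""
    path = cot_backup_path(dataset, split, model, results_dir)
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run the CoT baseline for this dataset, "
            "split, and model first."
        )
    return path


if __name__ == "__main__":
    print(cot_backup_path("FOLIO", "dev", "gpt-4"))
```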

Backup Strategies:

  • random: If the symbolic solver cannot execute the generated logic program, a random guess is used as the prediction.
  • LLM: If the symbolic solver cannot execute the generated logic program, the prediction falls back to the CoT baseline. This mode requires the corresponding baseline LLM results to be stored in ./baselines/results. To keep inference efficient, the stored baseline results are simply loaded and used as the prediction whenever the symbolic solver fails.
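
The two backup strategies above can be sketched as a single fallback function. This is an illustration under stated assumptions, not the code in models/logic_inference.py: the name `predict_with_backup`, the use of an exception to signal solver failure, and the `cot_answers` mapping from example id to stored CoT answer are all hypothetical.

```python
import random


def predict_with_backup(run_solver, example_id, options, strategy,
                        cot_answers=None, rng=random):
    """Return the solver's answer, or a backup answer if the solver fails.

    run_solver  -- zero-argument callable that executes the logic program
                   (assumed to raise an exception when execution fails)
    options     -- candidate answer labels for this example
    strategy    -- "random" or "LLM"
    cot_answers -- dict of stored CoT baseline answers keyed by example id
                   (assumed format; required for the "LLM" strategy)
    """
    try:
        return run_solver()
    except Exception:
        pass  # Solver could not execute the program; use the backup below.

    if strategy == "random":
        return rng.choice(options)
    if strategy == "LLM":
        if cot_answers is None or example_id not in cot_answers:
            raise KeyError(f"no stored CoT answer for example {example_id!r}")
        return cot_answers[example_id]
    raise ValueError("strategy must be 'random' or 'LLM'")
```

Because the LLM branch only reads from `cot_answers`, no model call happens at inference time, which matches the description of loading precomputed baseline results.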

Evaluation

To evaluate the logic reasoning results, please run the following commands:

python models/evaluation.py \
    --dataset_name "Dataset Name [ProntoQA | ProofWriter | FOLIO | LogicalDeduction]" \
    --model_name "Model that generated the logic programs [text-davinci-003 | gpt-4]" \
    --split dev \
    --backup "The basic mode (random) or CoT-Logic collaboration mode (LLM)"
