
On how to evaluate on BugsInPy and TypeBugs #4

@QingyuanLi1211

Description


Dear author:
I see in your project that when evaluating on the two benchmarks, BugsInPy and TypeBugs, you call gen_test_script('prompt_patches/bugsinpy/correctness_failed_cases.json', split=5, benchmark='bugsinpy') in the evaluate.py script to generate .sh files, which are then run inside PyTER's Docker environment. But there is a piece of the logic I don't understand: how are the generated fixed patches substituted into the original buggy function? I don't see where this part is implemented. After following gen_test_script, it only produces some .sh scripts in a folder. So I'm curious how the step of embedding the fixed patch into the buggy function is accomplished.
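
For context, here is a minimal sketch of how patch-evaluation harnesses often perform this substitution step, assuming the generated .sh script (or a step inside the Docker container) copies the candidate file over the buggy one before re-running the failing tests. All names below are hypothetical and not taken from this repository:

```python
# Hypothetical sketch only; this repository's actual mechanism may differ.
import shutil
import subprocess

def apply_patch_and_test(patched_file: str, buggy_file: str, test_cmd: list[str]) -> bool:
    """Overwrite the buggy source file with the candidate patch, then re-run the tests."""
    shutil.copy(patched_file, buggy_file)   # embed the fixed patch into the buggy module
    result = subprocess.run(test_cmd)       # e.g. ["python", "-m", "pytest", "test_file.py"]
    return result.returncode == 0           # patch is plausible if the failing tests now pass
```

Is the replacement done along these lines somewhere in the generated scripts, or is it handled elsewhere?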
