Dear author,

I see that when evaluating on the two benchmarks (BugsInPy and TypeBugs), you call `gen_test_script('prompt_patches/bugsinpy/correctness_failed_cases.json', split=5, benchmark='bugsinpy')` in the evaluate.py script to generate .sh files, which are then executed inside PyTER's Docker container. However, there is a piece of logic I don't understand: how are the generated fixed patches substituted into the original buggy function? I can't find where this part is implemented. After following gen_test_script, it only generates some .sh scripts in the folder. So I'm curious how the step of embedding the fixed patch into the buggy function is accomplished.