Pre-requisites:
- SWI Prolog 8.3.19 or later
- boost 1.85 (other versions likely work)
- cmake
- ninja
Using EmCee with Pharos requires slight modifications. These modifications can
be found in the paper-2022-start tag. The
modified version, which fixes any discovered problems, can be found in the
paper-2022-virtanalyzer-end
tag. Clone one of these tags to a directory $PHAROS_DIR.
To compile EmCee, use cmake:
cmake -DPHAROS_DIR=$PHAROS_DIR -GNinja .
ninja
To compile with AFL:
cmake -DCMAKE_C_COMPILER=afl-gcc -DCMAKE_CXX_COMPILER=afl-g++ -DPHAROS_DIR=$PHAROS_DIR -GNinja .
env AFL_CXX=/path/to/g++-11 AFL_CC=/path/to/gcc-11 ninja
sourcecontains EmCee's front-endlayoutcontains the logic for computing class and data layoutcompilerhas logic for compiler-defined objects including tables and automatic methods (constructors, etc.)factsdeals with fact generationbest-testcasescontains the final testcases for the paper
./driver --help will display the available command-line options.
Run ./driver and follow the prompts.
Pick an input file (e.g., from the best-testcases directory).
Then, for example run:
./driver -v -c p.cc -p p.pl -l p.json < best-testcases/id\:000170\,time\:0\,execs\:0\,orig\:id\:006363\,sync\:SESSION003\,src\:006346 > emcee.log 2>&1
EmCee generates a lot of debug spew, so we generally look at the produced artifacts.
For example, p.cc is the C++ layout validation driver. It can be compiled
(with MSVC) and run to verify that the layout computed by EmCee agrees with
MSVC. It also contains comments showing EmCee's computed layout.
p.pl is the Prolog stub for testing OOAnalyzer and VirtAnalyzer rules. It's actually run automatically when invoking ./driver, so you can see the results in the emcee.log file:
cat emcee.log | fgrep RESULT, | head -n5
RESULT,TRUE,sound,CertainConstructorOrDestructor,"CertainConstructorOrDestructor(function(6))",""
RESULT,TRUE,sound,CertainConstructorOrDestructor,"CertainConstructorOrDestructor(function(7))",""
RESULT,TRUE,sound,CertainConstructorOrDestructor,"CertainConstructorOrDestructor(function(8))",""
RESULT,TRUE,sound,CertainConstructorOrDestructor,"CertainConstructorOrDestructor(function(10))",""
RESULT,TRUE,sound,CertainConstructorOrDestructor,"CertainConstructorOrDestructor(function(12))",""
For example, the first line indicates that the rule CertainConstructorOrDestructor is sound for function(6).
These files may be missing if there was insufficient input data to fully
instantiate a compilation. This is true of some of the files in
best-testcases because they came from AFL.
We used AFL++ with afl-utils.
Specifically, we used the scripts/fuzz.bash script, which uses afl-multicore
and the driver.conf and driver-v.conf scripts. After fuzzing, we use
afl-cmin and move the testcases into best-testcases.
After new fuzzing testcases are moved to best-testcases, we use
./scripts/prolog.bash from the best-testcases directory to compile each
testcase and test for soundness and completeness. By default, this produces a
CSV file in /tmp/prolog.csv.bz2 which one can search to identify problems.