nmehran/hash-env-docker

HashEnv: Performance Testing Environment

This project provides a Dockerized environment for benchmarking C and C++ hash table libraries. A single cross-platform orchestration script builds the image and runs the benchmarks, which keeps setup simple and results stable and reproducible.


Key Features

  • Simple Orchestration: A single script (run_hash_env.sh) builds the image and runs the container, with several CPU pinning options.
  • Cross-Platform: The orchestration script works consistently on Linux, macOS, and Windows.
  • Reproducible Results: The containerized environment ensures all benchmarks run with the same OS, compiler, and library versions for reliable comparisons.
  • Broad Comparison: Natively supports a diverse set of libraries, including Boost, Folly, LLVM, Qt, and other popular implementations.
  • Extensible Framework: Easily add new libraries or custom benchmarks to the environment.

Environment Details

  • Docker Image Name: hash-env-image
  • Benchmark Project Directory: ./benchmarks (on host) and /opt/benchmarks (in container)
  • OS: Debian 12 (Bookworm), based on the official gcc:13.4.0 image
  • Build Tools: CMake, Ninja, and the necessary compilers

This README.md guides you through the recommended workflow using the orchestration script, as well as advanced manual Docker commands.


Section A

Running Benchmarks (Recommended Method)

================================================================================

The run_hash_env.sh script is the primary entry point. It builds the image, starts the container, and applies CPU pinning so that benchmark runs are stable. Please follow the instructions for your operating system below.

Platform-Specific Instructions

On Linux and macOS

Your system is ready to go. The default Terminal application is fully compatible.

  1. Open your Terminal.
  2. Navigate (cd) to the project directory.
  3. Run the commands as shown in the examples below.

On Windows (with WSL 2)

Windows users must use the Windows Subsystem for Linux (WSL) 2 backend for Docker Desktop. This provides a native Linux environment for seamless container execution and is the supported method for this project.

  1. Set Up Your Environment: Before you can run the benchmarks, your system must be configured correctly. Please follow the Windows Setup Instructions first.

  2. Open PowerShell: Once your environment is configured, all commands should be run as administrator from a PowerShell terminal (not Git Bash or the legacy Command Prompt).

  3. Run Commands with wsl: Navigate to the project directory and run the benchmark script by prefixing the command with wsl. This ensures the script executes inside the Linux environment.

    # Example of how to run the script correctly on Windows
    wsl ./run_hash_env.sh

This approach is more robust and ensures a true Linux environment for the benchmarks, avoiding common issues with pathing and volume mounting associated with compatibility layers like Git Bash.

Common Usage Examples

After following the platform-specific instructions above, run these commands in your terminal (prefix with wsl on Windows).

  1. Build the image and run the benchmark suite, reserving 2 cores for the system (the rest for the benchmark): This is the most common use case for getting stable results.

    ./run_hash_env.sh --build --system-cores 2 ./run_benchmarks.sh
  2. Build with Transparent Huge Pages (THP) enabled: To test the performance impact of THP, enable it during the build.

    ./run_hash_env.sh --build --enable-thp --system-cores 2 ./run_benchmarks.sh
  3. Launch an interactive shell for development: Use this to get a bash prompt inside the container to compile or debug manually.

    # Build the image first if it doesn't exist
    ./run_hash_env.sh --build
    
    # Then, enter the container
    ./run_hash_env.sh
  4. Run the benchmark with specific arguments passed to the C++ executable: Use -- to separate the script's arguments from the arguments for the command inside the container.

    # Run the benchmark on the last 8 cores, and tell the C++ program to use 8 threads (example of a pass-through argument)
    ./run_hash_env.sh --last-cores 8 ./run_benchmarks.sh -- --threads 8
  5. Run the built-in benchmarks without a local mount (CI use case): This tests the exact code that was COPY'd into the image during the build process.

    ./run_hash_env.sh --build --no-bind --system-cores 2 ./run_benchmarks.sh

General Options

  • --build: Build the Docker image before running.
  • --enable-thp: Enable Transparent Huge Pages (THP) during the image build. Requires --build.
  • --no-bind: Do not mount the local benchmark directory. Runs the code baked into the image instead (ideal for CI or verifying the final image).
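
To confirm which THP mode the kernel is actually using, you can read the standard Linux sysfs entry. The sketch below is an illustration only: this README does not describe how --enable-thp works internally, and thp_active_mode is a hypothetical helper name. Because containers share the host kernel, the value is the same on the host and inside the container.

```shell
#!/bin/sh
# Hypothetical helper (not part of run_hash_env.sh).
# The kernel reports THP modes as e.g. "always [madvise] never",
# with the active mode in square brackets. Print only the active mode.
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

thp_file=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
    echo "THP mode: $(thp_active_mode "$(cat "$thp_file")")"
else
    echo "THP status file not found (non-Linux host, or THP not available)"
fi
```

Recording this value alongside benchmark results makes it easier to compare runs taken with and without THP.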

CPU Pinning Options

These flags are mutually exclusive. Choose the one that best fits your goal.

  • --system-cores <N>: (Recommended) Reserves the first N cores for the OS, giving the rest to the container. Ideal for stability.
  • --last-cores <N>: Pins the container to the last N cores.
  • --cores <N>: Pins the container to the first N cores.
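
To make the three flags concrete, the sketch below maps each one to a CPU list of the kind accepted by Docker's --cpuset-cpus option. It assumes the pinning is ultimately expressed that way and that cores are numbered from 0; cpuset_for is a hypothetical helper, not code taken from run_hash_env.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of run_hash_env.sh): map a pinning mode and
# core count to a CPU list suitable for `docker run --cpuset-cpus`.
# Usage: cpuset_for <system-cores|last-cores|cores> <N> <total-cores>
cpuset_for() {
    mode=$1 n=$2 total=$3
    # Accept only positive integers without leading zeros.
    for v in "$n" "$total"; do
        case $v in
            ''|0*|*[!0-9]*) echo "invalid core count: $v" >&2; return 1 ;;
        esac
    done
    if [ "$n" -gt "$total" ]; then
        echo "requested $n cores but only $total exist" >&2
        return 1
    fi
    case $mode in
        system-cores)
            # Cores 0..N-1 stay with the OS; the container gets N..total-1.
            if [ "$n" -eq "$total" ]; then
                echo "no cores left for the container" >&2
                return 1
            fi
            echo "$n-$((total - 1))" ;;
        last-cores)
            echo "$((total - n))-$((total - 1))" ;;
        cores)
            echo "0-$((n - 1))" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
}

# Example on a hypothetical 8-core host:
cpuset_for system-cores 2 8   # prints 2-7
cpuset_for last-cores 3 8     # prints 5-7
cpuset_for cores 4 8          # prints 0-3
```

With manual Docker usage, a list such as 2-7 could be passed directly, for example: docker run --rm -it --privileged --cpuset-cpus "2-7" hash-env-image.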

Section B

Advanced / Manual Docker Usage

================================================================================

For developers who prefer direct Docker commands or for specific CI/CD scenarios, the following manual steps are provided.

1. Build the Docker Image

Build the image once, or whenever the Dockerfile or its associated setup scripts change.

  docker build -t hash-env-image .

The initial build can be lengthy. Subsequent builds are faster thanks to Docker's layer caching.

2. Run One-Off Commands (Isolated Runs)

These commands create, run, and then automatically remove a container.

  • Run Default Benchmark Suite (/opt/benchmarks/run_benchmark.sh):

    docker run --rm -it --privileged hash-env-image

    The image's default CMD executes /opt/benchmarks/run_benchmark.sh.

  • Execute a Specific Script or Custom Command:

    docker run --rm -it --privileged -w /opt/benchmarks hash-env-image ./your_script.sh --arg

3. Create a Persistent Development Environment

For active development on the benchmark code (./benchmarks), a persistent container is ideal as it saves compiled objects and state between sessions. We'll name it hash-env-dev by convention.

  1. Create & Start Development Container (once):

    docker run -it --privileged --name hash-env-dev \
        -v "$(pwd)/benchmarks:/opt/benchmarks:rw" \
        -w /opt/benchmarks \
        hash-env-image bash
    • -v "$(pwd)/benchmarks:/opt/benchmarks:rw": Crucial for development. Mounts your local ./benchmarks into the container. Changes you make on your host are live inside the container.
    • This starts an interactive Bash shell. Type exit to stop the container.
  2. Re-enter an Existing Development Container: If hash-env-dev is stopped, first start it with docker start hash-env-dev. Then open a shell in it:

    docker exec -it hash-env-dev bash

Updating the Environment vs. Benchmark Code

  • Changes to ./benchmarks (Your Benchmark Code): If using the development environment from Section B.3 (with the bind mount), changes are live. No image rebuild needed. Just re-run your build/benchmark commands inside the hash-env-dev container.

  • Changes to Dockerfile or Base Libraries (Image Layers):

    1. Rebuild the image: docker build -t hash-env-image .
    2. If you have a persistent development container (e.g., hash-env-dev), you must remove and recreate it from the new image to see the changes:
      docker stop hash-env-dev && docker rm hash-env-dev
      # Then re-run the command from B.3.
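
The rebuild-and-recreate steps above can be collected into one small script. This is a sketch rather than part of the project: the run and refresh_dev_env helpers and the DRY_RUN variable are hypothetical, and the Docker commands are the ones already shown in this section. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch: rebuild the image and recreate the dev container from it.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN=${DRY_RUN:-1}
IMAGE=hash-env-image
CONTAINER=hash-env-dev

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_dev_env() {
    # Stop here if the build fails, so the old container is kept.
    run docker build -t "$IMAGE" . || return 1
    # `docker rm -f` stops the container if it is running, then removes it.
    run docker rm -f "$CONTAINER"
    run docker run -it --privileged --name "$CONTAINER" \
        -v "$(pwd)/benchmarks:/opt/benchmarks:rw" \
        -w /opt/benchmarks \
        "$IMAGE" bash
}

refresh_dev_env
```

Run it from the project directory so that $(pwd)/benchmarks resolves to your local benchmark code.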

Common Docker Commands

================================================================================

  • docker ps [-a]: List running [all] containers.
  • docker logs <container>: View container logs.
  • docker stop <container>: Stop a running container.
  • docker rm <container>: Remove a stopped container.
  • docker images: List images.
  • docker rmi <image>: Remove an image.
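  • docker system prune [-a]: Remove stopped containers, unused networks, dangling images, and build cache. With -a, also remove all unused images, not just dangling ones. Volumes are removed only when --volumes is added. Use cautiously.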
