Skip to content

πŸ›‘οΈ Sentinel: [CRITICAL] Fix RCE and DoS risks in PDF compilation#215

Open
anchapin wants to merge 1 commit intomainfrom
fix-pdf-compilation-rce-dos-8044524251336920935
Open

πŸ›‘οΈ Sentinel: [CRITICAL] Fix RCE and DoS risks in PDF compilation#215
anchapin wants to merge 1 commit intomainfrom
fix-pdf-compilation-rce-dos-8044524251336920935

Conversation

@anchapin
Copy link
Copy Markdown
Owner

@anchapin anchapin commented Mar 28, 2026

🚨 Severity: CRITICAL
πŸ’‘ Vulnerability: The CoverLetterGenerator and PDFConverter compiled LaTeX to PDF without enforcing the -no-shell-escape flag for pdflatex and pandoc. In addition, the blocking calls to subprocess.Popen.communicate() lacked a timeout parameter.
🎯 Impact: This oversight created two critical security risks:

  1. Remote Code Execution (RCE): Attackers could inject LaTeX commands like \write18 into templates to execute arbitrary shell commands on the host server.
  2. Denial of Service (DoS): Maliciously crafted or excessively large inputs could cause pdflatex or pandoc to hang indefinitely, tying up system resources and causing a DoS.
    πŸ”§ Fix:
  • Added "-no-shell-escape" to pdflatex commands.
  • Added "--pdf-engine-opt=-no-shell-escape" to pandoc commands.
  • Added a timeout=30 to process.communicate() calls using the standard Python pattern to safely catch subprocess.TimeoutExpired, kill the process, and clean up to prevent zombie processes.
    βœ… Verification: pytest tests/test_pdf_security.py passes. All tests run locally pass. Added an entry to .jules/sentinel.md capturing these learnings.

PR created automatically by Jules for task 8044524251336920935 started by @anchapin

Summary by Sourcery

Harden PDF compilation by disabling LaTeX shell escapes and enforcing timeouts on external compilation processes to mitigate RCE and DoS risks.

Enhancements:

  • Add explicit "-no-shell-escape" flags to pdflatex invocations and equivalent options to pandoc-based PDF generation in both the cover letter generator and generic PDF converter.
  • Introduce 30-second timeouts and robust timeout handling for all LaTeX/PDF subprocess.communicate calls to prevent hanging processes.

Documentation:

  • Record the new PDF compilation security learnings and mitigations in the Sentinel security notes.

Added `-no-shell-escape` flags for `pdflatex` and `pandoc` in `CoverLetterGenerator` and `PDFConverter`.
Added 30-second timeouts to all `subprocess.communicate()` calls for PDF generation to prevent hanging.
Appended learning to `.jules/sentinel.md`.

Co-authored-by: anchapin <6326294+anchapin@users.noreply.github.com>
@google-labs-jules
Copy link
Copy Markdown
Contributor

πŸ‘‹ Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a πŸ‘€ emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai
Copy link
Copy Markdown

sourcery-ai bot commented Mar 28, 2026

Reviewer's Guide

Hardened LaTeX→PDF compilation in the cover letter generator and PDF converter by forbidding shell escapes, adding timeouts and proper cleanup for subprocess invocations, and documenting the security issue in the Sentinel log.

Sequence diagram for pdflatex compilation with timeout and no_shell_escape

sequenceDiagram
    participant CoverLetterGenerator
    participant PDFConverter
    participant subprocess
    participant pdflatex

    rect rgb(230,230,250)
        CoverLetterGenerator->>subprocess: Popen([pdflatex, -interaction=nonstopmode, -no-shell-escape, tex_path.name])
        activate subprocess
        subprocess->>pdflatex: start process in working_dir
        activate pdflatex
        subprocess->>pdflatex: communicate(timeout=30)
        pdflatex-->>subprocess: stdout, stderr or long-running
        alt completes_before_timeout
            subprocess-->>CoverLetterGenerator: returncode, stdout, stderr
        else TimeoutExpired
            subprocess->>pdflatex: kill
            pdflatex-->>subprocess: process terminated
            subprocess-->>CoverLetterGenerator: TimeoutExpired handled, stdout, stderr
        end
        deactivate pdflatex
        deactivate subprocess
    end

    rect rgb(240,255,240)
        PDFConverter->>subprocess: Popen([pdflatex, -interaction=nonstopmode, -no-shell-escape, tex_path.name])
        activate subprocess
        subprocess->>pdflatex: start process in working_dir
        activate pdflatex
        subprocess->>pdflatex: communicate(timeout=30)
        pdflatex-->>subprocess: stdout, stderr or long-running
        alt completes_before_timeout
            subprocess-->>PDFConverter: returncode, stdout, stderr
        else TimeoutExpired
            subprocess->>pdflatex: kill
            pdflatex-->>subprocess: process terminated
            subprocess-->>PDFConverter: TimeoutExpired handled, stdout, stderr
        end
        deactivate pdflatex
        deactivate subprocess
    end
Loading

Sequence diagram for pandoc compilation with timeout and no_shell_escape

sequenceDiagram
    participant CoverLetterGenerator
    participant PDFConverter
    participant subprocess
    participant pandoc

    rect rgb(230,230,250)
        CoverLetterGenerator->>subprocess: Popen([pandoc, tex_path, -o, output_path, --pdf-engine=xelatex, --pdf-engine-opt=-no-shell-escape])
        activate subprocess
        subprocess->>pandoc: start process
        activate pandoc
        subprocess->>pandoc: communicate(timeout=30)
        pandoc-->>subprocess: stdout, stderr or long-running
        alt completes_before_timeout
            subprocess-->>CoverLetterGenerator: returncode, stdout, stderr
        else TimeoutExpired
            subprocess->>pandoc: kill
            pandoc-->>subprocess: process terminated
            subprocess-->>CoverLetterGenerator: TimeoutExpired handled, stdout, stderr
        end
        deactivate pandoc
        deactivate subprocess
    end

    rect rgb(240,255,240)
        PDFConverter->>subprocess: Popen([pandoc, tex_path, -o, output_path, --pdf-engine=xelatex, --pdf-engine-opt=-no-shell-escape])
        activate subprocess
        subprocess->>pandoc: start process
        activate pandoc
        subprocess->>pandoc: communicate(timeout=30)
        pandoc-->>subprocess: stdout, stderr or long-running
        alt completes_before_timeout
            subprocess-->>PDFConverter: returncode, stdout, stderr
        else TimeoutExpired
            subprocess->>pandoc: kill
            pandoc-->>subprocess: process terminated
            subprocess-->>PDFConverter: TimeoutExpired handled, stdout, stderr
        end
        deactivate pandoc
        deactivate subprocess
    end
Loading

File-Level Changes

Change Details Files
Secure pdflatex invocation in cover letter PDF compilation.
  • Add -no-shell-escape flag to pdflatex command used for LaTeX compilation in the cover letter generator
  • Wrap process.communicate with a 30-second timeout and handle subprocess.TimeoutExpired by killing and re-communicating with the process
  • Preserve existing success condition based on return code or output file existence
cli/generators/cover_letter_generator.py
Secure pandoc-based PDF fallback in cover letter generation.
  • Extend pandoc invocation to pass --pdf-engine-opt=-no-shell-escape to the xelatex engine
  • Add 30-second timeout handling around process.communicate, including killing the process on timeout and re-collecting output
  • Keep existing logic for determining successful PDF creation
cli/generators/cover_letter_generator.py
Harden pdflatex and pandoc compilation paths in the shared PDF converter.
  • Add -no-shell-escape to pdflatex invocations in the PDF converter
  • Add --pdf-engine-opt=-no-shell-escape to pandoc invocations in the PDF converter
  • Introduce 30-second timeouts with TimeoutExpired handling and process.kill() for both pdflatex and pandoc subprocesses, while preserving existing success checks
cli/pdf/converter.py
Document the RCE and DoS vulnerability and mitigation steps in Sentinel log.
  • Append a new critical incident entry describing the missing shell-escape restrictions and timeouts
  • Capture learnings about default pdflatex options and the necessity of timeouts for subprocess calls
  • Record prevention guidance mandating -no-shell-escape flags and robust timeout patterns going forward
.jules/sentinel.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The timeout/TimeoutExpired handling around subprocess.Popen(...).communicate() is duplicated in multiple places; consider extracting a small helper (e.g., run_with_timeout(cmd, cwd, timeout=30)) to keep the pattern consistent and easier to maintain.
  • The hard-coded timeout=30 is used in several locations; defining a single module-level constant (e.g., PDF_COMPILE_TIMEOUT_SECONDS) would make it easier to tune this behavior and keep the values in sync.
  • On timeout you currently kill the process but ignore the failure outcome; depending on how this is surfaced to callers, you might want to explicitly treat timeouts as a failed compilation (e.g., via a distinct return value or error message) so consumers can react appropriately.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The timeout/TimeoutExpired handling around `subprocess.Popen(...).communicate()` is duplicated in multiple places; consider extracting a small helper (e.g., `run_with_timeout(cmd, cwd, timeout=30)`) to keep the pattern consistent and easier to maintain.
- The hard-coded `timeout=30` is used in several locations; defining a single module-level constant (e.g., `PDF_COMPILE_TIMEOUT_SECONDS`) would make it easier to tune this behavior and keep the values in sync.
- On timeout you currently kill the process but ignore the failure outcome; depending on how this is surfaced to callers, you might want to explicitly treat timeouts as a failed compilation (e.g., via a distinct return value or error message) so consumers can react appropriately.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click πŸ‘ or πŸ‘Ž on each comment and I'll use the feedback to improve your reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant