π‘οΈ Sentinel: [CRITICAL] Fix RCE and DoS vulnerabilities in PDF compilation#212
π‘οΈ Sentinel: [CRITICAL] Fix RCE and DoS vulnerabilities in PDF compilation#212
Conversation
Co-authored-by: anchapin <6326294+anchapin@users.noreply.github.com>
|
π Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a π emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
Reviewer's GuideHardened PDF compilation against RCE and DoS by disabling LaTeX shell escapes for both pdflatex and pandoc, and adding timeouts plus error handling for longβrunning or hanging subprocesses, with the changes documented in the security log. Sequence diagram for secure PDF compilation in CoverLetterGeneratorsequenceDiagram
participant CoverLetterGenerator
participant Subprocess
participant Pdflatex
participant Pandoc
participant FileSystem
CoverLetterGenerator->>Subprocess: Popen [pdflatex -interaction=nonstopmode -no-shell-escape tex_path]
activate Subprocess
Subprocess->>Pdflatex: start compilation
Pdflatex-->>Subprocess: running
Subprocess-->>CoverLetterGenerator: communicate(timeout=30)
alt pdflatex finishes in time
Pdflatex-->>Subprocess: exit with returncode
Subprocess-->>CoverLetterGenerator: stdout, stderr
CoverLetterGenerator->>FileSystem: check output_path.exists
alt success or output_path exists
FileSystem-->>CoverLetterGenerator: pdf exists
CoverLetterGenerator-->>CoverLetterGenerator: pdf_created = True
else failure
FileSystem-->>CoverLetterGenerator: pdf missing
CoverLetterGenerator-->>CoverLetterGenerator: raise CalledProcessError or handle nonzero
end
else pdflatex hangs or infinite loop
Subprocess-->>CoverLetterGenerator: TimeoutExpired
CoverLetterGenerator->>Subprocess: kill
Subprocess-->>CoverLetterGenerator: communicate() cleanup
CoverLetterGenerator-->>CoverLetterGenerator: raise RuntimeError
end
Note over CoverLetterGenerator: Exceptions include CalledProcessError, FileNotFoundError, RuntimeError
alt first attempt fails or times out
CoverLetterGenerator->>Subprocess: Popen [pandoc ... --pdf-engine=xelatex --pdf-engine-opt=-no-shell-escape]
activate Subprocess
Subprocess->>Pandoc: start compilation
Pandoc-->>Subprocess: running
Subprocess-->>CoverLetterGenerator: communicate(timeout=30)
alt pandoc finishes in time
Pandoc-->>Subprocess: exit with returncode
Subprocess-->>CoverLetterGenerator: stdout, stderr
CoverLetterGenerator->>FileSystem: check output_path.exists
alt success or output_path exists
FileSystem-->>CoverLetterGenerator: pdf exists
CoverLetterGenerator-->>CoverLetterGenerator: pdf_created = True
else failure
FileSystem-->>CoverLetterGenerator: pdf missing
CoverLetterGenerator-->>CoverLetterGenerator: pdf_created remains False
end
else pandoc hangs or infinite loop
Subprocess-->>CoverLetterGenerator: TimeoutExpired
CoverLetterGenerator->>Subprocess: kill
Subprocess-->>CoverLetterGenerator: communicate() cleanup
CoverLetterGenerator-->>CoverLetterGenerator: raise RuntimeError
end
else first attempt succeeded
CoverLetterGenerator-->>CoverLetterGenerator: skip pandoc fallback
end
Sequence diagram for secure PDF compilation helpers in convertersequenceDiagram
participant PdfConverter
participant Subprocess
participant Pdflatex
participant Pandoc
participant FileSystem
rect rgb(230, 230, 255)
PdfConverter->>Subprocess: Popen [pdflatex -interaction=nonstopmode -no-shell-escape tex_path]
activate Subprocess
Subprocess->>Pdflatex: start compilation
Subprocess-->>PdfConverter: communicate(timeout=30)
alt pdflatex finishes in time
Pdflatex-->>Subprocess: exit with returncode
Subprocess-->>PdfConverter: stdout, stderr
PdfConverter->>FileSystem: check output_path.exists
alt success or output_path exists
FileSystem-->>PdfConverter: pdf exists
PdfConverter-->>PdfConverter: return True
else failure
FileSystem-->>PdfConverter: pdf missing
PdfConverter-->>PdfConverter: fall through to False
end
else pdflatex hangs or infinite loop
Subprocess-->>PdfConverter: TimeoutExpired
PdfConverter->>Subprocess: kill
Subprocess-->>PdfConverter: communicate() cleanup
PdfConverter-->>PdfConverter: raise RuntimeError
end
PdfConverter-->>PdfConverter: catch CalledProcessError, FileNotFoundError, RuntimeError
PdfConverter->>FileSystem: check output_path.exists
alt pdf exists despite error
FileSystem-->>PdfConverter: pdf exists
PdfConverter-->>PdfConverter: return True
else no pdf
FileSystem-->>PdfConverter: pdf missing
PdfConverter-->>PdfConverter: return False
end
end
rect rgb(230, 255, 230)
PdfConverter->>Subprocess: Popen [pandoc ... --pdf-engine=xelatex --pdf-engine-opt=-no-shell-escape]
activate Subprocess
Subprocess->>Pandoc: start compilation
Subprocess-->>PdfConverter: communicate(timeout=30)
alt pandoc finishes in time
Pandoc-->>Subprocess: exit with returncode
Subprocess-->>PdfConverter: stdout, stderr
PdfConverter->>FileSystem: check output_path.exists
alt success or output_path exists
FileSystem-->>PdfConverter: pdf exists
PdfConverter-->>PdfConverter: return True
else failure
FileSystem-->>PdfConverter: pdf missing
PdfConverter-->>PdfConverter: return False
end
else pandoc hangs or infinite loop
Subprocess-->>PdfConverter: TimeoutExpired
PdfConverter->>Subprocess: kill
Subprocess-->>PdfConverter: communicate() cleanup
PdfConverter-->>PdfConverter: raise RuntimeError
end
PdfConverter-->>PdfConverter: catch CalledProcessError, FileNotFoundError, RuntimeError
PdfConverter-->>PdfConverter: return False
end
Flow diagram for subprocess timeout and cleanup logic in PDF compilationflowchart TD
A[Start PDF compilation subprocess] --> B[Call process.communicate with timeout 30]
B --> C{Did communicate finish before timeout}
C -->|Yes| D[Read stdout and stderr]
D --> E[Check process.returncode and output_path.exists]
E --> F{PDF created or returncode 0}
F -->|Yes| G[Set pdf_created True or return True]
F -->|No| H[Handle failure and possibly try fallback]
C -->|No, TimeoutExpired| I[Catch TimeoutExpired]
I --> J[Call process.kill]
J --> K[Call process.communicate again for cleanup]
K --> L[Raise RuntimeError PDF compilation timed out]
L --> M[Catch RuntimeError in caller]
M --> N[Mark attempt failed and optionally check for fallback or existing PDF]
N --> O[End]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The timeout/kill/communicate pattern is duplicated in all four subprocess calls; consider extracting a small helper (e.g.,
run_with_timeout(cmd, cwd=None, timeout=30)) to keep the behavior consistent and reduce maintenance overhead. - Catching
RuntimeErroralongsideCalledProcessErrorandFileNotFoundErrormakes it less clear that this path is specifically handling timeouts; a dedicated exception type (or a more descriptive name) for the timeout case would make the control flow easier to understand and safer if otherRuntimeErrors are introduced later.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The timeout/kill/communicate pattern is duplicated in all four subprocess calls; consider extracting a small helper (e.g., `run_with_timeout(cmd, cwd=None, timeout=30)`) to keep the behavior consistent and reduce maintenance overhead.
- Catching `RuntimeError` alongside `CalledProcessError` and `FileNotFoundError` makes it less clear that this path is specifically handling timeouts; a dedicated exception type (or a more descriptive name) for the timeout case would make the control flow easier to understand and safer if other `RuntimeError`s are introduced later.
## Individual Comments
### Comment 1
<location path="cli/generators/cover_letter_generator.py" line_range="787" />
<code_context>
if process.returncode == 0 or output_path.exists():
pdf_created = True
- except (subprocess.CalledProcessError, FileNotFoundError):
+ except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError):
# Check if PDF was created anyway
if output_path.exists():
</code_context>
<issue_to_address>
**suggestion:** The `CalledProcessError` in this `except` is never raised by the current usage of `subprocess.Popen`.
Because this code uses `Popen` + `communicate()` rather than helpers like `check_call` or `run(check=True)`, `CalledProcessError` is never raised here. Please remove it from the `except` tuple so the handled failure modes reflect what can actually occur.
```suggestion
except (FileNotFoundError, RuntimeError):
```
</issue_to_address>Help me be more useful! Please click π or π on each comment and I'll use the feedback to improve your reviews.
| if process.returncode == 0 or output_path.exists(): | ||
| pdf_created = True | ||
| except (subprocess.CalledProcessError, FileNotFoundError): | ||
| except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError): |
There was a problem hiding this comment.
suggestion: The CalledProcessError in this except is never raised by the current usage of subprocess.Popen.
Because this code uses Popen + communicate() rather than helpers like check_call or run(check=True), CalledProcessError is never raised here. Please remove it from the except tuple so the handled failure modes reflect what can actually occur.
| except (subprocess.CalledProcessError, FileNotFoundError, RuntimeError): | |
| except (FileNotFoundError, RuntimeError): |
π¨ Severity: CRITICAL
π‘ Vulnerability: Remote Code Execution (RCE) via
\write18shell escape in PDF compilation and Denial of Service (DoS) via infinite loops. The-no-shell-escapeflag and atimeoutwere missing when executingpdflatexorpandocdirectly.π― Impact: If an attacker managed to inject malicious LaTeX code into the generated templates (e.g., via AI hallucination, intentional template overrides, or flaws in escaping logic), they could execute arbitrary shell commands on the server running the compilation. Runaway compilation could also exhaust server resources.
π§ Fix: Added the
-no-shell-escapeflag to bothpdflatexandpandoc(--pdf-engine-opt=-no-shell-escape) commands incli/pdf/converter.pyandcli/generators/cover_letter_generator.py. Added a 30-second timeout toprocess.communicate()calls and implemented proper cleanup forsubprocess.TimeoutExpiredexceptions.β Verification: Verified by reading the modified files, running the test suite (
pytest tests/test_pdf_security.pyandpytestfor the whole suite) and receiving code review approval. The new vulnerability finding has been recorded in.jules/sentinel.md.PR created automatically by Jules for task 15689215248428004751 started by @anchapin
Summary by Sourcery
Harden PDF compilation against remote code execution and denial-of-service by restricting LaTeX shell escapes and enforcing timeouts on external tools.
Bug Fixes:
Documentation: