-
Notifications
You must be signed in to change notification settings - Fork 6
feat: Speed up R builds #67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
📝 WalkthroughWalkthroughTwo Dockerfiles updated to change how R packages are installed. ir/ir-base/Dockerfile.ir now installs IRkernel from the Posit Package Manager CRAN mirror and calls IRkernel::installspec(). ir/ir-with-libs/Dockerfile.ir replaces many per-package R install commands with environment variables CRAN_REPO and R_PACKAGES and a single R -e invocation to install all packages, and consolidates apt-get into a single update/install/clean sequence while removing explicit per-package installation blocks. 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@ir/ir-with-libs/Dockerfile.ir`:
- Around line 66-78: The Dockerfile removes Java runtime when purging
${BUILDDEPS}, breaking rJava at runtime; ensure a JRE package remains by
installing default-jre-headless separately and not removing it: add
default-jre-headless to the apt-get install step (outside of ${BUILDDEPS}) or
install it in the same RUN before the apt-get remove --purge -y ${BUILDDEPS}
line, and keep the apt-get remove call targeting only ${BUILDDEPS} so
default-jre-headless (needed by rJava) is preserved.
| RUN apt-get update \ | ||
| && apt-get install -y ${BUILDDEPS} \ | ||
| && R -e " \ | ||
| options(Ncpus = 4); \ | ||
| install.packages( \ | ||
| strsplit(Sys.getenv('R_PACKAGES'), '\\\\s+')[[1]], \ | ||
| repos = Sys.getenv('CRAN_REPO'), \ | ||
| dependencies = TRUE \ | ||
| ) \ | ||
| " \ | ||
| && apt-get clean \ | ||
| && apt-get remove --purge -y ${BUILDDEPS} \ | ||
| && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
# Find and read the Dockerfile
find . -name "Dockerfile.ir" -path "*ir-with-libs*" | head -5Repository: deepnote/environments
Length of output: 96
🏁 Script executed:
# Read the entire Dockerfile to understand the structure
cat "ir/ir-with-libs/Dockerfile.ir"Repository: deepnote/environments
Length of output: 1544
rJava will fail at runtime after BUILDDEPS is purged
rJava requires a Java runtime, but default-jdk (the only Java package installed) gets removed. Install default-jre-headless separately and only purge build dependencies.
🔧 Suggested fix
+ENV RUNTIME_DEPS="default-jre-headless"
RUN apt-get update \
- && apt-get install -y ${BUILDDEPS} \
+ && apt-get install -y ${RUNTIME_DEPS} ${BUILDDEPS} \
&& R -e " \
options(Ncpus = 4); \
install.packages( \
strsplit(Sys.getenv('R_PACKAGES'), '\\\\s+')[[1]], \
repos = Sys.getenv('CRAN_REPO'), \
dependencies = TRUE \
) \
" \
&& apt-get clean \
&& apt-get remove --purge -y ${BUILDDEPS} \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| RUN apt-get update \ | |
| && apt-get install -y ${BUILDDEPS} \ | |
| && R -e " \ | |
| options(Ncpus = 4); \ | |
| install.packages( \ | |
| strsplit(Sys.getenv('R_PACKAGES'), '\\\\s+')[[1]], \ | |
| repos = Sys.getenv('CRAN_REPO'), \ | |
| dependencies = TRUE \ | |
| ) \ | |
| " \ | |
| && apt-get clean \ | |
| && apt-get remove --purge -y ${BUILDDEPS} \ | |
| && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* | |
| ENV RUNTIME_DEPS="default-jre-headless" | |
| RUN apt-get update \ | |
| && apt-get install -y ${RUNTIME_DEPS} ${BUILDDEPS} \ | |
| && R -e " \ | |
| options(Ncpus = 4); \ | |
| install.packages( \ | |
| strsplit(Sys.getenv('R_PACKAGES'), '\\\\s+')[[1]], \ | |
| repos = Sys.getenv('CRAN_REPO'), \ | |
| dependencies = TRUE \ | |
| ) \ | |
| " \ | |
| && apt-get clean \ | |
| && apt-get remove --purge -y ${BUILDDEPS} \ | |
| && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* |
🧰 Tools
🪛 Checkov (3.2.334)
[low] 1-78: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[low] 1-78: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🤖 Prompt for AI Agents
In `@ir/ir-with-libs/Dockerfile.ir` around lines 66 - 78, The Dockerfile removes
Java runtime when purging ${BUILDDEPS}, breaking rJava at runtime; ensure a JRE
package remains by installing default-jre-headless separately and not removing
it: add default-jre-headless to the apt-get install step (outside of
${BUILDDEPS}) or install it in the same RUN before the apt-get remove --purge -y
${BUILDDEPS} line, and keep the apt-get remove call targeting only ${BUILDDEPS}
so default-jre-headless (needed by rJava) is preserved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this bug must have been there forever as I didn't change the build logic, only refactored the RUN directives a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the git history of the file to see the previous version
git log --oneline -n 10 -- ir/ir-with-libs/Dockerfile.ir
# Show the version before this PR
git show HEAD:ir/ir-with-libs/Dockerfile.ir | grep -A 20 "apt-get remove"
Length of output: 487
@m1so You're right. The git history confirms apt-get remove --purge -y ${BUILDDEPS} was already there before your changes. Your refactoring consolidated the R package installation but didn't alter the cleanup logic.
Worth fixing now that it's surfaced though. Do you want to address it in this PR or track separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@ir/ir-with-libs/Dockerfile.ir`:
- Around line 50-52: The CRAN repo is set to a moving "latest" snapshot via the
ENV CRAN_REPO variable; replace that value with a dated Posit Package Manager
snapshot URL to pin binaries for reproducible builds (update the ENV CRAN_REPO
entry in Dockerfile.ir to a specific snapshot date, e.g. use the Posit snapshot
path with a YYYY-MM-DD segment instead of "latest"). Ensure the ENV CRAN_REPO
assignment is updated wherever it appears so builds use the fixed dated
snapshot.
♻️ Duplicate comments (1)
ir/ir-with-libs/Dockerfile.ir (1)
66-81: rJava will miss a runtime Java after purge.
default-jdkis removed with${BUILDDEPS}, leaving no JRE forrJava. Keep a runtime JRE outside the purge.🔧 Minimal fix
+ENV RUNTIME_DEPS="default-jre-headless" RUN apt-get update \ - && apt-get install -y ${BUILDDEPS} \ + && apt-get install -y ${RUNTIME_DEPS} ${BUILDDEPS} \ && R -e " \ options( \ Ncpus = 4, \ HTTPUserAgent = sprintf('R/%s R (%s)', getRversion(), paste(getRversion(), R.version['platform'], R.version['arch'], R.version['os'])) \ ); \ install.packages( \ strsplit(Sys.getenv('R_PACKAGES'), '\\\\s+')[[1]], \ repos = Sys.getenv('CRAN_REPO'), \ dependencies = TRUE \ ) \ " \ && apt-get clean \ && apt-get remove --purge -y ${BUILDDEPS} \ && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
| # Use Posit Package Manager for precompiled Linux binaries | ||
| ENV CRAN_REPO="https://packagemanager.posit.co/cran/__linux__/bullseye/latest" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick | 🔵 Trivial
Pin CRAN snapshot for reproducible builds.
Using latest can shift binaries over time. Prefer a dated snapshot URL to keep images deterministic.
♻️ Example update
-ENV CRAN_REPO="https://packagemanager.posit.co/cran/__linux__/bullseye/latest"
+ENV CRAN_REPO="https://packagemanager.posit.co/cran/__linux__/bullseye/2026-01-23"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Use Posit Package Manager for precompiled Linux binaries | |
| ENV CRAN_REPO="https://packagemanager.posit.co/cran/__linux__/bullseye/latest" | |
| # Use Posit Package Manager for precompiled Linux binaries | |
| ENV CRAN_REPO="https://packagemanager.posit.co/cran/__linux__/bullseye/2026-01-23" | |
🤖 Prompt for AI Agents
In `@ir/ir-with-libs/Dockerfile.ir` around lines 50 - 52, The CRAN repo is set to
a moving "latest" snapshot via the ENV CRAN_REPO variable; replace that value
with a dated Posit Package Manager snapshot URL to pin binaries for reproducible
builds (update the ENV CRAN_REPO entry in Dockerfile.ir to a specific snapshot
date, e.g. use the Posit snapshot path with a YYYY-MM-DD segment instead of
"latest"). Ensure the ENV CRAN_REPO assignment is updated wherever it appears so
builds use the fixed dated snapshot.
Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.