Quickly set up 'SpaDES' project directories, get modules, and deal with a number of issues related to reproducibility and reusability. This package was designed with a PERFICT approach in mind (see McIntire et al. 2022).
Achieving PERFICT in your projects can be challenging, but it becomes more accessible with the use of SpaDES. The SpaDES.project tool is designed to streamline the setup of SpaDES projects. Complementing the R scripting language, there are three valuable tools at your disposal:
- Integrated Development Environment (IDE): This IDE encourages the organization of your work into projects, enhancing your workflow efficiency.
- Modular Coding Approach: Embrace a modular approach to coding, allowing for easier maintenance and collaboration.
- Version Control (Optional): For those interested in code development, SpaDES supports version control through platforms like GitHub, giving you control over project history and collaboration.
Our preferred toolset includes Posit (specifically, RStudio), SpaDES, and GitHub. While these are our recommendations, there are alternative options available. We acknowledge that not all users are familiar with Git, and that's perfectly acceptable. SpaDES.project has been designed to cater to all users, whether they choose to use a Git-controlled project or not.
Beyond these tools, our extensive experience in managing projects with diverse developers, operating systems, users, data sources, and packages has revealed various challenges. These challenges arise due to the nature of open, modular, and interoperable projects. As project complexity increases, typical reproducible workflows may falter. Issues include:
- Variations in .Rprofile files.
- Non-transferable file paths.
- Incompatibilities between packages on different operating systems.
- Conflicts between package versions.
- Problems with the order of package loading and installation (e.g., inability to install a different version of a package while it's already loaded).
- Spaghetti code, where objects are defined in one file and used in another.
- Differences in users' familiarity with GitHub.
- The presence of cryptic code and objects ("just run that line, don't worry about what it does").
- Objects defined by a user lingering in the .GlobalEnv, leading to undetected issues.
- Varying competencies among different users.
Given these complexities, it's not enough to create a "reproducible" script; it must be a "reusable" script that functions flawlessly on any machine, operating system, and for any user.
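As one illustration of the "non-transferable file paths" issue listed above, here is a minimal sketch in base R. It is not part of SpaDES.project; the folder name myProject, the file name studyArea.csv, and the user name in the commented-out path are hypothetical, and tempdir() is used only so the sketch runs on any machine.

```r
# Hypothetical project root; tempdir() stands in for wherever the project lives.
projectRoot <- file.path(tempdir(), "myProject")
dir.create(file.path(projectRoot, "inputs"), recursive = TRUE, showWarnings = FALSE)

# Non-transferable: an absolute path that exists on one person's machine only.
# inputFile <- "C:/Users/alice/Documents/myProject/inputs/studyArea.csv"

# More transferable: build every path from the project root with file.path(),
# which also avoids hard-coding a path separator.
inputFile <- file.path(projectRoot, "inputs", "studyArea.csv")

# Write and read back a tiny table to show the path works end to end.
write.csv(data.frame(id = 1:3), inputFile, row.names = FALSE)
studyArea <- read.csv(inputFile)
nrow(studyArea)
```

Only projectRoot needs to change when the project moves to another machine; every other path follows from it.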
Users can certainly attempt to address these issues individually, but we've developed SpaDES.project as a solution. It's derived from our most intricate projects to date, yet it's designed with beginners in mind. We've anticipated these challenges so that users won't encounter them unexpectedly during their project journeys.
The package website is the best place to start. We would suggest these vignettes, roughly in this order:
Package website: https://spades-project.predictiveecology.org
The wider SpaDES ecosystem: https://SpaDES.PredictiveEcology.org
Wiki: https://github.com/PredictiveEcology/SpaDES/wiki
Install from CRAN:
# Not yet on CRAN
# install.packages("SpaDES.project")

# Until then, install from the PredictiveEcology r-universe repository
install.packages("SpaDES.project", repos = c("https://predictiveecology.r-universe.dev", getOption("repos")))

# A minimal first project, created in a temporary directory
SpaDES.project::setupProject(paths = list(projectPath = tempdir()),
                             modules = c("PredictiveEcology/Biomass_borealDataPrep@development",
                                         "PredictiveEcology/Biomass_core@development"))

The following example contains everything needed to run a set of modules, in a very short set of commands, from (almost) any starting condition. We use this approach in many of our projects.
Key features:
- The project is fully self-contained in a folder, with packages installed to a unique library based on the projectPath. This isolation is deliberate: it is what stops the package collisions that derailed many of our early projects.
- No call to install.packages without an if guard, and no library or require calls before the package installations.
- Every function is "rerun-capable" (i.e., idempotent): rerunning a line does not redo work that is already done. For example, remotes::install_github does not reinstall a package whose SHA has not changed.
- No objects are assigned into the .GlobalEnv. For small, simple projects, using the .GlobalEnv is fine; as a project grows, an object with a common name like "out" can be picked up unexpectedly, hiding an error that should instead have caused the function to fail.
- The minimum number of packages are installed before setupProject. As SpaDES.project and Require updates reach CRAN, this will get simpler still.
- Minimal use of setwd. The example below uses setwd("~") only as a bare-minimum default; change it (or comment it out) to wherever you want this project to live.
# Install package p only if it is missing or older than minVer
getOrUpdatePkg <- function(p, minVer = "0") {
  # try() returns an error object when p is not installed, which also triggers the install
  if (!isFALSE(try(packageVersion(p) < minVer, silent = TRUE))) {
    repo <- c("https://predictiveecology.r-universe.dev", getOption("repos"))
    install.packages(p, repos = repo)
  }
}

getOrUpdatePkg("remotes")
getOrUpdatePkg("Require", "1.0.0")
getOrUpdatePkg("SpaDES.project", "1.0.1")

setwd("~") # change this to wherever you want the project to live

out <- SpaDES.project::setupProject(
  runName = "Example",
  paths = list(projectPath = "integratingSpaDESmodules",
               modulePath = "SpaDES_Modules",
               outputPath = file.path("outputs", runName)),
  modules = c("tati-micheletti/speciesAbundance@main",
              "tati-micheletti/temperature@main",
              "tati-micheletti/speciesAbundTempLM@main"),
  times = list(start = 2013,
               end = 2032),
  updateRprofile = TRUE,
  Restart = TRUE
)

# Pass the elements of out as arguments to simInitAndSpades
snippsim <- do.call(SpaDES.core::simInitAndSpades, out)
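
The .GlobalEnv point in the key features can be made concrete with a small base R sketch, separate from the project code above. The function summarise_run and the environment workspace are hypothetical names used only for illustration. To avoid touching your real .GlobalEnv, an explicit environment stands in for the session workspace.

```r
# A function with a typo: the argument is `output`, but the body uses `out`.
summarise_run <- function(output) length(out)

# Stand-in for a fresh session: only `length` is visible to the function.
# The parent is emptyenv(), so lookups cannot fall through to the real .GlobalEnv.
workspace <- new.env(parent = emptyenv())
assign("length", base::length, envir = workspace)
environment(summarise_run) <- workspace

# In the fresh workspace the typo is caught: `out` does not exist.
fresh_result <- try(summarise_run(1:10), silent = TRUE)
inherits(fresh_result, "try-error")

# Now a leftover `out` from earlier work lingers in the workspace.
assign("out", 1:3, envir = workspace)

# The same call now runs without error, silently using the wrong object.
stale_result <- summarise_run(1:10)
stale_result
```

The second call returns the length of the leftover object rather than of the argument that was passed in, so the typo goes unnoticed until the workspace is cleared.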
Please see CONTRIBUTING.md for information on how to contribute to this project.
