diff --git a/README.md b/README.md index 76e95e01..39197f85 100644 --- a/README.md +++ b/README.md @@ -6,52 +6,64 @@ [![Codecov][cov-badge]][cov-link]  [![Downloads](https://static.pepy.tech/personalized-badge/wfcommons?period=total&units=international_system&left_color=grey&right_color=yellowgreen&left_text=Downloads)](https://pepy.tech/project/wfcommons) - -
_A Framework for Enabling Scientific Workflow Research and Development_ + +
An open-source ecosystem of workflow execution instances, synthetic workflow generators, and benchmark specifications. It helps the community study scheduling, performance, resilience, and emerging AI-driven workflow automation on modern distributed and HPC platforms. -This Python package provides a collection of tools for: +- **Real instances:** Workflow executions curated in a common JSON format (WfFormat). +- **Synthetic realism:** Generate realistic workflows from real traces. +- **Benchmarks:** Produce executable specs for repeatable experiments and fair comparisons. -- Analyzing instances of actual workflow executions; -- Producing recipes structures for creating workflow recipes for workflow generation; -- Generating synthetic realistic workflow instances; and -- Generating realistic workflow benchmark specifications. +Quick links: [Documentation](https://wfcommons.readthedocs.io/en/latest/) · [Website](https://wfcommons.org) · [GitHub Issues](https://github.com/wfcommons/wfcommons/issues) -[![Open In Gitpod](https://gitpod.io/button/open-in-gitpod.svg)](https://gitpod.io/#https://github.com/wfcommons/wfcommons/tree/main) +## Quickstart -## Installation +WfCommons requires Python 3.11+ and has been tested on Linux and macOS. -WfCommons is available on [PyPI](https://pypi.org/project/wfcommons). -WfCommons requires Python3.11+ and has been tested on Linux and MacOS. 
+``` +python3 -m venv .venv +source .venv/bin/activate +python3 -m pip install wfcommons +``` -### Installation using pip +Generate a synthetic workflow in a few lines: -While `pip` can be used to install WfCommons, we suggest the following -approach for reliable installation when many Python environments are available: +```python +import pathlib +from wfcommons.wfchef.recipes import SeismologyRecipe +from wfcommons import WorkflowGenerator -``` -$ python3 -m pip install wfcommons +generator = WorkflowGenerator(SeismologyRecipe.from_num_tasks(250)) +workflow = generator.build_workflow() +workflow.write_json(pathlib.Path("seismology-workflow.json")) ``` -### Retrieving the latest unstable version +Next steps: -If you want to use the latest WfCommons unstable version, that will contain -brand new features (but also contain bugs as the stabilization work is still -underway), you may consider retrieving the latest unstable version. +- Learn how to build recipes in the [WfChef guide](https://wfcommons.readthedocs.io/en/latest/generating_workflows_recipe.html). +- Generate larger workflow families in the [WfGen guide](https://wfcommons.readthedocs.io/en/latest/generating_workflows.html). +- Produce benchmark specs in the [WfBench guide](https://wfcommons.readthedocs.io/en/latest/generating_workflow_benchmarks.html). -Cloning from [WfCommons's GitHub](https://github.com/wfcommons/wfcommons) -repository: +## Installation + +WfCommons is available on [PyPI](https://pypi.org/project/wfcommons). + +``` +python3 -m pip install wfcommons +``` + +### Installing from source (latest) ``` -$ git clone https://github.com/wfcommons/wfcommons -$ cd wfcommons -$ pip install . +git clone https://github.com/wfcommons/wfcommons +cd wfcommons +python3 -m pip install . ``` ### Optional Requirements #### Graphviz -WfCommons uses _pygraphviz_ for generating visualizations for the workflow task graph. 
-If you want to enable this feature, you will have to install the +WfCommons uses _pygraphviz_ for generating visualizations for the workflow task graph. +If you want to enable this feature, you will have to install the [graphviz](https://www.graphviz.org/) package (version 2.16 or later). You can install graphviz easily on Linux with your favorite package manager, for example for Debian-based distributions: @@ -82,15 +94,14 @@ python3 -m pip install pydot ## Get in Touch -The main channel to reach the WfCommons team is via the support email: +The main channel to reach the WfCommons team is via the support email: [support@wfcommons.org](mailto:support@wfcommons.org). -**Bug Report / Feature Request:** our preferred channel to report a bug or request a feature is via +**Bug Report / Feature Request:** our preferred channel to report a bug or request a feature is via WfCommons's [Github Issues Track](https://github.com/wfcommons/wfcommons/issues). - ## Citing WfCommons -When citing WfCommons, please use the following paper. You should also actually read +When citing WfCommons, please use the following paper. You should also actually read that paper, as it provides a recent and general overview on the framework. 
``` diff --git a/docs/requirements.txt b/docs/requirements.txt index 1d4e15a1..6dca0f5a 100644 --- a/docs/requirements.txt +++ b/docs/requirements.txt @@ -1,4 +1,5 @@ sphinx>=5.3.0 +furo>=2024.5.6 sphinx_rtd_theme>=1.2.0 recommonmark>=0.7.1 jsonschema~=3.2.0 @@ -12,4 +13,4 @@ setuptools>=49.3.1 pyyaml>=5.3.1 pandas>=1.2.4 stringcase>=1.2.0 -filelock>=3.6.0 \ No newline at end of file +filelock>=3.6.0 diff --git a/docs/source/_static/css/custom.css b/docs/source/_static/css/custom.css new file mode 100644 index 00000000..7efe5639 --- /dev/null +++ b/docs/source/_static/css/custom.css @@ -0,0 +1,248 @@ +:root { + --content-max-width: 72rem; + --sidebar-width: 18rem; + --color-background-primary: #f7f6f2; + --color-background-secondary: #f0efe9; + --color-sidebar-background: #f0efe9; + --color-background-hover: #e9ecf2; + --color-background-border: #e1e4ee; + --color-brand-primary: #1f5eff; + --color-brand-content: #1a55e6; + --color-foreground-primary: #16181d; + --color-foreground-secondary: #4b5563; + --accent-warm: #ff7a59; + --accent-mint: #22c997; + --accent-sky: #4cc9f0; + --accent-violet: #9b5de5; + color-scheme: light; +} + +article h1 { + letter-spacing: -0.01em; +} + +article h2 { + letter-spacing: -0.005em; +} + +body { + background: + radial-gradient(1200px 600px at 10% -10%, rgba(155, 93, 229, 0.12), transparent 60%), + radial-gradient(900px 500px at 90% -15%, rgba(76, 201, 240, 0.15), transparent 55%), + radial-gradient(700px 400px at 50% 110%, rgba(34, 201, 151, 0.12), transparent 60%), + var(--color-background-primary); +} + +.sidebar-drawer, +.sidebar-container, +.toc-drawer, +.page-sidebar, +.sidebar-tree { + background: var(--color-sidebar-background); +} + +.content { + background: transparent; +} + +/* Ensure any theme variant still uses the light palette. 
*/ +html[data-theme="dark"], +html[data-theme="auto"], +body[data-theme="dark"], +body[data-theme="auto"] { + color-scheme: light; +} + +html[data-theme="dark"], +html[data-theme="auto"], +body[data-theme="dark"], +body[data-theme="auto"] { + --color-background-primary: #f7f6f2; + --color-background-secondary: #f0efe9; + --color-sidebar-background: #f0efe9; + --color-background-hover: #e9ecf2; + --color-background-border: #e1e4ee; + --color-brand-primary: #1f5eff; + --color-brand-content: #1a55e6; + --color-foreground-primary: #16181d; + --color-foreground-secondary: #4b5563; + --accent-warm: #ff7a59; + --accent-mint: #22c997; + --accent-sky: #4cc9f0; + --accent-violet: #9b5de5; +} + +body[data-theme="dark"], +body[data-theme="auto"] { + background: + radial-gradient(1200px 600px at 10% -10%, rgba(155, 93, 229, 0.12), transparent 60%), + radial-gradient(900px 500px at 90% -15%, rgba(76, 201, 240, 0.15), transparent 55%), + radial-gradient(700px 400px at 50% 110%, rgba(34, 201, 151, 0.12), transparent 60%), + var(--color-background-primary); +} + +body[data-theme="dark"] .sidebar-drawer, +body[data-theme="auto"] .sidebar-drawer, +body[data-theme="dark"] .sidebar-container, +body[data-theme="auto"] .sidebar-container, +body[data-theme="dark"] .toc-drawer, +body[data-theme="auto"] .toc-drawer, +body[data-theme="dark"] .page-sidebar, +body[data-theme="auto"] .page-sidebar, +body[data-theme="dark"] .sidebar-tree, +body[data-theme="auto"] .sidebar-tree, +body[data-theme="dark"] .mobile-header, +body[data-theme="auto"] .mobile-header { + background: var(--color-sidebar-background); +} + +/* Never show dark-only elements. 
*/ +.only-dark { + display: none !important; +} + +article { + background: + linear-gradient(180deg, rgba(255, 255, 255, 0.85), rgba(255, 255, 255, 0.92)); + border: 1px solid var(--color-background-border); + border-radius: 20px; + box-shadow: 0 24px 48px rgba(19, 25, 39, 0.08); + padding: 2.25rem 2.5rem; +} + +article h1 { + color: #0f172a; +} + +article h1::after { + content: ""; + display: block; + width: 72px; + height: 6px; + margin-top: 0.5rem; + border-radius: 999px; + background: linear-gradient(90deg, var(--accent-warm), var(--accent-sky), var(--accent-mint)); +} + +article h2 { + color: #1e293b; +} + +article h3 { + color: #1f2937; +} + +article a { + color: var(--color-brand-primary); + font-weight: 600; +} + +article a:hover { + color: var(--accent-violet); +} + +code, +pre { + background: #eef2ff; + border-color: #d8def2; +} + +pre { + box-shadow: inset 0 0 0 1px rgba(26, 85, 230, 0.08); +} + +.sidebar-tree .current > a { + background: rgba(26, 85, 230, 0.08); + border-radius: 8px; +} + +.sidebar-tree a:hover { + background: rgba(34, 201, 151, 0.12); + border-radius: 8px; +} + +.toc-drawer .toc-tree li > a, +.toc-tree li > a { + border-radius: 8px; +} + +.toc-tree li > a:hover { + background: rgba(76, 201, 240, 0.14); +} + +table { + border-radius: 12px; + overflow: hidden; + box-shadow: 0 12px 24px rgba(15, 23, 42, 0.08); +} + +th { + background: rgba(31, 94, 255, 0.08); +} + +hr { + border: none; + height: 1px; + background: linear-gradient(90deg, transparent, rgba(31, 94, 255, 0.3), transparent); +} + +/* Hide heading link glyphs while keeping the anchor element. */ +.headerlink { + color: transparent !important; + text-decoration: none !important; + font-size: 0 !important; + width: 0; + margin-left: 0; +} + +.headerlink::before { + content: ""; +} + +/* Force light palette even if the theme requests dark mode. 
*/ +html[data-theme="dark"], +html[data-theme="dark"] body, +html[data-theme="auto"], +html[data-theme="auto"] body { + --color-background-primary: #f7f6f2; + --color-background-secondary: #f0efe9; + --color-sidebar-background: #f0efe9; + --color-background-hover: #e9ecf2; + --color-background-border: #e1e4ee; + --color-brand-primary: #1f5eff; + --color-brand-content: #1a55e6; + --color-foreground-primary: #16181d; + --color-foreground-secondary: #4b5563; + --accent-warm: #ff7a59; + --accent-mint: #22c997; + --accent-sky: #4cc9f0; + --accent-violet: #9b5de5; + color-scheme: light; +} + +html[data-theme="dark"] body, +html[data-theme="auto"] body { + background: + radial-gradient(1200px 600px at 10% -10%, rgba(155, 93, 229, 0.12), transparent 60%), + radial-gradient(900px 500px at 90% -15%, rgba(76, 201, 240, 0.15), transparent 55%), + radial-gradient(700px 400px at 50% 110%, rgba(34, 201, 151, 0.12), transparent 60%), + var(--color-background-primary); +} + +/* Hide the theme toggle UI entirely. */ +.theme-toggle, +.theme-toggle-container, +.theme-toggle-container button { + display: none !important; +} + +/* Hide project title text in the sidebar, keep the logo visible. */ +.sidebar-brand-text, +.sidebar-brand .brand-text, +.sidebar-brand .title { + display: none !important; +} + +article .align-left { + float: none !important +} diff --git a/docs/source/analyzing_instances.rst b/docs/source/analyzing_instances.rst index 49acdd91..64ec4179 100644 --- a/docs/source/analyzing_instances.rst +++ b/docs/source/analyzing_instances.rst @@ -65,6 +65,32 @@ parser provides a :meth:`~wfcommons.wfinstances.logs.abstract_logs_parser.LogsParser.build_workflow` method. 
+Supported log parsers ++++++++++++++++++++++ + +- :class:`~wfcommons.wfinstances.logs.pegasusrec.HierarchicalPegasusLogsParser` +- :class:`~wfcommons.wfinstances.logs.makeflow.MakeflowLogsParser` +- :class:`~wfcommons.wfinstances.logs.nextflow.NextflowLogsParser` +- :class:`~wfcommons.wfinstances.logs.pegasus.PegasusLogsParser` +- :class:`~wfcommons.wfinstances.logs.taskvine.TaskVineLogsParser` + +Examples +++++++++ + +Hierarchical Pegasus +++++++++++++++++++++ + +This parser targets Pegasus submit directories that contain hierarchical workflows. +It recursively parses sub-workflows and rebuilds a coherent workflow instance:: + + import pathlib + from wfcommons.wfinstances import HierarchicalPegasusLogsParser + + submit_dir = pathlib.Path('/path/to/pegasus/hierarchical/submit/dir/') + parser = HierarchicalPegasusLogsParser(submit_dir=submit_dir) + workflow = parser.build_workflow('pegasus-hierarchical-workflow-test') + workflow.write_json(pathlib.Path('./pegasus-hierarchical-workflow.json')) + Makeflow ++++++++ @@ -166,6 +192,21 @@ class: :: workflow_path = pathlib.Path('./pegasus-workflow.json') workflow.write_json(workflow_path) +TaskVine +++++++++ + +`TaskVine `_ is a task scheduler for +data-intensive dynamic workflows. 
The TaskVine logs parser translates TaskVine +execution logs into workflow instances compatible with :ref:`json-format-label`:: + + import pathlib + from wfcommons.wfinstances import TaskVineLogsParser + + execution_dir = pathlib.Path('/path/to/taskvine/execution/dir/') + parser = TaskVineLogsParser(execution_dir=execution_dir) + workflow = parser.build_workflow('taskvine-workflow-test') + workflow.write_json(pathlib.Path('./taskvine-workflow.json')) + The Instance Analyzer --------------------- diff --git a/docs/source/conf.py b/docs/source/conf.py index 48359a88..6f121783 100644 --- a/docs/source/conf.py +++ b/docs/source/conf.py @@ -12,9 +12,7 @@ # import os.path -import sphinx_rtd_theme import sys - # Fetch the version exec(open('../../wfcommons/version.py').read()) @@ -63,10 +61,13 @@ # The theme to use for HTML and HTML Help pages. See the documentation for # a list of builtin themes. # -html_theme = 'sphinx_rtd_theme' +html_theme = 'furo' html_favicon = 'favicon.png' +html_logo = 'images/wfcommons-horizontal.png' +html_title = 'WfCommons' # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] +html_css_files = ['css/custom.css'] diff --git a/docs/source/dev_api_wfbench.rst b/docs/source/dev_api_wfbench.rst index ef73feff..f50002db 100644 --- a/docs/source/dev_api_wfbench.rst +++ b/docs/source/dev_api_wfbench.rst @@ -14,6 +14,56 @@ wfcommons.wfbench.bench :private-members: :noindex: +wfcommons.wfbench.translator.airflow +------------------------------------ + +.. automodule:: wfcommons.wfbench.translator.airflow + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + +wfcommons.wfbench.translator.bash +--------------------------------- + +.. 
automodule:: wfcommons.wfbench.translator.bash + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + +wfcommons.wfbench.translator.cwl +-------------------------------- + +.. automodule:: wfcommons.wfbench.translator.cwl + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + +wfcommons.wfbench.translator.dask +--------------------------------- + +.. automodule:: wfcommons.wfbench.translator.dask + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + +wfcommons.wfbench.translator.makeflow +------------------------------------- + +.. automodule:: wfcommons.wfbench.translator.makeflow + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + wfcommons.wfbench.translator.nextflow ------------------------------------- @@ -24,6 +74,16 @@ wfcommons.wfbench.translator.nextflow :private-members: :noindex: +wfcommons.wfbench.translator.parsl +---------------------------------- + +.. automodule:: wfcommons.wfbench.translator.parsl + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + wfcommons.wfbench.translator.pegasus ------------------------------------ @@ -34,6 +94,16 @@ wfcommons.wfbench.translator.pegasus :private-members: :noindex: +wfcommons.wfbench.translator.pycompss +------------------------------------- + +.. 
automodule:: wfcommons.wfbench.translator.pycompss + :members: + :undoc-members: + :show-inheritance: + :private-members: + :noindex: + wfcommons.wfbench.translator.swift_t ------------------------------------ diff --git a/docs/source/favicon.png b/docs/source/favicon.png index c9633490..27ad1f51 100644 Binary files a/docs/source/favicon.png and b/docs/source/favicon.png differ diff --git a/docs/source/generating_workflow_benchmarks.rst b/docs/source/generating_workflow_benchmarks.rst index a160f806..a980e2e0 100644 --- a/docs/source/generating_workflow_benchmarks.rst +++ b/docs/source/generating_workflow_benchmarks.rst @@ -3,16 +3,14 @@ WfBench: Workflow Benchmarks ============================ -**WfBench** is a generator of realistic workflow benchmark specifications that -can be translated into benchmark code to be executed with current workflow -systems. it generates workflow tasks with arbitrary performance characteristics -(CPU, memory, and I/O usage) and with realistic task dependency structures -based on those seen in production workflows. +**WfBench** generates realistic workflow benchmark specifications that can be +translated into runnable benchmarks for current workflow systems. It produces +tasks with tunable performance characteristics (CPU, memory, and I/O usage) +and realistic dependency structures derived from production workflows. -The generation of workflow benchmakrs is twofold. First, a realistic workflow -benchmark specification is generated in the :ref:`json-format-label`. Then, -this specification is translated into benchmark code to be executed with a -workflow system. +Benchmark generation is twofold: first, a specification is produced in the +:ref:`json-format-label`; then, that specification is translated into +executable benchmark code for a target workflow system. 
Generating Workflow Benchmark Specifications -------------------------------------------- @@ -70,68 +68,130 @@ The generated benchmark will have exactly the same structure as the synthetic wo This is useful when you want to generate a benchmark with a specific structure or when you want benchmarks with the more detailed structure provided by WfChef workflow generation. -Translating Specifications into Benchmark Codes ------------------------------------------------ +Translating Specifications into Benchmark Code +---------------------------------------------- + +WfCommons provides a collection of translators that turn benchmark specifications +into runnable workflow code. All translators inherit from +:class:`~wfcommons.wfbench.translator.abstract_translator.Translator` and accept +either a :class:`~wfcommons.common.workflow.Workflow` object or a path to a +benchmark specification in :ref:`json-format-label`. + +Supported translators (alphabetical) +++++++++++++++++++++++++++++++++++++ + +- Airflow +- Bash +- CWL +- Dask +- Makeflow +- Nextflow +- Parsl +- Pegasus +- PyCOMPSs +- Swift/T +- TaskVine -WfCommons provides a collection of translators for executing the benchmarks as actual -workflow applications. Below, we provide illustrative examples on how to generate -workflow benchmarks for the currently supported workflow systems. +.. warning:: -The :class:`~wfcommons.wfbench.translator.abstract_translator.Translator` class is -the foundation for each translator class. This class takes as input either a -:class:`~wfcommons.common.workflow.Workflow` object or a path to a workflow benchmark -description in :ref:`json-format-label`. + WfBench leverages :code:`stress-ng` (https://github.com/ColinIanKing/stress-ng) + to execute memory-intensive threads. Ensure :code:`stress-ng` is installed on + all worker nodes. -.. warning:: - - WfBench leverages :code:`stress-ng` (https://github.com/ColinIanKing/stress-ng) - to execute memory-intensive threads. 
Therefore, it is crucial to ensure that - :code:`stress-ng` is installed on all worker nodes. +Airflow ++++++++ + +`Apache Airflow `_ is a platform for authoring, +scheduling, and monitoring workflows as code. Use the Airflow translator to +produce DAGs that can be executed by an Airflow scheduler:: + + import pathlib + from wfcommons import BlastRecipe + from wfcommons.wfbench import WorkflowBenchmark, AirflowTranslator + + benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=200) + benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=0.6) + + translator = AirflowTranslator(benchmark.workflow) + translator.translate(output_folder=pathlib.Path("./airflow-wf/")) + +Bash +++++ + +The Bash translator generates a simple, runnable shell workflow for quick local +validation and debugging:: + + import pathlib + from wfcommons import BlastRecipe + from wfcommons.wfbench import WorkflowBenchmark, BashTranslator + + benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=100) + benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=50, data=5, percent_cpu=0.7) + + translator = BashTranslator(benchmark.workflow) + translator.translate(output_folder=pathlib.Path("./bash-wf/")) +CWL ++++ + +`CWL `_ is a community standard for describing command-line +tools and workflows. The CWL translator emits portable CWL definitions:: + + import pathlib + from wfcommons import BlastRecipe + from wfcommons.wfbench import WorkflowBenchmark, CWLTranslator + + benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=150) + benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=80, data=8, percent_cpu=0.6) + + translator = CWLTranslator(benchmark.workflow) + translator.translate(output_folder=pathlib.Path("./cwl-wf/")) Dask -++++++++ +++++ + `Dask `_ is an open-source library for parallel computing -in Python. 
It makes it possible to easily implement and execute workflows local machines, HPC cluster schedulers, and cloud-based -and container-based environments. Below, we provide an example on how to generate -workflow benchmark for running with Dask:: +in Python. It supports local execution, HPC schedulers, and cloud environments:: import pathlib - from wfcommons import BlastRecipe from wfcommons.wfbench import WorkflowBenchmark, DaskTranslator - # create a workflow benchmark object to generate specifications based on a recipe benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=500) - - # generate a specification based on performance characteristics benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=0.6) - # generate a Dask workflow translator = DaskTranslator(benchmark.workflow) - translator.translate(output_folder=pathlib.Path("./dask-wf/"")) + translator.translate(output_folder=pathlib.Path("./dask-wf/")) -Nextflow +Makeflow ++++++++ -`Nextflow `_ is a workflow management system that enables -the development of portable and reproducible workflows. It supports deploying workflows -on a variety of execution platforms including local, HPC schedulers, and cloud-based -and container-based environments. Below, we provide an example on how to generate -workflow benchmark for running with Nextflow:: +`Makeflow `_ targets large, DAG-shaped +workflows on clusters, grids, and clouds. 
The translator emits Makeflow workflows:: import pathlib + from wfcommons import BlastRecipe + from wfcommons.wfbench import WorkflowBenchmark, MakeflowTranslator + + benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=200) + benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=0.6) + translator = MakeflowTranslator(benchmark.workflow) + translator.translate(output_folder=pathlib.Path("./makeflow-wf/")) + +Nextflow +++++++++ + +`Nextflow `_ enables portable, reproducible workflows +across local, HPC, and cloud environments:: + + import pathlib from wfcommons import BlastRecipe from wfcommons.wfbench import WorkflowBenchmark, NextflowTranslator - # create a workflow benchmark object to generate specifications based on a recipe benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=500) - - # generate a specification based on performance characteristics benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=0.6) - # generate a Nextflow workflow translator = NextflowTranslator( benchmark.workflow, use_subworkflows=False, @@ -153,137 +213,113 @@ the modules sequentially:: .. warning:: - Nextflow's way of defining workflows does not support tasks with iterations i.e. tasks - that depend on another instance of the same abstract task. Thus, the translator - fails when you try to translate a workflow with iterations. + Nextflow does not support tasks with iterations (tasks that depend on another + instance of the same abstract task). Translation fails for workflows that + include iterations. .. note:: - + If you plan to run Nextflow on an HPC system using Slurm, we **strongly - recommend** using the `HyperQueue `_ - executor. HyperQueue efficiently distributes workflow tasks across all allocated + recommend** using the `HyperQueue `_ + executor. HyperQueue efficiently distributes workflow tasks across all allocated compute nodes, improving scalability and resource utilization. 
The :class:`~wfcommons.wfbench.translator.nextflow.NextflowTranslator` - class includes functionality to automatically generate a Slurm script + class includes functionality to automatically generate a Slurm script template for running the workflow on HPC systems. +Parsl ++++++ + +`Parsl `_ is a parallel scripting library for Python. +The translator emits a Parsl workflow suitable for local or distributed execution:: + + import pathlib + from wfcommons import BlastRecipe + from wfcommons.wfbench import WorkflowBenchmark, ParslTranslator + + benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=200) + benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=0.6) + + translator = ParslTranslator(benchmark.workflow) + translator.translate(output_folder=pathlib.Path("./parsl-wf/")) + Pegasus +++++++ -`Pegasus `_ orchestrates the execution of complex scientific -workflows by providing a platform to define, organize, and automate computational -tasks and data dependencies. Pegasus handles the complexity of large-scale workflows -by automatically mapping tasks onto distributed computing resources, such as clusters, -grids, or clouds. Below, we provide an example on how to generate workflow benchmark -for running with Pegasus:: +`Pegasus `_ orchestrates complex scientific workflows on +clusters, grids, and clouds by mapping tasks onto distributed resources:: import pathlib - from wfcommons import BlastRecipe from wfcommons.wfbench import WorkflowBenchmark, PegasusTranslator - # create a workflow benchmark object to generate specifications based on a recipe benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=500) - - # generate a specification based on performance characteristics benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=0.6) - # generate a Pegasus workflow translator = PegasusTranslator(benchmark.workflow) translator.translate(output_folder=pathlib.Path("./pegasus-wf/")) .. 
warning:: - Pegasus utilizes the `HTCondor `_ framework to orchestrate - the execution of workflow tasks. By default, HTCondor does not implement CPU affinity - for program threads. However, WfBench offers an extra capability to enforce CPU - affinity during benchmark execution. To enable this feature, you need to specify - the :code:`lock_files_folder` parameter when using + Pegasus uses `HTCondor `_ to orchestrate tasks. By + default, HTCondor does not implement CPU affinity for program threads. + To enable CPU affinity, specify :code:`lock_files_folder` when using :meth:`~wfcommons.wfbench.bench.WorkflowBenchmark.create_benchmark`. PyCOMPSs ++++++++ -`PyCOMPSs `_ is a programming model and runtime that -enables the parallel execution of Python applications on distributed computing -infrastructures. It allows developers to define tasks using simple Python -decorators, automatically handling task scheduling, data dependencies, and -resource management.. Below, we provide an example on how to generate workflow -benchmark for running with PyCOMPSs:: +`PyCOMPSs `_ is a programming model and runtime for +parallel Python applications on distributed infrastructures:: import pathlib - from wfcommons import CyclesRecipe from wfcommons.wfbench import WorkflowBenchmark, PyCompssTranslator - # create a workflow benchmark object to generate specifications based on a recipe benchmark = WorkflowBenchmark(recipe=CyclesRecipe, num_tasks=200) - - # generate a specification based on performance characteristics benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=500, data=1000, percent_cpu=0.8) - # generate a PyCOMPSs workflow translator = PyCompssTranslator(benchmark.workflow) translator.translate(output_folder=pathlib.Path("./pycompss-wf/")) Swift/T +++++++ -`Swift/T `_ is an advanced workflow system designed -specifically for high-performance computing (HPC) environments. 
It dynamically manages -task dependencies and resource allocation, enabling efficient utilization of HPC -systems. It provides a seamless interface to diverse tools, libraries, and scientific -applications, making it easy to integrate existing codes into workflows. Below, we -provide an example on how to generate workflow benchmark for running with Swift/T:: +`Swift/T `_ is a workflow system for HPC environments, +designed to scale to large task graphs:: import pathlib - from wfcommons import BlastRecipe from wfcommons.wfbench import WorkflowBenchmark, SwiftTTranslator - # create a workflow benchmark object to generate specifications based on a recipe benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=500) - - # generate a specification based on performance characteristics benchmark.create_benchmark(pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=1.0) - # generate a Swift/T workflow translator = SwiftTTranslator(benchmark.workflow) translator.translate(output_folder=pathlib.Path("./swift-t-wf/")) TaskVine ++++++++ -`TaskVine `_ is a task scheduler for -building large scale data intensive dynamic workflows that run on HPC clusters, -GPU clusters, and commercial clouds. As tasks access external data sources and -produce their own outputs, more and more data is pulled into local storage on -workers. This data is used to accelerate future tasks and avoid re-computing -exisiting results. Data gradually grows "like a vine" through the cluster. 
-Below, we provide an example on how to generate workflow benchmark for running -with TaskVine:: +`TaskVine `_ is a task scheduler for +data-intensive dynamic workflows across HPC clusters, GPU clusters, and clouds:: import pathlib - from wfcommons import BlastRecipe from wfcommons.wfbench import WorkflowBenchmark, TaskVineTranslator - # create a workflow benchmark object to generate specifications based on a recipe benchmark = WorkflowBenchmark(recipe=BlastRecipe, num_tasks=500) - - # generate a specification based on performance characteristics benchmark.create_benchmark(save_dir=pathlib.Path("/tmp/"), cpu_work=100, data=10, percent_cpu=1.0) - # generate a TaskVine workflow translator = TaskVineTranslator(benchmark.workflow) translator.translate(output_folder=pathlib.Path("./taskvine-wf/")) -In the example above, WfBench will generate a folder which will contain the -TaskVine workflow :code:`taskvine_workflow.py`, the workflow input data -(:code:`./taskvine-wf/data/`), the workflow binary files (:code:`./taskvine-wf/bin/`), -and the Poncho package specification (:code:`./taskvine-wf/taskvine_poncho.json`). +WfBench will generate a folder containing the TaskVine workflow +:code:`taskvine_workflow.py`, workflow input data (:code:`./taskvine-wf/data/`), +workflow binaries (:code:`./taskvine-wf/bin/`), and the Poncho package specification +(:code:`./taskvine-wf/taskvine_poncho.json`). .. warning:: - This TaskVine workflow requires :code:`stress-ng` to be installed and accessible + This TaskVine workflow requires :code:`stress-ng` to be installed and accessible in the system's :code:`$PATH` where the manager runs. 
diff --git a/docs/source/images/wfcommons-horizontal.png b/docs/source/images/wfcommons-horizontal.png
index d7f2b49a..2d36a6fb 100644
Binary files a/docs/source/images/wfcommons-horizontal.png and b/docs/source/images/wfcommons-horizontal.png differ
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 17234991..b503fd49 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -1,13 +1,19 @@
 .. figure:: images/wfcommons-horizontal.png
-   :scale: 35 %
+   :scale: 15 %
+   :align: left

 |pypi-badge| |build-badge| |license-badge|

-`WfCommons `__ is an open-source framework for
-enabling scientific workflow research and development. This Python package
-provides methods for analyzing instances, deriving recipes, generating
-representative synthetic workflow instances, and generating realistic
-workflow benchmark specifications.
+`WfCommons `__ is an open-source Python framework for
+enabling scientific workflow research and development. Use it to analyze
+workflow execution instances, build workflow recipes, generate realistic
+synthetic workflows, and create benchmark specifications.
+
+Quick links: `Documentation `__ ·
+`Website `__ ·
+`GitHub `__
+
+More information about the project can be found at: https://wfcommons.org/

 .. figure:: images/wfcommons.png
    :scale: 70 %
@@ -15,6 +21,23 @@ workflow benchmark specifications.

    The WfCommons conceptual architecture.

+What you can do
+===============
+
+- Analyze workflow instances from real executions.
+- Derive reusable recipes that capture workflow structure and performance.
+- Generate synthetic workflows at scale for experimentation.
+- Build benchmark specs for multiple workflow systems.
+
+Get started
+===========
+
+- :doc:`quickstart_installation` for install and first steps.
+- :doc:`introduction` for a project overview and WfFormat context.
+- :doc:`generating_workflows_recipe` to build recipes from real instances.
+- :doc:`generating_workflows` to generate synthetic workflows.
+- :doc:`generating_workflow_benchmarks` to produce benchmark specs.
+
 ----

 Support
@@ -30,6 +53,19 @@ support@wfcommons.org.

 ----

+Contents
+========
+
 .. toctree::
    :caption: Quickstart
+   :hidden:
    :maxdepth: 2

    quickstart_installation.rst

 .. toctree::
    :caption: User Guide
+   :hidden:
    :maxdepth: 2

    introduction.rst
@@ -48,6 +76,7 @@ support@wfcommons.org.

 .. toctree::
    :caption: API Reference
+   :hidden:
    :maxdepth: 1

    user_api_reference.rst
diff --git a/docs/source/quickstart_installation.rst b/docs/source/quickstart_installation.rst
index a8fc8b4a..594a2723 100644
--- a/docs/source/quickstart_installation.rst
+++ b/docs/source/quickstart_installation.rst
@@ -4,29 +4,43 @@ Installation
 WfCommons is available on `PyPI `_.
 WfCommons requires Python3.11+ and has been tested on Linux and macOS.

-Installation using pip
+Recommended setup
 ----------------------

-While :code:`pip` can be used to install WfCommons, we suggest the following
-approach for reliable installation when many Python environments are available:
+We recommend installing into a virtual environment to keep dependencies isolated:

 .. code-block:: bash

+    $ python3 -m venv .venv
+    $ source .venv/bin/activate
     $ python3 -m pip install wfcommons

-Retrieving the latest unstable version
---------------------------------------
+Verify installation
+-------------------

-If you want to use the latest WfCommons unstable version, that will contain
-brand new features (but also contain bugs as the stabilization work is still
-underway), you may consider retrieving the latest unstable version.
+You can confirm the CLI entry point is available:

-Cloning from `WfCommons `_'s GitHub
-repository: ::
+.. code-block:: bash
+
+    $ wfchef --help
+
+Or check the Python import:
+
+.. code-block:: bash
+
+    $ python3 -c "import wfcommons; print(wfcommons.__version__)"
+
+Installing from source (latest)
+-------------------------------
+
+If you want the latest development version (potentially unstable), clone the
+repository and install locally:
+
+.. code-block:: bash

     $ git clone https://github.com/wfcommons/wfcommons
     $ cd wfcommons
-    $ pip install .
+    $ python3 -m pip install .

 Optional Requirements
 ---------------------
diff --git a/docs/source/user_api_wfbench.rst b/docs/source/user_api_wfbench.rst
index 1f529656..55f926d5 100644
--- a/docs/source/user_api_wfbench.rst
+++ b/docs/source/user_api_wfbench.rst
@@ -14,6 +14,46 @@ wfcommons.wfbench.bench
    :undoc-members:
    :show-inheritance:

+wfcommons.wfbench.translator.airflow
+------------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.airflow
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+wfcommons.wfbench.translator.bash
+---------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.bash
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+wfcommons.wfbench.translator.cwl
+--------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.cwl
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+wfcommons.wfbench.translator.dask
+---------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.dask
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+wfcommons.wfbench.translator.makeflow
+-------------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.makeflow
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
 wfcommons.wfbench.translator.nextflow
 -------------------------------------

@@ -22,6 +62,14 @@ wfcommons.wfbench.translator.nextflow
    :undoc-members:
    :show-inheritance:

+wfcommons.wfbench.translator.parsl
+----------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.parsl
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
 wfcommons.wfbench.translator.pegasus
 ------------------------------------

@@ -30,6 +78,14 @@ wfcommons.wfbench.translator.pegasus
    :undoc-members:
    :show-inheritance:

+wfcommons.wfbench.translator.pycompss
+-------------------------------------
+
+.. automodule:: wfcommons.wfbench.translator.pycompss
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
 wfcommons.wfbench.translator.swift_t
 ------------------------------------