The objective of this experiment is to evaluate the performance of a single Triggerflow instance to determine the maximum throughput (events/second) that a worker can process.
The experiment simulates an intense workflow and it performs a massive fork-join operation of 200 parallel tasks. Each task will produce 1000 events, so in total, the worker will process 200000 events. The worker must join and aggregate the events correctly without losing any event in the process.
In addition, the tests include a script for consuming the same quantity of events but without doing any fork-join operation, so the throughput can be calculated as a baseline to compare with the throughput that Triggerflow produces.
- The Triggerflow Client installed.
- A Triggerflow deployment up and running.
- An event broker instance available (Kafka, Redis, RabbitMQ...).
- A trigger storage instance available (Redis, CouchDB...).
- Create the workspace and trigger using this script for kafka, or this one for redis.
$ python3 /tests/ingestion-test/setup_kafka.py
- Then, produce all the events using the
produce_*.pyfunction from the same script. To have better performance and to avoid getting an incorrect result because of the maximum throughput a single publisher can produce, consider producing the 200000 events from multiple nodes in parallel. For example, to produce the events from two separate process, it would be:
$ python3 /tests/ingestion-test/produce_kafka.py 1 2
$ python3 /tests/ingestion-test/produce_kafka.py 2 2
Inside all the scripts there is a variable that determines the workspace name and event topic, so make sure the value is the same for both scripts.
-
Then, start monitoring the worker container logs to calculate the time spent to aggregate all the events.
-
To execute the no-operation scripts, run:
$ python3 /tests/ingestion-test/noop_kafka.py
The objective of this test is to prove that, when Triggerflow is deployed on scalable platforms and using event-based autoscaling policies (like with Kubernetes + KEDA or Knative Eventing), the resources are allocated only when it is needed, resulting in a serverless and pay-per-use execution model.
The experiment consists of deploying a bunch of workspaces and producing events that activate the triggers from those workspaces. Instead of producing all events at once, the events are generated in a pseudo-random way with pauses in between, so that there are instances of time where a worker processing a workspace is not receiving any events and it will be scheduled for deprovisioning. After some time, there will be events produced that reactivate the previously deprovisioned worker, so it will be provisioned again to continue processing events.
- Triggerflow deployed on KEDA or Knative Eventing.
- An event broker instance available (Kafka, Redis, RabbitMQ...).
- Deploy the triggers and workspace using this script. TODO
- Produce the events using this script. The script generates the events that activate the triggers that are inside the workspaces previously deployed.
$ python3 test/autoscaling_test.py
- Monitor the active workers. You should see how the workers are being provisioned and deprovisioned based on the events that they are processing.
$ kubectl get pods -o wide -w
The objective of this experiment is to calculate the overhead produced when orchestrating a serverless workflow using events and triggers for the DAGs interface.
- An IBM Cloud account.
- IBM Cloud CLI.
- A Cloud Foundry enabled IBM Cloud Functions namespace created.
- The Triggerflow Client module installed.
- A Triggerflow deployment up and running.
- The Triggerflow configuration file filled up for the DAGs and IBM Cloud
section and located in the
$HOMEdirectory. - An event broker instance available (Kafka, Redis, RabbitMQ...).
- A trigger storage instance available (Redis, CouchDB...).
- Set the desired sequence length/map size parameter in the DAG definition file. That's the
sequence_lengthvariable in/tests/dag-sequence/ibm_cf_sequence.pyvariable for sequence workflow andconcurrencyin/tests/dag-map/ibm_cf_parallel.pyvariable for parallel workflow. - Deploy the
sleepfunction to IBM Cloud:
$ ibmcloud wsk package create triggerflow-tests
$ ibmcloud wsk action create triggerflow-tests/sleep --docker triggerflow/ibm_cloud_functions_runtime-v38:latest tests/dag-map/action.py
- Build the DAGs:
$ triggerflow dag build tests/dag-sequence/ibm_cf_sequence.py
$ triggerflow dag build tests/dag-sequence/ibm_cf_map.py
- Run the DAGs:
$ triggerflow dag run sequence
$ triggerflow dag run map
- Monitor the worker's logs to determine the total execution time. Subtract from the total execution time the time spent by the functions executing to get the overhead.
The objective of this experiment is to calculate the overhead produced when orchestrating a serverless workflow using events and triggers for the workflow as code (event sourcing) interface.
- An IBM Cloud account.
- IBM Cloud CLI.
- A Cloud Foundry enabled IBM Cloud Functions namespace created.
- IBM PyWren installed and configured.
- The Triggerflow Client module installed.
- A Triggerflow deployment up and running.
- An event broker instance available (Kafka, Redis, RabbitMQ...).
- A trigger storage instance available (Redis, CouchDB...).
Find complete instrucions and examples here.
- The Triggerflow client installed and configured.
- The DAGs and IBM Cloud Functions section from the
triggerflow_config.yamlconfiguration file filled. - The Triggerflow backend (trigger API, service, database and event broker) deployed.
- An IBM Cloud account.
- IBM Cloud CLI installed and configured.
- An IBM Cloud Functions namespace created.
- A IBM Cloud Object Storage bucket created.
- Set up the environment variables. The following environment variables must be set:
BUCKET: The IBM COS bucket for storing the input, intermediante and result data.AWS_ACCESS_KEY_ID: The HMAC access key id COS service credential.AWS_SECRET_ACCESS_KEY: The HMAC secret access key COS service credential. To create an HMAC credentials for your bucket please follow these instructions.
-
Download the MDT files for free from here and place them in a directory named
mdtin this path. -
Execute the
upload_mdt_tiles.shscript. This script uploads the previously downloaded MDT files to your IBM COS bucket. -
Upload the
shapefile.zipfile needed by some tasks of the workflow. To upload the file to your IBM COS bucket, execute the following commands:
$ wget https://aitor-data.s3.amazonaws.com/public/shapefile.zip -O /tmp/shapefile.zip
$ ibmcloud cos upload --bucket $BUCKET --key shapefile.zip --file /tmp/shapefile.zip
$ rm /tmp/shapefile.zip
- Deploy the workflow's serverless functions. First, create a package that will contain all the workflow's functions:
$ ibmcloud wsk package create geospatial-dag
Then, execute the create_functions.sh script. This script creates the functions located in the functions directory.
- Build the DAG. Run the
dag.pyfile to compile and save the DAG.
$ python3 dag.py
- Once the DAG is built, run it by using the Triggerflow CLI:
$ triggerflow dag run water_consumption
- During the execution of the workflow, you can stop the worker that is processing the workflow to simulate a system failure. If the backend in deployed on Kubernetes using KEDA or Knative, you can kill the worker pod, and the system will automatically provision the pod again. If the backend is deployed on Docker or locally using processes, you will have to manually provision the worker container. In both cases, the worker should recover the state of the workflow by retrieving the triggers' context from the DB and uncommitted events from the event broker.
- The Triggerflow client installed and configured.
- The DAGs and IBM Cloud Functions section from the
triggerflow_config.yamlconfiguration file filled. - The Triggerflow backend (trigger API, service, database and event broker) deployed.
- An AWS account.
-
Create a AWS S3 bucket (for example, name it
montage). -
Create a lambda layer containing the necessary dependencies for the lambda functions:
- Run the script
create_layer.sh. This will create a zip file containing the dependencies. Upload this zip as a Layer.
- Run the script
-
Create subnets for Lambda in every availability zone.
- Create a subnet in every availability zone, using for example, the default VPC.
- Create a NAT gateway.
- Create a route table.
- Add the NAT gateway to the route table as the default gateway.
- Assign this route table to every subnet created before.
-
Create a EFS file system.
- Create a EFS in the desired VPC.
- Attach the EFS to the subnets created before.
- Create an access point for the EFS to use with lambda. Note that the mount point must be
/mnt/lambda
-
Compile the Montage binaries.
- Go to Montage Downloads and download the latest version source.
- You need to compile the binaries in a machine compatible with Lambda. For example, you can compile it with a EC2 instance running Amazon Linux 2.
- Copy the compiled binaries (
/bin) into the/bindirectory.
-
Create the lambda functions.
- Run the
create_lambda.shscript. This will create three zip files, each one is a different lambda function. - When creating the lambda function, assign it a role that has access to S3.
- Attach the Lambdas to the desired VPC within the previously created subnets.
- Attach the EFS volume created previously. Note that the mount point must be
/mnt/lambda - Add the following ENV vars to your lambdas:
- BUCKET: The name of the bucket created before.
- REGION: The region where the bucket is located.
- TARGET_DIR: The PWD of the lambdas, in this case set it as
/mnt/lambda
- Run the
-
Upload the uncompressed content of the
data.ziparchive to the bucket root. -
Replace the
Resourceattribute of themontage.jsonstate machine with your own lambda ARNs. -
Execute
run.py