diff --git a/.agent/skills/adding-new-metadata/SKILL.md b/.agent/skills/adding-new-metadata/SKILL.md index 72b061f1df82..fd050d8c7fdb 100644 --- a/.agent/skills/adding-new-metadata/SKILL.md +++ b/.agent/skills/adding-new-metadata/SKILL.md @@ -88,7 +88,6 @@ You must ensure that when a DoFn processes an element and outputs a new element, ### Timers If metadata needs to survive timer firings (e.g., knowing an `@OnTimer` fired because of a system drain), it must be added to Timer data structures. This is a bit of uncharted area which was only implemented for CausedByDrain metadata that comes from backend, not from persisted metadata. In order to persist all WindowedValue metadata across timer, more work has to be done, below are some pointers: * `runners/core-java/src/main/java/org/apache/beam/runners/core/TimerInternals.java` and implementations (e.g., `WindmillTimerInternals.java` in Dataflow). -* `runners/samza/src/test/java/org/apache/beam/runners/samza/runtime/KeyedTimerData.java` (or generic `TimerData`). * **Action:** Add the field to `TimerData`, next to `CausedByDrain`. Propagate it when setting the timer and expose it when the timer fires so it bubbles up. * Eventually, metadata from Timer lands in WindowedValue, so it can be exposed to users. Keep field names, types, and getters similar to WindowedValue as much as possible, as common interface may be introduced eventually. @@ -116,4 +115,4 @@ User needs to access the metadata in their `DoFn` (e.g., `@ProcessElement public 9. [ ] Update `ReduceFnRunner` and `OutputAndTimeBoundedSplittableProcessElementInvoker` for complex transform propagation. 10. [ ] If required by timers, update `TimerData` and `TimerInternals`. 11. [ ] If exposed to the user, update `DoFnSignatures` and `ByteBuddyDoFnInvokerFactory`. -12. [ ] Update other runners (Flink, Spark, Samza) to ensure they propagate the new `WindowedValue` fields correctly in their specific operators/runners. +12. [ ] Update other runners (Flink, Spark) to ensure they propagate the new `WindowedValue` fields correctly in their specific operators/runners. diff --git a/.agent/skills/runners/SKILL.md b/.agent/skills/runners/SKILL.md index f92943ab097c..b5e25f9898ee 100644 --- a/.agent/skills/runners/SKILL.md +++ b/.agent/skills/runners/SKILL.md @@ -34,7 +34,6 @@ Runners execute Beam pipelines on distributed processing backends. Each runner t | Dataflow | `runners/google-cloud-dataflow-java/` | Google Cloud Dataflow | | Flink | `runners/flink/` | Apache Flink | | Spark | `runners/spark/` | Apache Spark | -| Samza | `runners/samza/` | Apache Samza | | Jet | `runners/jet/` | Hazelcast Jet | | Twister2 | `runners/twister2/` | Twister2 | diff --git a/.test-infra/metrics/sync/github/github_runs_prefetcher/code/config.yaml b/.test-infra/metrics/sync/github/github_runs_prefetcher/code/config.yaml index 3b4038bae64c..154bd1fe8884 100644 --- a/.test-infra/metrics/sync/github/github_runs_prefetcher/code/config.yaml +++ b/.test-infra/metrics/sync/github/github_runs_prefetcher/code/config.yaml @@ -76,7 +76,6 @@ categories: - "PostCommit Python ValidatesRunner Dataflow" - "PostCommit Python ValidatesRunner Spark" - "PostCommit Python ValidatesRunner Flink" - - "PostCommit Python ValidatesRunner Samza" - "Build python source distribution and wheels" - "Java Tests" - "PostCommit Java" @@ -113,7 +112,6 @@ categories: - "PreCommit Java Kafka IO Direct" - "PostCommit Java Examples Direct" - "PreCommit Java JDBC IO Direct" - - "PostCommit Java ValidatesRunner Samza" - "PreCommit Java Mqtt IO Direct" - "PreCommit Java Kinesis IO Direct" - "PreCommit Java MongoDb IO Direct" @@ -138,7 +136,6 @@ categories: - "PreCommit Java Thrift IO Direct" - "PreCommit Java Snowflake IO Direct" - "PreCommit Java Solr IO Direct" - - "PostCommit Java PVR Samza" - "PreCommit Java Tika IO Direct" - "PostCommit Java SingleStoreIO IT" - "PostCommit Java ValidatesRunner Direct" @@ -209,7 +206,6 @@ categories: - "PerformanceTests BigQueryIO Batch Java Json" - "PerformanceTests SQLBigQueryIO Batch Java" - "PerformanceTests XmlIOIT" - - "PostCommit XVR Samza" - "PerformanceTests ManyFiles TextIOIT" - "PerformanceTests XmlIOIT HDFS" - "PerformanceTests ParquetIOIT" @@ -291,8 +287,7 @@ categories: tests: - "PerformanceTests MongoDBIO IT" - "PreCommit GoPortable" - - "PreCommit GoPrism" - - "PostCommit Go VR Samza" + - "PreCommit GoPrism" - "PostCommit Go Dataflow ARM" - "LoadTests Go CoGBK Dataflow Batch" - "LoadTests Go Combine Dataflow Batch" diff --git a/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md b/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md index 31743b29b8cd..2ee57e07f990 100644 --- a/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md +++ b/learning/tour-of-beam/learning-content/introduction/introduction-concepts/runner-concepts/description.md @@ -390,103 +390,6 @@ python -m apache_beam.examples.wordcount --input /path/to/inputfile \ ``` {{end}} -{{if (eq .Sdk "java" "go")}} -### Samza runner - -The Apache Samza Runner can be used to execute Beam pipelines using Apache Samza. The Samza Runner executes Beam pipeline in a Samza application and can run locally. The application can further be built into a .tgz file, and deployed to a YARN cluster or Samza standalone cluster with Zookeeper. - -The Samza Runner and Samza are suitable for large scale, stateful streaming jobs, and provide: - -* First class support for local state (with RocksDB store). This allows fast state access for high frequency streaming jobs. -* Fault-tolerance with support for incremental checkpointing of state instead of full snapshots. This enables Samza to scale to applications with very large state. -* A fully asynchronous processing engine that makes remote calls efficient. -* Flexible deployment model for running the applications in any hosting environment with Zookeeper. -* Features like canaries, upgrades and rollbacks that support extremely large deployments with minimal downtime. - -Additionally, you can read more about the Samza Runner [here](https://beam.apache.org/documentation/runners/samza/) - -#### Run example -{{end}} - -{{if (eq .Sdk "go")}} - -Need import: -``` -"github.com/apache/beam/sdks/v2/go/pkg/beam/runners/samza" -``` - -It is necessary to give an endpoint where the runner is raised with `--endpoint`: -``` -$ go install github.com/apache/beam/sdks/v2/go/examples/wordcount -# As part of the initial setup, for non linux users - install package unix before run -$ go get -u golang.org/x/sys/unix -$ wordcount --input gs://dataflow-samples/shakespeare/kinglear.txt \ ---output gs:///counts \ ---runner samza \ ---project your-gcp-project \ ---region your-gcp-region \ ---temp_location gs:///tmp/ \ ---staging_location gs:///binaries/ \ ---worker_harness_container_image=apache/beam_go_sdk:latest \ ---endpoint=localhost:8081 -``` -{{end}} - -{{if (eq .Sdk "java")}} -You can specify your dependency on the Samza Runner by adding the following to your `pom.xml`: - -``` - - org.apache.beam - beam-runners-samza - 2.42.0 - runtime - - - - - org.apache.samza - samza-api - ${samza.version} - - - - org.apache.samza - samza-core_2.11 - ${samza.version} - - - - org.apache.samza - samza-kafka_2.11 - ${samza.version} - runtime - - - - org.apache.samza - samza-kv_2.11 - ${samza.version} - runtime - - - - org.apache.samza - samza-kv-rocksdb_2.11 - ${samza.version} - runtime - -``` - -Console: -``` -$ mvn exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \ - -Psamza-runner \ - -Dexec.args="--runner=SamzaRunner \ - --inputFile=/path/to/input \ - --output=/path/to/counts" -``` - ### Nemo runner The Apache Nemo Runner can be used to execute Beam pipelines using Apache Nemo. The Nemo Runner can optimize Beam pipelines with the Nemo compiler through various optimization passes and execute them in a distributed fashion using the Nemo runtime. You can also deploy a self-contained application for local mode or run using resource managers like YARN or Mesos. diff --git a/release/src/main/scripts/jenkins_jobs.txt b/release/src/main/scripts/jenkins_jobs.txt deleted file mode 100644 index ae4c13b24d4c..000000000000 --- a/release/src/main/scripts/jenkins_jobs.txt +++ /dev/null @@ -1,115 +0,0 @@ -Run Chicago Taxi on Dataflow,beam_PostCommit_Python_Chicago_Taxi_Dataflow_PR -Run Chicago Taxi on Flink,beam_PostCommit_Python_Chicago_Taxi_Flink_PR -Run Dataflow Runner Nexmark Tests,beam_PostCommit_Java_Nexmark_Dataflow_PR -Run Dataflow Runner Tpcds Tests,beam_PostCommit_Java_Tpcds_Dataflow_PR -Run Dataflow Runner V2 Java 11 Nexmark Tests,beam_PostCommit_Java_Nexmark_DataflowV2_Java11_PR -Run Dataflow Runner V2 Java 17 Nexmark Tests,beam_PostCommit_Java_Nexmark_DataflowV2_Java17_PR -Run Dataflow Runner V2 Nexmark Tests,beam_PostCommit_Java_Nexmark_DataflowV2_PR -Run Dataflow Streaming ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Dataflow_Streaming_PR -Run Dataflow ValidatesRunner Java 11,beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11_PR -Run Dataflow ValidatesRunner Java 17,beam_PostCommit_Java_ValidatesRunner_Dataflow_Java17_PR -Run Dataflow ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Dataflow_PR -Run Direct Runner Nexmark Tests,beam_PostCommit_Java_Nexmark_Direct_PR -Run Direct ValidatesRunner Java 11,beam_PostCommit_Java_ValidatesRunner_Direct_Java11_PR -Run Direct ValidatesRunner Java 17,beam_PostCommit_Java_ValidatesRunner_Direct_Java17_PR -Run Direct ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Direct_PR -Run Flink Runner Nexmark Tests,beam_PostCommit_Java_Nexmark_Flink_PR -Run Flink Runner Tpcds Tests,beam_PostCommit_Java_Tpcds_Flink_PR -Run Flink ValidatesRunner Java 11,beam_PostCommit_Java_ValidatesRunner_Flink_Java11_PR -Run Flink ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Flink_PR -Run Go Flink ValidatesRunner,beam_PostCommit_Go_VR_Flink_PR -Run Go PostCommit,beam_PostCommit_Go_PR -Run Go Samza ValidatesRunner,beam_PostCommit_Go_VR_Samza_PR -Run Go Spark ValidatesRunner,beam_PostCommit_Go_VR_Spark_PR -Run Java 11 Examples on Dataflow Runner V2,beam_PostCommit_Java_Examples_Dataflow_V2_java11_PR -Run Java 17 Examples on Dataflow Runner V2,beam_PostCommit_Java_Examples_Dataflow_V2_java17_PR -Run Java Dataflow V2 ValidatesRunner Streaming,beam_PostCommit_Java_VR_Dataflow_V2_Streaming_PR -Run Java Dataflow V2 ValidatesRunner,beam_PostCommit_Java_VR_Dataflow_V2_PR -Run Java Examples on Dataflow Runner V2,beam_PostCommit_Java_Examples_Dataflow_V2_PR -Run Java Examples_Direct,beam_PostCommit_Java_Examples_Direct_PR -Run Java Examples_Flink,beam_PostCommit_Java_Examples_Flink_PR -Run Java Examples_Spark,beam_PostCommit_Java_Examples_Spark_PR -Run Java Flink PortableValidatesRunner Streaming,beam_PostCommit_Java_PVR_Flink_Streaming_PR -Run Java InfluxDbIO_IT,beam_PostCommit_Java_InfluxDbIO_IT_PR -Run Java PostCommit,beam_PostCommit_Java_PR -Run Java PreCommit,beam_PreCommit_Java_Phrase -Run Java Samza PortableValidatesRunner,beam_PostCommit_Java_PVR_Samza_PR -Run Java Sickbay,beam_PostCommit_Java_Sickbay_PR -Run Java SingleStoreIO_IT,beam_PostCommit_Java_SingleStoreIO_IT_PR -Run Java Spark PortableValidatesRunner Batch,beam_PostCommit_Java_PVR_Spark_Batch_PR -Run Java Spark v3 PortableValidatesRunner Streaming,beam_PostCommit_Java_PVR_Spark3_Streaming_PR -Run Java examples on Dataflow Java 11,beam_PostCommit_Java_Examples_Dataflow_Java11_PR -Run Java examples on Dataflow Java 17,beam_PostCommit_Java_Examples_Dataflow_Java17_PR -Run Java_Amazon-Web-Services2_IO_Direct PreCommit,beam_PreCommit_Java_Amazon-Web-Services2_IO_Direct_Phrase -Run Java_Amazon-Web-Services_IO_Direct PreCommit,beam_PreCommit_Java_Amazon-Web-Services_IO_Direct_Phrase -Run Java_Azure_IO_Direct PreCommit,beam_PreCommit_Java_Azure_IO_Direct_Phrase -Run Java_GCP_IO_Direct PreCommit,beam_PreCommit_Java_GCP_IO_Direct_Phrase -Run Java_Google-ads_IO_Direct PreCommit,beam_PreCommit_Java_Google-ads_IO_Direct_Phrase -Run Java_IOs_Direct PreCommit,beam_PreCommit_Java_IOs_Direct_Phrase -Run Java_Kinesis_IO_Direct PreCommit,beam_PreCommit_Java_Kinesis_IO_Direct_Phrase -Run Java_PVR_Flink_Batch PreCommit,beam_PreCommit_Java_PVR_Flink_Batch_Phrase -Run Java_Pulsar_IO_Direct PreCommit,beam_PreCommit_Java_Pulsar_IO_Direct_Phrase -Run Java_hadoop_IO_Direct PreCommit,beam_PreCommit_Java_Hadoop_IO_Direct_Phrase -Run Javadoc PostCommit,beam_PostCommit_Javadoc_PR -Run Jpms Dataflow Java 11 PostCommit,beam_PostCommit_Java_Jpms_Dataflow_Java11_PR -Run Jpms Dataflow Java 17 PostCommit,beam_PostCommit_Java_Jpms_Dataflow_Java17_PR -Run Jpms Direct Java 11 PostCommit,beam_PostCommit_Java_Jpms_Direct_Java11_PR -Run Jpms Direct Java 17 PostCommit,beam_PostCommit_Java_Jpms_Direct_Java17_PR -Run Jpms Flink Java 11 PostCommit,beam_PostCommit_Java_Jpms_Flink_Java11_PR -Run Jpms Spark Java 11 PostCommit,beam_PostCommit_Java_Jpms_Spark_Java11_PR -Run PortableJar_Flink PostCommit,beam_PostCommit_PortableJar_Flink_PR -Run PortableJar_Spark PostCommit,beam_PostCommit_PortableJar_Spark_PR -Run PostCommit_Java_Avro_Versions,beam_PostCommit_Java_Avro_Versions_PR -Run PostCommit_Java_Dataflow,beam_PostCommit_Java_DataflowV1_PR -Run PostCommit_Java_DataflowV2,beam_PostCommit_Java_DataflowV2_PR -Run PostCommit_Java_Hadoop_Versions,beam_PostCommit_Java_Hadoop_Versions_PR -Run Python 3.10 PostCommit Sickbay,beam_PostCommit_Sickbay_Python310_PR -Run Python 3.10 PostCommit,beam_PostCommit_Python310_PR -Run Python 3.11 PostCommit Sickbay,beam_PostCommit_Sickbay_Python311_PR -Run Python 3.11 PostCommit,beam_PostCommit_Python311_PR -Run Python 3.8 PostCommit Sickbay,beam_PostCommit_Sickbay_Python38_PR -Run Python 3.8 PostCommit,beam_PostCommit_Python38_PR -Run Python 3.9 PostCommit Sickbay,beam_PostCommit_Sickbay_Python39_PR -Run Python 3.9 PostCommit,beam_PostCommit_Python39_PR -Run Python Dataflow ValidatesContainer,beam_PostCommit_Py_ValCont_PR -Run Python Dataflow ValidatesRunner,beam_PostCommit_Py_VR_Dataflow_PR -Run Python Direct Runner Nexmark Tests,beam_PostCommit_Python_Nexmark_Direct_PR -Run Python Examples_Dataflow,beam_PostCommit_Python_Examples_Dataflow_PR -Run Python Examples_Direct,beam_PostCommit_Python_Examples_Direct_PR -Run Python Examples_Flink,beam_PostCommit_Python_Examples_Flink_PR -Run Python Examples_Spark,beam_PostCommit_Python_Examples_Spark_PR -Run Python Flink ValidatesRunner,beam_PostCommit_Python_VR_Flink_PR -Run Python MongoDBIO_IT,beam_PostCommit_Python_MongoDBIO_IT_PR -Run Python PreCommit,beam_PreCommit_Python_Phrase -Run Python RC Dataflow ValidatesContainer,beam_PostCommit_Py_ValCont_with_RC_PR -Run Python Samza ValidatesRunner,beam_PostCommit_Python_VR_Samza_PR -Run Python Spark ValidatesRunner,beam_PostCommit_Python_VR_Spark_PR -Run PythonDocker PreCommit,beam_PreCommit_PythonDocker_Phrase -Run Python_Coverage PreCommit,beam_PreCommit_Python_Coverage_Phrase -Run Python_Dataframes PreCommit,beam_PreCommit_Python_Dataframes_Phrase -Run Python_Examples PreCommit,beam_PreCommit_Python_Examples_Phrase -Run Python_Integration PreCommit,beam_PreCommit_Python_Integration_Phrase -Run Python_PVR_Flink PreCommit,beam_PreCommit_Python_PVR_Flink_Phrase -Run Python_Runners PreCommit,beam_PreCommit_Python_Runners_Phrase -Run Python_Transforms PreCommit,beam_PreCommit_Python_Transforms_Phrase -Run Python_Xlang_Gcp_Dataflow PostCommit,beam_PostCommit_Python_Xlang_Gcp_Dataflow_PR -Run Python_Xlang_Gcp_Direct PostCommit,beam_PostCommit_Python_Xlang_Gcp_Direct_PR -Run Python_Xlang_IO_Dataflow PostCommit,beam_PostCommit_Python_Xlang_IO_Dataflow_PR -Run SQL PostCommit,beam_PostCommit_SQL_PR -Run Samza ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Samza_PR -Run Spark Runner Nexmark Tests,beam_PostCommit_Java_Nexmark_Spark_PR -Run Spark Runner Tpcds Tests,beam_PostCommit_Java_Tpcds_Spark_PR -Run Spark StructuredStreaming ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming_PR -Run Spark ValidatesRunner Java 11,beam_PostCommit_Java_ValidatesRunner_Spark_Java11_PR -Run Spark ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Spark_PR -Run TransformService_Direct PostCommit,beam_PostCommit_TransformService_Direct_PR -Run Twister2 ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_Twister2_PR -Run ULR Loopback ValidatesRunner,beam_PostCommit_Java_ValidatesRunner_ULR_PR -Run XVR_Direct PostCommit,beam_PostCommit_XVR_Direct_PR -Run XVR_Flink PostCommit,beam_PostCommit_XVR_Flink_PR -Run XVR_GoUsingJava_Dataflow PostCommit,beam_PostCommit_XVR_GoUsingJava_Dataflow_PR -Run XVR_JavaUsingPython_Dataflow PostCommit,beam_PostCommit_XVR_JavaUsingPython_Dataflow_PR -Run XVR_PythonUsingJavaSQL_Dataflow PostCommit,beam_PostCommit_XVR_PythonUsingJavaSQL_Dataflow_PR -Run XVR_PythonUsingJava_Dataflow PostCommit,beam_PostCommit_XVR_PythonUsingJava_Dataflow_PR -Run XVR_Samza PostCommit,beam_PostCommit_XVR_Samza_PR -Run XVR_Spark3 PostCommit,beam_PostCommit_XVR_Spark3_PR diff --git a/release/src/main/scripts/mass_comment.py b/release/src/main/scripts/mass_comment.py deleted file mode 100644 index 7dec20dbbba5..000000000000 --- a/release/src/main/scripts/mass_comment.py +++ /dev/null @@ -1,176 +0,0 @@ -# -# Licensed to the Apache Software Foundation (ASF) under one or more -# contributor license agreements. See the NOTICE file distributed with -# this work for additional information regarding copyright ownership. -# The ASF licenses this file to You under the Apache License, Version 2.0 -# (the "License"); you may not use this file except in compliance with -# the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# - -"""Script for mass-commenting Jenkins test triggers on a Beam PR.""" - -import os -import requests -import socket -import time - - -def executeGHGraphqlQuery(accessToken, query): - '''Runs graphql query on GitHub.''' - url = 'https://api.github.com/graphql' - headers = {'Authorization': 'Bearer %s' % accessToken} - r = requests.post(url=url, json={'query': query}, headers=headers) - return r.json() - - -def getSubjectId(accessToken, prNumber): - query = ''' -query FindPullRequestID { - repository(owner:"apache", name:"beam") { - pullRequest(number:%s) { - id - } - } -} -''' % prNumber - response = executeGHGraphqlQuery(accessToken, query) - return response['data']['repository']['pullRequest']['id'] - - -def addPrComment(accessToken, subjectId, commentBody): - '''Adds a pr comment to the PR defined by subjectId''' - query = ''' -mutation AddPullRequestComment { - addComment(input:{subjectId:"%s",body: "%s"}) { - commentEdge { - node { - createdAt - body - } - } - subject { - id - } - } -} -''' % (subjectId, commentBody) - return executeGHGraphqlQuery(accessToken, query) - -def getPrStatuses(accessToken, prNumber): - query = ''' -query GetPRChecks { - repository(name: "beam", owner: "apache") { - pullRequest(number: %s) { - commits(last: 1) { - nodes { - commit { - status { - contexts { - targetUrl - context - } - } - } - } - } - } - } -} -''' % (prNumber) - return executeGHGraphqlQuery(accessToken, query) - - -def postComments(accessToken, subjectId, commentsToAdd): - ''' - Main workhorse method. Posts comments to GH. - ''' - - for comment in commentsToAdd: - jsonData = addPrComment(accessToken, subjectId, comment[0]) - print(jsonData) - # Space out comments 30 seconds apart to avoid overwhelming Jenkins - time.sleep(30) - - -def probeGitHubIsUp(): - ''' - Returns True if GitHub responds to simple queries. Else returns False. - ''' - sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) - result = sock.connect_ex(('github.com', 443)) - return True if result == 0 else False - -def getRemainingComments(accessToken, pr, initialComments): - ''' - Filters out the comments that already have statuses associated with them from initial comments - ''' - queryResult = getPrStatuses(accessToken, pr) - pull = queryResult["data"]["repository"]["pullRequest"] - commit = pull["commits"]["nodes"][0]["commit"] - check_urls = str(list(map(lambda c : c["targetUrl"], commit["status"]["contexts"]))) - remainingComments = [] - for comment in initialComments: - if f'/{comment[1]}_Phrase/' not in check_urls and f'/{comment[1]}_PR/' not in check_urls \ - and f'/{comment[1]}_Commit/' not in check_urls and f'/{comment[1]}/' not in check_urls \ - and 'Sickbay' not in comment[1]: - print(comment) - remainingComments.append(comment) - return remainingComments - -################################################################################ -if __name__ == '__main__': - ''' - This script is supposed to be invoked directly. - However for testing purposes and to allow importing, - wrap work code in module check. - ''' - print("Started.") - comments = [] - dirname = os.path.dirname(__file__) - with open(os.path.join(dirname, 'jenkins_jobs.txt')) as file: - comments = [line.strip() for line in file if len(line.strip()) > 0] - - for i in range(len(comments)): - parts = comments[i].split(',') - comments[i] = (parts[0], parts[1]) - - if not probeGitHubIsUp(): - print("GitHub is unavailable, skipping fetching data.") - exit() - - print("GitHub is available start fetching data.") - - accessToken = input("Enter your Github access token: ") - - pr = input("Enter the Beam PR number to test (e.g. 11403): ") - subjectId = getSubjectId(accessToken, pr) - - # TODO(yathu): also auto rerun failed GitHub Action workflow - remainingComments = getRemainingComments(accessToken, pr, comments) - if len(remainingComments) == 0: - print('Jobs have been started for all comments. If you would like to retry all jobs, create a new commit before running this script.') - while len(remainingComments) > 0: - postComments(accessToken, subjectId, remainingComments) - # Sleep 60 seconds to allow checks to start to status - time.sleep(60) - remainingComments = getRemainingComments(accessToken, pr, remainingComments) - if len(remainingComments) > 0: - print(f'{len(remainingComments)} comments must be reposted because no check has been created for them: {str(remainingComments)}') - print('Sleeping for 1 hour to allow Jenkins to recover and to give it time to status.') - for i in range(60): - time.sleep(60) - print(f'{i} minutes elapsed, {60-i} minutes remaining') - remainingComments = getRemainingComments(accessToken, pr, remainingComments) - if len(remainingComments) == 0: - print(f'{len(remainingComments)} comments still must be reposted: {str(remainingComments)}') - print('Trying to repost comments.') - - print('Done.') diff --git a/scripts/beam-sql.sh b/scripts/beam-sql.sh index 38907b0d2d26..b83e8bc4ecae 100755 --- a/scripts/beam-sql.sh +++ b/scripts/beam-sql.sh @@ -197,7 +197,6 @@ function list_runners() { echo " dataflow - DataflowRunner (runs on Google Cloud Dataflow)" echo " flink - FlinkRunner (runs on Apache Flink)" echo " spark - SparkRunner (runs on Apache Spark)" - echo " samza - SamzaRunner (runs on Apache Samza)" echo " jet - JetRunner (runs on Hazelcast Jet)" echo " twister2 - Twister2Runner (runs on Twister2)" echo "" @@ -239,10 +238,6 @@ function list_runners() { echo " spark-3 - SparkRunner (Spark 3.x)" echo " Runs on Apache Spark 3.x clusters." ;; - "samza") - echo " samza - SamzaRunner" - echo " Runs on Apache Samza." - ;; "jet") echo " jet - JetRunner" echo " Runs on Hazelcast Jet." diff --git a/scripts/ci/release/comment_pr_trigger_phrases.sh b/scripts/ci/release/comment_pr_trigger_phrases.sh deleted file mode 100755 index f31b3054e70e..000000000000 --- a/scripts/ci/release/comment_pr_trigger_phrases.sh +++ /dev/null @@ -1,27 +0,0 @@ -#!/bin/bash -# -# Licensed to the Apache Software Foundation (ASF) under one or more -# contributor license agreements. See the NOTICE file distributed with -# this work for additional information regarding copyright ownership. -# The ASF licenses this file to You under the Apache License, Version 2.0 -# (the "License"); you may not use this file except in compliance with -# the License. You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# -# This script adds comments from a txt file, in the given GitHub Pull Request. -# test/resources/mass_comment.txt file has the Trigger Phrases for running Gradle Build and PostCommit/PreCommit tests. - -file="./test/resources/mass_comment.txt" -GITHUB_PR=$1 -while IFS= read -r trigger_phrase -do - gh pr comment "$GITHUB_PR" --body "$trigger_phrase" - sleep 30 -done <"$file" diff --git a/scripts/ci/release/test/resources/jenkins_trigger_phrases.txt b/scripts/ci/release/test/resources/jenkins_trigger_phrases.txt deleted file mode 100644 index 4f59885de8b4..000000000000 --- a/scripts/ci/release/test/resources/jenkins_trigger_phrases.txt +++ /dev/null @@ -1,19 +0,0 @@ -Run Release Gradle Build -Run CommunityMetrics PreCommit -Run Direct ValidatesRunner Java 11 -Run Direct ValidatesRunner Java 17 -Run Java 11 Examples on Dataflow Runner V2 -Run Java 17 Examples on Dataflow Runner V2 -Run Java PreCommit -Run Java examples on Dataflow Java 11 -Run Java examples on Dataflow Java 17 -Run Javadoc PostCommit -Run Jpms Dataflow Java 17 PostCommit -Run Jpms Direct Java 17 PostCommit -Run Python Flink ValidatesRunner -Run Python PreCommit -Run Python Samza ValidatesRunner -Run Python Spark ValidatesRunner -Run SQL_Java17 PreCommit -Run XVR_Flink PostCommit -Run XVR_GoUsingJava_Dataflow PostCommit diff --git a/scripts/ci/release/test/resources/mass_comment.txt b/scripts/ci/release/test/resources/mass_comment.txt deleted file mode 100644 index 1f6f340eb0b7..000000000000 --- a/scripts/ci/release/test/resources/mass_comment.txt +++ /dev/null @@ -1,106 +0,0 @@ -Run Release Gradle Build -Run CommunityMetrics PreCommit -Run Dataflow Runner Nexmark Tests -Run Dataflow Runner V2 Java 11 Nexmark Tests -Run Dataflow Runner V2 Java 17 Nexmark Tests -Run Dataflow Runner V2 Nexmark Tests -Run Dataflow Streaming ValidatesRunner -Run Dataflow ValidatesRunner Java 11 -Run Dataflow ValidatesRunner Java 17 -Run Dataflow ValidatesRunner -Run Direct Runner Nexmark Tests -Run Direct ValidatesRunner Java 11 -Run Direct ValidatesRunner Java 17 -Run Direct ValidatesRunner in Java 11 -Run Direct ValidatesRunner -Run Flink Runner Nexmark Tests -Run Flink ValidatesRunner Java 11 -Run Flink ValidatesRunner -Run Go Flink ValidatesRunner -Run Go PostCommit -Run Go PreCommit -Run Go Samza ValidatesRunner -Run Go Spark ValidatesRunner -Run GoPortable PreCommit -Run Java 11 Examples on Dataflow Runner V2 -Run Java 17 Examples on Dataflow Runner V2 -Run Java Dataflow V2 ValidatesRunner Streaming -Run Java Dataflow V2 ValidatesRunner -Run Java Examples on Dataflow Runner V2 -Run Java Examples_Direct -Run Java Examples_Flink -Run Java Examples_Spark -Run Java Flink PortableValidatesRunner Streaming -Run Java Portability examples on Dataflow with Java 11 -Run Java PostCommit -Run Java PreCommit -Run Java Samza PortableValidatesRunner -Run Java Spark PortableValidatesRunner Batch -Run Java Spark v2 PortableValidatesRunner Streaming -Run Java Spark v3 PortableValidatesRunner Streaming -Run Java examples on Dataflow Java 11 -Run Java examples on Dataflow Java 17 -Run Java examples on Dataflow with Java 11 -Run Java_Examples_Dataflow PreCommit -Run Java_Examples_Dataflow_Java11 PreCommit -Run Java_Examples_Dataflow_Java17 PreCommit -Run Java_PVR_Flink_Batch PreCommit -Run Java_PVR_Flink_Docker PreCommit -Run Javadoc PostCommit -Run Jpms Dataflow Java 11 PostCommit -Run Jpms Dataflow Java 17 PostCommit -Run Jpms Direct Java 11 PostCommit -Run Jpms Direct Java 17 PostCommit -Run Jpms Flink Java 11 PostCommit -Run Jpms Spark Java 11 PostCommit -Run PortableJar_Flink PostCommit -Run PortableJar_Spark PostCommit -Run Portable_Python PreCommit -Run PostCommit_Java_Dataflow -Run PostCommit_Java_DataflowV2 -Run PostCommit_Java_Hadoop_Versions -Run Python 3.8 PostCommit -Run Python 3.9 PostCommit -Run Python 3.10 PostCommit -Run Python 3.11 PostCommit -Run Python Dataflow V2 ValidatesRunner -Run Python Dataflow ValidatesContainer -Run Python Dataflow ValidatesRunner -Run Python Examples_Dataflow -Run Python Examples_Direct -Run Python Examples_Flink -Run Python Examples_Spark -Run Python Flink ValidatesRunner -Run Python PreCommit -Run Python Samza ValidatesRunner -Run Python Spark ValidatesRunner -Run PythonDocker PreCommit -Run PythonDocs PreCommit -Run PythonFormatter PreCommit -Run PythonLint PreCommit -Run Python_PVR_Flink PreCommit -Run Python_Xlang_Gcp_Direct PostCommit -Run RAT PreCommit -Run SQL PostCommit -Run SQL PreCommit -Run SQL_Java11 PreCommit -Run SQL_Java17 PreCommit -Run Samza ValidatesRunner -Run Spark Runner Nexmark Tests -Run Spark StructuredStreaming ValidatesRunner -Run Spark ValidatesRunner Java 11 -Run Spark ValidatesRunner -Run Spotless PreCommit -Run Twister2 ValidatesRunner -Run Typescript PreCommit -Run ULR Loopback ValidatesRunner -Run Whitespace PreCommit -Run Xlang_Generated_Transforms PreCommit -Run XVR_Direct PostCommit -Run XVR_Flink PostCommit -Run XVR_JavaUsingPython_Dataflow PostCommit -Run XVR_PythonUsingJavaSQL_Dataflow PostCommit -Run XVR_PythonUsingJava_Dataflow PostCommit -Run XVR_Samza PostCommit -Run XVR_Spark PostCommit -Run XVR_Spark3 PostCommit diff --git a/website/www/site/data/capability_matrix.yaml b/website/www/site/data/capability_matrix.yaml index 3a753262eeae..b7c236865ef3 100644 --- a/website/www/site/data/capability_matrix.yaml +++ b/website/www/site/data/capability_matrix.yaml @@ -30,8 +30,6 @@ capability-matrix: name: Twister2 - class: python direct name: Python Direct FnRunner - - class: go direct - name: Go Direct Runner categories: - description: What is being computed? @@ -82,10 +80,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: GroupByKey description: Grouping of key-value pairs per key, window, and pane. (See also other tabs.) values: @@ -125,10 +119,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Flatten description: Concatenates multiple homogenously typed collections together. values: @@ -168,10 +158,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Combine description: 'Application of an associative, commutative operation over all values ("globally") or over all values associated with each key ("per key"). Can be implemented using ParDo, but often more efficient implementations exist.' values: @@ -211,10 +197,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Composite Transforms description: Allows easy extensibility for library writers. In the near future, we expect there to be more information provided at this level -- customized metadata hooks for monitoring, additional runtime/environment hooks, etc. values: @@ -254,10 +236,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Side Inputs description: Side inputs are additional PCollections whose contents are computed during pipeline execution and then made accessible to DoFn code. The exact shape of the side input depends both on the PCollectionView used to describe the access pattern (interable, map, singleton) and the window of the element from the main input that is currently being processed. values: @@ -297,10 +275,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Source API description: Allows users to provide additional input sources. Supports both bounded and unbounded data. Includes hooks necessary to provide efficient parallelization (size estimation, progress information, dynamic splitting, etc). values: @@ -340,10 +314,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Metrics description: Allow transforms to gather simple metrics across bundles in a PTransform. Provide a mechanism to obtain both committed and attempted metrics. Semantically similar to using an additional output, but support partial results as the transform executes, and support both committed and attempted values. Will likely want to augment Metrics to be more useful for processing unbounded data by making them windowed. values: @@ -383,10 +353,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - name: Stateful Processing description: Allows fine-grained access to per-key, per-window persistent state. Necessary for certain use cases (e.g. high-volume windows which store large amounts of data, but typically only access small portions of it; complex state machines; etc.) that are not easily or efficiently addressed via Combine or GroupByKey+ParDo. values: @@ -410,10 +376,6 @@ capability-matrix: l1: "No" l2: not implemented l3: - - class: samza - l1: "Partially" - l2: non-merging windows - l3: "States are backed up by either rocksDb KV store or in-memory hash map, and persist using changelog." - class: nemo l1: "No" l2: not implemented @@ -430,10 +392,6 @@ capability-matrix: l1: "" l2: l3: "" - - class: go direct - l1: "" - l2: - l3: "" - description: Bounded Splittable DoFn Support Status anchor: what color-y: "fff" @@ -466,10 +424,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -486,10 +440,6 @@ capability-matrix: l1: "Yes" l2: l3: - - class: go direct - l1: "Yes" - l2: - l3: - name: Side Inputs description: "" values: @@ -513,10 +463,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -533,10 +479,6 @@ capability-matrix: l1: l2: l3: - - class: go direct - l1: "Yes" - l2: - l3: - name: Splittable DoFn Initiated Checkpointing description: "" values: @@ -560,10 +502,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -580,10 +518,6 @@ capability-matrix: l1: "Yes" l2: l3: - - class: go direct - l1: "No" - l2: - l3: - name: Dynamic Splitting description: "" values: @@ -607,10 +541,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -627,10 +557,6 @@ capability-matrix: l1: "Yes" l2: Only with Python SDK l3: - - class: go direct - l1: "No" - l2: - l3: - name: Bundle Finalization description: "" values: @@ -654,10 +580,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -674,10 +596,6 @@ capability-matrix: l1: "Yes" l2: l3: - - class: go direct - l1: "No" - l2: - l3: - description: Unbounded Splittable DoFn Support Status anchor: what color-y: "fff" @@ -710,10 +628,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -730,10 +644,6 @@ capability-matrix: l1: "Yes" l2: l3: - - class: go direct - l1: "No" - l2: - l3: - name: Side Inputs description: "" values: @@ -757,10 +667,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -777,10 +683,6 @@ capability-matrix: l1: l2: l3: - - class: go direct - l1: "Yes" - l2: - l3: - name: Splittable DoFn Initiated Checkpointing description: "" values: @@ -804,10 +706,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -824,10 +722,6 @@ capability-matrix: l1: "Yes" l2: l3: - - class: go direct - l1: "No" - l2: - l3: - name: Dynamic Splitting description: "" values: @@ -851,10 +745,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -871,10 +761,6 @@ capability-matrix: l1: "No" l2: l3: - - class: go direct - l1: "No" - l2: - l3: - name: Bundle Finalization description: "" values: @@ -898,10 +784,6 @@ capability-matrix: l1: l2: l3: "" - - class: samza - l1: - l2: - l3: "" - class: nemo l1: l2: @@ -918,10 +800,6 @@ capability-matrix: l1: "Yes" l2: l3: - - class: go direct - l1: "No" - l2: - l3: - description: Where in event time? anchor: where color-y: "fff" @@ -954,10 +832,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -970,6 +844,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - name: Fixed windows description: Fixed-size, timestamp-based windows. (Hourly, Daily, etc) values: @@ -993,10 +871,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -1009,6 +883,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - name: Sliding windows description: Possibly overlapping fixed-size timestamp-based windows (Every minute, use the last ten minutes of data.) values: @@ -1032,10 +910,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -1048,6 +922,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - name: Session windows description: Based on bursts of activity separated by a gap size. Different per key. values: @@ -1071,10 +949,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -1087,6 +961,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - name: Custom windows description: All windows must implement BoundedWindow, which specifies a max timestamp. Each WindowFn assigns elements to an associated window. values: @@ -1110,10 +988,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -1126,6 +1000,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - name: Custom merging windows description: A custom WindowFn additionally specifies whether and how to merge windows. values: @@ -1149,10 +1027,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -1165,6 +1039,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - name: Timestamp control description: For a grouping transform, such as GBK or Combine, an OutputTimeFn specifies (1) how to combine input timestamps within a window and (2) how to merge aggregated timestamps when windows merge. values: @@ -1188,10 +1066,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: supported - l3: "" - class: nemo l1: "Yes" l2: supported @@ -1204,6 +1078,10 @@ capability-matrix: l1: "Yes" l2: supported l3: "" + - class: python direct + l1: "Yes" + l2: supported + l3: "" - description: When in processing time? anchor: when @@ -1237,10 +1115,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1253,7 +1127,10 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" - + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Event-time triggers description: Triggers that fire in response to event-time completeness signals, such as watermarks progressing. values: @@ -1277,10 +1154,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1293,6 +1166,10 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Processing-time triggers description: Triggers that fire in response to processing-time advancing. @@ -1317,10 +1194,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1333,6 +1206,10 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Count triggers description: Triggers that fire after seeing at least N elements. @@ -1357,10 +1234,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1373,6 +1246,10 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Composite triggers description: Triggers which compose other triggers in more complex structures, such as logical AND, logical OR, early/on-time/late, etc. @@ -1397,10 +1274,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1413,6 +1286,10 @@ capability-matrix: l1: "Partially" l2: l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Allowed lateness description: A way to bound the useful lifetime of a window (in event time), after which any unemitted results may be materialized, the window contents may be garbage collected, and any addtional late data that arrive for the window may be discarded. @@ -1437,10 +1314,6 @@ capability-matrix: l1: "No" l2: no streaming support in the runner l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1453,6 +1326,10 @@ capability-matrix: l1: "Partially" l2: l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Timers description: A fine-grained mechanism for performing work at some point in the future, in either the event-time or processing-time domain. Useful for orchestrating delayed events, timeouts, etc in complex state per-key, per-window state machines. @@ -1477,10 +1354,6 @@ capability-matrix: l1: "No" l2: not implemented l3: "" - - class: samza - l1: "Partially" - l2: non-merging windows - l3: The Samza Runner supports timers in non-merging windows. - class: nemo l1: "No" l2: not implemented @@ -1493,6 +1366,10 @@ capability-matrix: l1: "Partially" l2: l3: "" + - class: python direct + l1: "Yes" + l2: "Partially" + l3: "" - description: How do refinements relate? anchor: how @@ -1526,10 +1403,6 @@ capability-matrix: l1: "Partially" l2: fully supported in batch mode l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1542,6 +1415,10 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - name: Accumulating description: Elements are accumulated in state across multiple pane firings for the same window. @@ -1566,10 +1443,6 @@ capability-matrix: l1: "No" l2: "" l3: "" - - class: samza - l1: "Yes" - l2: fully supported - l3: "" - class: nemo l1: "Yes" l2: fully supported @@ -1582,6 +1455,10 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" + - class: python direct + l1: "Yes" + l2: fully supported + l3: "" - description: Additional common features not yet part of the Beam model anchor: misc @@ -1615,10 +1492,6 @@ capability-matrix: l1: l2: l3: - - class: samza - l1: - l2: - l3: - class: nemo l1: l2: @@ -1631,6 +1504,10 @@ capability-matrix: l1: l2: l3: + - class: python direct + l1: + l2: + l3: - name: Checkpoint description: APIs and semantics for saving a pipeline checkpoint are under discussion. This would be a runner-specific materialization of the pipeline state required to resume or duplicate the pipeline. values: @@ -1654,10 +1531,6 @@ capability-matrix: l1: "No" l2: l3: not implemented - - class: samza - l1: "Partially" - l2: - l3: Samza has a native checkpoint capability. - class: nemo l1: l2: @@ -1670,6 +1543,10 @@ capability-matrix: l1: l2: l3: + - class: python direct + l1: + l2: + l3: - name: Key-ordered delivery description: The runner offers guarantees for the order in which elements are passed in between operations. See per-key ordering semantics. values: @@ -1681,10 +1558,6 @@ capability-matrix: l1: "Yes" l2: fully supported l3: "" - - class: prism - l1: "Unverified" - l2: - l3: - class: flink l1: "Partially" l2: @@ -1701,10 +1574,6 @@ capability-matrix: l1: "Unverified" l2: l3: - - class: samza - l1: "Partially" - l2: - l3: Samza may perform different shuffling algorithms for batch and streaming. Samza guarantees key-ordered delivery in streaming, though not in batch. - class: nemo l1: "Unverified" l2: diff --git a/website/www/site/layouts/partials/section-menu/en/roadmap.html b/website/www/site/layouts/partials/section-menu/en/roadmap.html index 8c8d14ecce28..09c326705d9f 100644 --- a/website/www/site/layouts/partials/section-menu/en/roadmap.html +++ b/website/www/site/layouts/partials/section-menu/en/roadmap.html @@ -33,7 +33,6 @@
  • Flink Runner
  • Nemo Runner
  • Spark Runner
  • -
  • Samza Runner
  • Twister2 Runner