apache · markap14 · Sep 2, 2025 · Mar 10, 2026 · Mar 11, 2026 · Mar 16, 2026
diff --git a/.cursor/rules/building.mdc b/.cursor/rules/building.mdc
@@ -0,0 +1,32 @@
+---
+description: Maven build instructions for the NiFi codebase
+alwaysApply: true
+---
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+      http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# Building
+
+NiFi is a complex Maven codebase. Never build code (testing or otherwise) using `javac`.
+Always use `mvn` instead, or preferably the `./mvnw` wrapper script.
+
+Additionally, building a Maven module using the also-make flag (`-am`) is often very
+expensive and slow. Instead, only build the specific module you are modifying. Assume that
+the user has already built the entire codebase and that only the specific module you are
+modifying needs to be built again. If this fails, you can prompt the user to build the entire
+codebase, but only after you have attempted to build the relevant modules yourself first.
+It is important not to run `mvn clean` at the root level or at the `nifi-assembly` level without
+the user's express permission, as this may delete a running instance of NiFi, causing permanent
+loss of flows and configuration.
diff --git a/.cursor/rules/code-style.mdc b/.cursor/rules/code-style.mdc
@@ -0,0 +1,88 @@
+---
+description: Java code style conventions for the NiFi codebase
+globs: "**/*.java"
+alwaysApply: false
+---
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+      http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# Code Style
+
+NiFi adheres to a few code styles that are not necessarily common. Please ensure that you
+observe these code styles.
+
+1. Any variable that can be marked `final` must be marked `final`. This includes
+   declarations of Exceptions, method arguments, local variables, member variables, etc.
+2. Short-hand is highly discouraged in names of variables, classes, methods, etc., as well
+   as in documentation. There are exceptions: in the framework, you may see references to
+   `procNode` for `ProcessorNode` or other such short-hand that is very difficult to confuse with
+   other terms, and it is used only when clearly defined, such as `final ProcessorNode procNode = ...`.
+   Even so, we would not abbreviate `ControllerService` as `cs` because `cs` is too vague
+   and easily misunderstood. Instead, a name such as `serviceNode` might be used.
+3. Private / helper methods should not be placed before the first public/protected method
+   that calls them.
+4. Unless the method is to be heavily reused, avoid creating trivial 1-2 line methods and
+   instead just place the code inline.
+5. Code is allowed to be up to 200 characters wide. Avoid breaking lines into many short lines.
+6. Avoid creating private methods that are called only once unless they are at least 10
+   lines long or are complex.
+7. It is never acceptable to use star imports. Import each individual class that is to be used.
+8. Never use underscores in class names, variables, or filenames.
+9. Never use `System.out.println`; use SLF4J Loggers instead.
+10. Avoid excessive whitespace in method invocations. For example, instead of writing:
+
+```java
+myObject.doSomething(
+    arg1,
+    arg2,
+    arg3,
+    arg4,
+    arg5
+);
+```
+
+Write this instead:
+
+```java
+myObject.doSomething(arg1, arg2, arg3, arg4, arg5);
+```
+
+It is okay to use many newlines in a builder pattern, such as:
+```java
+final MyObject myObject = MyObject.builder()
+    .arg1(arg1)
+    .arg2(arg2)
+    .arg3(arg3)
+    .build();
+```
+
+It is also acceptable when chaining methods in a functional style such as:
+```java
+final List<String> result = myList.stream()
+    .filter(s -> s.startsWith("A"))
+    .map(String::toUpperCase)
+    .toList();
+```
+
+11. When possible, prefer importing a class rather than using its fully qualified class name
+    inline in the code.
+12. Avoid statically importing methods, except for methods that are frequently used from testing
+    frameworks, such as those of the `Assertions` and `Mockito` classes.
+13. Avoid trailing whitespace at the end of lines, especially in blank lines.
+14. The `var` keyword is never allowed in the codebase. Always explicitly declare the type of variables.
+15. Prefer procedural code over functional code. For example, prefer using a for loop instead of a stream
+    when the logic is not simple and straightforward. The stream API is powerful but can be difficult to
+    read when overused or used in complex scenarios. Functional style is best used when the logic is simple
+    and chains together no more than 3-4 operations.
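+
+As a rough illustration of several of the rules above (everything marked `final`, no `var`,
+no star imports, no abbreviated names, and a `for` loop where the logic branches), consider the
+following sketch. The class `NameFormatter` and its method `shortenNames` are hypothetical names
+invented for this example and do not exist in NiFi.
+
+```java
+import java.util.ArrayList;
+import java.util.List;
+
+// Hypothetical helper used only to illustrate the style rules; it is not part of NiFi.
+final class NameFormatter {
+
+    // Skips null or blank names, trims the rest, and truncates any name longer than maximumLength.
+    static List<String> shortenNames(final List<String> names, final int maximumLength) {
+        final List<String> shortenedNames = new ArrayList<>();
+        for (final String name : names) {
+            if (name == null || name.isBlank()) {
+                continue;
+            }
+
+            final String trimmedName = name.trim();
+            final String shortenedName = trimmedName.length() > maximumLength ? trimmedName.substring(0, maximumLength) : trimmedName;
+            shortenedNames.add(shortenedName);
+        }
+        return shortenedNames;
+    }
+}
+```
+
+The same result could be produced with a stream, but the null check, trimming, and conditional
+truncation are arguably easier to follow as a loop.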
diff --git a/.cursor/rules/ending-conditions.mdc b/.cursor/rules/ending-conditions.mdc
@@ -0,0 +1,44 @@
+---
+description: Task completion checklist that must be verified before considering any task done
+alwaysApply: true
+---
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+      http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# Ending Conditions
+
+When you have completed a task, ensure that you have verified the following:
+
+1. All code compiles and builds successfully using `mvn`.
+2. All relevant unit tests pass successfully using `mvn`.
+3. All code adheres to the Code Style rules.
+4. Checkstyle and PMD pass successfully using
+   `mvn checkstyle:check pmd:check -T 1C` from the appropriate directory.
+5. Unit tests have been added to verify the functionality of any sufficiently complex method.
+6. A system test or an integration test has been added if the change significantly alters
+   the framework and the interaction between a significant number of classes.
+7. You have performed a full review of the code to ensure that there are no logical errors
+   and that the code is not duplicative or difficult to understand. If you find any code that
+   is in need of refactoring due to clarity or duplication, you should report this to the user
+   and offer to make those changes as well.
+8. If creating a new Processor or Controller Service, ensure that all relevant annotations
+   have been added, including `@Tags`, `@CapabilityDescription`, `@UseCase`, and
+   `@MultiProcessorUseCase` as appropriate.
+
+
+Do not consider the task complete until all of the above conditions have been met. When you
+do consider the task complete, provide a summary of what you changed, which tests were
+added or modified, and what behavior they verify. Additionally, call out any part of your
+work that may need further review or that is not entirely complete.
diff --git a/.cursor/rules/extension-development.mdc b/.cursor/rules/extension-development.mdc
@@ -0,0 +1,121 @@
+---
+description: Development patterns for NiFi extensions (Processors, Controller Services, Connectors). Covers Property Descriptors, Relationships, and common patterns.
+alwaysApply: false
+---
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+      http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+# Extension Development
+
+This rule applies when developing NiFi extensions: Processors, Controller Services, and Connectors.
+
+## Property Descriptors
+
+Property Descriptors are defined as `static final` fields on the component class using
+`PropertyDescriptor.Builder`.
+
+- **Naming:** Use clear, descriptive names. The `displayName` field should never be used. Make the
+  name itself clear and concise. Use Title Case for property names.
+- **Required vs. optional:** Mark properties as `.required(true)` when the component cannot
+  function without them. Prefer sensible defaults via `.defaultValue(...)` when possible.
+  When a default value is provided, the property will always have a value. The `required` flag in this
+  case is more of a documentation aid to indicate the importance of the property.
+- **Validators:** Always attach an appropriate `Validator` (e.g., `StandardValidators.NON_EMPTY_VALIDATOR`,
+  `StandardValidators.POSITIVE_INTEGER_VALIDATOR`). The Validator can be left off only when Allowable Values
+  are provided. In this case, do not include a Validator because it is redundant and confusing.
+- **Expression Language:** If a property should support Expression Language, add
+  `.expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)` or the
+  appropriate scope. Always document when Expression Language is supported in the property
+  description. Some developers tend to go overboard here and feel like Expression Language should be supported
+  everywhere, but this is a mistake! The default assumption should be that Expression Language is not supported
+  unless the value is expected to be different for every FlowFile that is processed.
+- **Dependencies:** Use `.dependsOn(...)` to conditionally show properties based on the
+  values of other properties. This keeps the configuration UI clean and avoids exposing
+  irrelevant properties. If there is a dependency, it is important to understand that `.required(true)` means that
+  this property is required IF AND ONLY IF the dependency condition is met.
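+
+The guidance above might look like the following in practice. This is only a sketch: the holder
+class `ExampleProperties` and the three properties are hypothetical, and in a real component the
+fields would live on the Processor or Controller Service class itself. It compiles only with
+`nifi-api` and `nifi-utils` on the classpath.
+
+```java
+import org.apache.nifi.components.PropertyDescriptor;
+import org.apache.nifi.expression.ExpressionLanguageScope;
+import org.apache.nifi.processor.util.StandardValidators;
+
+// Hypothetical holder class used only for illustration.
+final class ExampleProperties {
+
+    // Allowable Values are provided, so no Validator is attached.
+    static final PropertyDescriptor COMPRESSION_FORMAT = new PropertyDescriptor.Builder()
+        .name("Compression Format")
+        .description("The compression format to apply to the outgoing content.")
+        .required(true)
+        .allowableValues("none", "gzip")
+        .defaultValue("none")
+        .build();
+
+    // Shown only when Compression Format is gzip; required(true) applies only while that dependency is satisfied.
+    static final PropertyDescriptor COMPRESSION_LEVEL = new PropertyDescriptor.Builder()
+        .name("Compression Level")
+        .description("The gzip compression level to use.")
+        .required(true)
+        .defaultValue("6")
+        .addValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR)
+        .dependsOn(COMPRESSION_FORMAT, "gzip")
+        .build();
+
+    // The value is expected to differ per FlowFile, so Expression Language is supported and the description says so.
+    static final PropertyDescriptor OUTPUT_FILENAME = new PropertyDescriptor.Builder()
+        .name("Output Filename")
+        .description("The filename to assign to the outgoing FlowFile. Supports Expression Language evaluated against FlowFile attributes.")
+        .required(true)
+        .defaultValue("${filename}")
+        .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
+        .expressionLanguageSupported(ExpressionLanguageScope.FLOWFILE_ATTRIBUTES)
+        .build();
+}
+```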
+
+## Processors
+
+- The `onTrigger` method should be focused on processing FlowFiles. Keep setup and teardown
+  logic in lifecycle methods when possible.
+- Prefer `session.read()` and `session.write()` with callbacks over directly working with
+  streams to ensure proper resource management.
+- Prefer `session.commitAsync()` over `session.commit()`. The `commit` method was the original implementation,
+  but it has now been deprecated in favor of `commitAsync`. The `commitAsync` call provides a clearer, cleaner
+  interface for handling post-commit actions including success and failure callbacks. In addition, the async
+  method allows Processors to be used much more efficiently in a Stateless NiFi flow.
+
+### Processor Lifecycle Annotations
+
+- Use `@OnScheduled` for setup that should happen once before the processor starts
+  running (e.g., creating clients, compiling patterns).
+- Use `@OnStopped` for cleanup (e.g., closing clients, releasing resources).
+- `@OnUnscheduled` is rarely used but can be used to interrupt long-running processes when the Processor is stopped.
+  Generally, though, it is preferable to write the Processor in such a way that long-running processes check `isScheduled()`
+  and stop gracefully if the return value is `false`.
+
+### Relationships
+- **Declaration:** Relationships are defined as `static final` fields using `new Relationship.Builder()`.
+  Relationship names should generally be lowercase.
+- **Success and Failure:** Most processors define at least a `success` and `failure`
+  relationship. Use `REL_SUCCESS` and `REL_FAILURE` as constant names.
+- **Original relationship:** Processors that enrich or fork FlowFiles often include an
+  `original` relationship for the unmodified input FlowFile.
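+
+Pulling the Processor guidance above together, a minimal sketch might look like the following.
+`CollapseWhitespace` is a hypothetical Processor invented for this example, not one that ships
+with NiFi, and it assumes `nifi-api` is on the classpath. It reads the entire content into memory,
+which is acceptable only for small FlowFiles in an illustration like this.
+
+```java
+import org.apache.nifi.annotation.documentation.CapabilityDescription;
+import org.apache.nifi.annotation.documentation.Tags;
+import org.apache.nifi.annotation.lifecycle.OnScheduled;
+import org.apache.nifi.annotation.lifecycle.OnStopped;
+import org.apache.nifi.flowfile.FlowFile;
+import org.apache.nifi.processor.AbstractProcessor;
+import org.apache.nifi.processor.ProcessContext;
+import org.apache.nifi.processor.ProcessSession;
+import org.apache.nifi.processor.Relationship;
+import org.apache.nifi.processor.exception.ProcessException;
+
+import java.nio.charset.StandardCharsets;
+import java.util.Set;
+import java.util.regex.Pattern;
+
+// Hypothetical Processor used only for illustration; it is not part of NiFi.
+@Tags({"example", "text", "whitespace"})
+@CapabilityDescription("Collapses each run of whitespace in the FlowFile content into a single space.")
+public class CollapseWhitespace extends AbstractProcessor {
+    static final Relationship REL_SUCCESS = new Relationship.Builder()
+        .name("success")
+        .description("FlowFiles whose content was rewritten successfully")
+        .build();
+
+    static final Relationship REL_FAILURE = new Relationship.Builder()
+        .name("failure")
+        .description("FlowFiles whose content could not be rewritten")
+        .build();
+
+    // Assigned in a lifecycle method, so this field cannot be final.
+    private volatile Pattern whitespacePattern;
+
+    @Override
+    public Set<Relationship> getRelationships() {
+        return Set.of(REL_SUCCESS, REL_FAILURE);
+    }
+
+    // One-time setup happens here rather than in onTrigger.
+    @OnScheduled
+    public void compilePattern() {
+        whitespacePattern = Pattern.compile("\\s+");
+    }
+
+    @OnStopped
+    public void releasePattern() {
+        whitespacePattern = null;
+    }
+
+    @Override
+    public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
+        final FlowFile flowFile = session.get();
+        if (flowFile == null) {
+            return;
+        }
+
+        try {
+            // The callback form of session.write lets the session manage the underlying streams.
+            final FlowFile updated = session.write(flowFile, (inputStream, outputStream) -> {
+                final String text = new String(inputStream.readAllBytes(), StandardCharsets.UTF_8);
+                outputStream.write(whitespacePattern.matcher(text).replaceAll(" ").getBytes(StandardCharsets.UTF_8));
+            });
+            session.transfer(updated, REL_SUCCESS);
+        } catch (final ProcessException e) {
+            getLogger().error("Failed to collapse whitespace in {}", flowFile, e);
+            session.transfer(flowFile, REL_FAILURE);
+        }
+    }
+}
+```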
+
+### Use Case Documentation
+The `@UseCase` and `@MultiProcessorUseCase` annotations help document common usage patterns for Processors.
+This is helpful for users to understand when and how to use the component effectively. It is equally important
+for Agents that need to determine which components should be used for a given task.
+
+- Use `@UseCase` to document common use cases for the Processor. The annotation is unnecessary for
+  Processors that serve a single use case that is clearly described by the component name and description.
+  For example, a Processor that consumes messages from a specific service likely does not need a
+  `@UseCase` annotation because its purpose is clear.
+- Use `@MultiProcessorUseCase` to document well-known patterns that involve multiple Processors working
+  together to achieve a common goal. Examples include List/Fetch patterns, Fork/Join patterns, etc.
+  The `@MultiProcessorUseCase` annotation should not be added to each individual Processor involved in the pattern.
+  Rather, the convention is to add the annotation to the last Processor in the flow that completes the pattern.
+  Some Processors will have one or more `@UseCase` annotations and no `@MultiProcessorUseCase` annotations,
+  while some will have one or more `@MultiProcessorUseCase` annotations and no `@UseCase` annotations.
+
+
+## Controller Services
+
+Controller Services are objects that can be shared across multiple components. This is typically done for
+clients that connect to external systems in order to avoid creating many connections, or in order to share
+configuration across multiple components without the user having to duplicate configuration. Controller Services
+can also be helpful for abstracting away some piece of functionality into a separate extension point so that the
+implementation can be swapped out by the user. For example, Record Readers and Writers are implemented as Controller
+Services so that the user can simply choose which format they want to read and write in a flexible and reusable way.
+
+That said, Controller Services can be more onerous to configure and maintain for the user, so they should
+be used sparingly and only when there is a clear benefit to doing so.
+
+### Controller Service Lifecycle Annotations
+
+- Use `@OnEnabled` for setup that should happen once when the service is being enabled (e.g., creating clients, compiling patterns).
+- Use `@OnDisabled` for cleanup (e.g., closing clients, releasing resources).
+
+
+## General Patterns
+
+- Use `ComponentLog` (obtained via `getLogger()`) for all logging, not SLF4J directly.
+  This ensures log messages are associated with the component instance and that they generate Bulletins.
+- Use `@CapabilityDescription` to provide a clear and concise description of what the component does. This should not
+  be used for configuration details.
+- Use `@Tags` to provide relevant keywords that help users find the component.
+- Use `@SeeAlso` to reference related components.
+- Use `@WritesAttributes` and `@ReadsAttributes` to document which FlowFile attributes are read and written by the component.
+- Use `@DynamicProperty` to document any dynamic properties supported by the component.