Process Group Settings: Execution Engine, FlowFile Concurrency, and Outbound Policy

Modified on Thu, 11 Dec, 2025 at 2:31 PM

Summary

Process group settings control how work moves through a group: the execution mode, how input ports admit FlowFiles, and when FlowFiles exit the group. These settings matter when you need strict ordering, batching, dependency control, or “run everything, then trigger the next step” behavior.

1. Execution Engine

Defines how the process group executes internally.

Inherited

Uses the execution engine of the parent process group.
Most groups should use this unless there’s a specific reason to override it.

Standard (Stateful)

Default mode.
Processors behave normally, retain state across runs, and use standard scheduling rules.

Recommended for most flows.

Stateless

Runs processors without keeping state.

Good for:

Pure function flows (input → output)
Microservice-like patterns

Not good for:

Stateful processors
Incremental tracking
Any logic that relies on persistent state
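
As an illustration of how Inherited resolves, the sketch below walks up a chain of parent groups until it finds a group with an explicit engine. The Group class, the engine labels, and the fallback to Standard for a group with no parent are all assumptions made for this example; they are not taken from the Clockspring API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    name: str
    engine: str  # "INHERITED", "STANDARD", or "STATELESS" (illustrative labels)
    parent: Optional["Group"] = None


def effective_engine(group: Group) -> str:
    """Walk up the parent chain until a non-inherited engine is found."""
    current = group
    while current.engine == "INHERITED":
        if current.parent is None:
            # Assumption: a group with no parent falls back to Standard.
            return "STANDARD"
        current = current.parent
    return current.engine


root = Group("root", "STANDARD")
api = Group("api", "STATELESS", parent=root)
transform = Group("transform", "INHERITED", parent=api)

print(effective_engine(transform))  # STATELESS (taken from "api")
```

In this model, a group set to Inherited inside a Stateless parent also runs stateless, which is worth checking before placing stateful processors in a nested group.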

2. Process Group FlowFile Concurrency

Controls how input ports admit FlowFiles into the group.
It does not restrict how many FlowFiles can exist inside the group once admitted.

Three modes are supported:

Unbounded

The default.

Input ports admit FlowFiles as fast as downstream demand allows
Parallel processing
High throughput
Normal streaming behavior

Use this for most flows.

Single FlowFile Per Node

Only one FlowFile may be admitted into the process group at a time.

Behavior:

When the group is empty, an input port admits one FlowFile
While any FlowFile is inside the group (including in child groups), the group is “busy”
No additional FlowFiles are admitted until the group becomes empty again

Important details:

Inside the group, you can still split, branch, and generate FlowFiles
This setting only restricts incoming FlowFiles
Great for sequential or dependent operations

Use when:

You need each job to run to completion before the next one starts
You must ensure no overlapping jobs
A downstream action must run only once after all upstream work completes
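
The admission rule for Single FlowFile Per Node can be sketched as a small toy model. This is not how the product implements it; the class and method names are hypothetical and only mirror the behavior described above: one FlowFile is admitted when the group is empty, splitting inside the group is allowed, and nothing else enters until every FlowFile inside has finished.

```python
from collections import deque


class SingleFlowFileGroup:
    """Toy model of a group with Single FlowFile Per Node concurrency."""

    def __init__(self):
        self.input_queue = deque()  # FlowFiles waiting at the input port
        self.inside = []            # FlowFiles currently inside the group

    def enqueue(self, flowfile):
        self.input_queue.append(flowfile)

    def try_admit(self):
        """Admit one FlowFile, but only if the group is empty."""
        if self.inside or not self.input_queue:
            return None
        flowfile = self.input_queue.popleft()
        self.inside.append(flowfile)
        return flowfile

    def split(self, flowfile, parts):
        """Splitting is allowed; the children still count as 'inside'."""
        self.inside.remove(flowfile)
        children = [f"{flowfile}.{i}" for i in range(parts)]
        self.inside.extend(children)
        return children

    def finish(self, flowfile):
        """A FlowFile left through an output port or was terminated."""
        self.inside.remove(flowfile)


group = SingleFlowFileGroup()
for name in ("job-1", "job-2"):
    group.enqueue(name)

first = group.try_admit()          # "job-1" enters the empty group
blocked = group.try_admit()        # None: the group is busy
children = group.split(first, 3)   # three FlowFiles now inside
still_blocked = group.try_admit()  # None: children keep the group busy
for child in children:
    group.finish(child)
second = group.try_admit()         # "job-2" enters once the group is empty
print(first, blocked, still_blocked, second)
```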

Single Batch Per Node

Admits all FlowFiles currently waiting in the input queue, but then stops admitting new FlowFiles until the entire batch has completed.

Behavior:

Input ports pull every available FlowFile in the queue
Those FlowFiles are processed as a unit
The group must become empty before the next batch is admitted

This mode does not control output timing; that is handled by Outbound Policy.

Use this mode when:

You want batch entry but still want parallel work within the group
The downstream stage must operate on a full batch
You do not want to implement batch boundaries manually with SampleFlowFile or other logic

This mode is rarely used but is valid for controlled batch ingestion.
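
A toy model of Single Batch Per Node admission, under the same caveat as any simulation: the class and method names are hypothetical and the code only mirrors the behavior described above. Everything waiting in the input queue is admitted together, and FlowFiles that arrive later wait until the group is empty again.

```python
from collections import deque


class SingleBatchGroup:
    """Toy model of a group with Single Batch Per Node concurrency."""

    def __init__(self):
        self.input_queue = deque()  # FlowFiles waiting at the input port
        self.inside = set()         # FlowFiles currently inside the group

    def enqueue(self, flowfile):
        self.input_queue.append(flowfile)

    def try_admit_batch(self):
        """Admit everything currently queued, but only if the group is empty."""
        if self.inside or not self.input_queue:
            return []
        batch = list(self.input_queue)
        self.input_queue.clear()
        self.inside.update(batch)
        return batch

    def finish(self, flowfile):
        """A FlowFile left through an output port or was terminated."""
        self.inside.discard(flowfile)


group = SingleBatchGroup()
for name in ("a", "b", "c"):
    group.enqueue(name)

batch_1 = group.try_admit_batch()  # ["a", "b", "c"] enter together
group.enqueue("d")                 # arrives while the batch is in progress
blocked = group.try_admit_batch()  # []: the group is not empty yet
for name in batch_1:
    group.finish(name)
batch_2 = group.try_admit_batch()  # ["d"] forms the next batch
print(batch_1, blocked, batch_2)
```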

3. Process Group Outbound Policy

Controls when FlowFiles leave the process group through output ports.

Two modes:

Stream When Available (default)

FlowFiles exit the group as soon as they reach an output port.

Standard streaming behavior
Downstream work can begin before upstream work is done
Best for most flows

Batch Output

No FlowFiles leave any output port until all FlowFiles inside the group have completed processing.

Completion means every FlowFile inside the group has either:

Reached an output port, or
Been terminated

Key behaviors:

If even one FlowFile is stuck in a funnel or unhandled path, nothing is released
You must ensure every relationship in the group (and children) ultimately routes to termination or an output port
Batch Output is ignored if FlowFile Concurrency = Unbounded (because inputs can keep entering, “batch completion” is undefined)

Use Batch Output when:

You need to run a downstream operation only after all upstream work finishes
You want clear batch boundaries without manually implementing them
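
The release rule can be expressed as a small function. This is a minimal sketch under stated assumptions: the location labels ("processing", "queued", "at_output") and the setting strings are invented for the example, and terminated FlowFiles are modeled simply by removing them from the dictionary.

```python
def releasable(flowfiles, concurrency, outbound_policy):
    """Return the FlowFiles that may leave the group right now.

    flowfiles maps a FlowFile name to its location inside the group:
    "processing", "queued", or "at_output". Terminated FlowFiles are
    simply absent from the mapping.
    """
    at_output = [name for name, loc in flowfiles.items() if loc == "at_output"]

    # Batch Output only applies when concurrency is not Unbounded.
    batch_mode = outbound_policy == "BATCH_OUTPUT" and concurrency != "UNBOUNDED"
    if not batch_mode:
        return at_output  # stream: release whatever has reached an output port

    # Batch: release only when nothing is still queued or being processed.
    all_done = all(loc == "at_output" for loc in flowfiles.values())
    return at_output if all_done else []


state = {"a": "at_output", "b": "queued"}
held = releasable(state, "SINGLE_BATCH_PER_NODE", "BATCH_OUTPUT")  # []
ignored = releasable(state, "UNBOUNDED", "BATCH_OUTPUT")           # ["a"]

state["b"] = "at_output"
released = releasable(state, "SINGLE_BATCH_PER_NODE", "BATCH_OUTPUT")  # ["a", "b"]
print(held, ignored, released)
```

The first call shows the stuck-FlowFile case: a single FlowFile still queued inside the group holds back everything else.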

4. Combining Concurrency + Outbound Policy

These two settings are often used together for dependency control.

Example Pattern: “Do all the writes, then trigger report generation once.”

Option A:

Concurrency = Single FlowFile Per Node
Outbound = Stream When Available or Batch Output
Generate one FlowFile to the output port at the end

Option B:

Concurrency = Unbounded
Use SampleFlowFile with a correlation attribute to release exactly one trigger FlowFile after a batch completes

Be careful:

Sending multiple FlowFiles to an output port will trigger downstream logic multiple times
Batch Output holds everything until the last FlowFile finishes, so watch for forgotten funnels or unhandled relationships
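
The cautions above can be turned into a quick review checklist. The function below is a hypothetical helper, not part of any Clockspring tooling; the setting strings are illustrative, and it only flags the two situations this section warns about.

```python
def review_settings(concurrency, outbound_policy, flowfiles_to_output):
    """Return review notes for a planned process group configuration.

    flowfiles_to_output is how many FlowFiles the design sends to the
    output port per run of the group.
    """
    notes = []
    if outbound_policy == "BATCH_OUTPUT" and concurrency == "UNBOUNDED":
        notes.append("Batch Output is ignored while Concurrency is Unbounded.")
    if flowfiles_to_output > 1:
        notes.append(
            f"Downstream logic will run {flowfiles_to_output} times, "
            "once per FlowFile sent to the output port."
        )
    return notes


# Option A from the pattern above: one job at a time, one trigger FlowFile.
option_a = review_settings("SINGLE_FLOWFILE_PER_NODE", "BATCH_OUTPUT", 1)

# A risky design: Batch Output has no effect and the report runs five times.
risky = review_settings("UNBOUNDED", "BATCH_OUTPUT", 5)

print(option_a)
for note in risky:
    print(note)
```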

5. Ports Are Required Between Process Groups

A critical rule:

Input and Output Ports are required to move FlowFiles between process groups.

You cannot:

Connect processors across group boundaries
Draw lines into or out of nested groups
Bypass ports to move data

Ports enforce:

Clear flow boundaries
Versioned deployment
Isolation between modules

This is why concurrency and outbound policy operate relative to ports, not processors.
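
The port rule can be sketched as a connection check. This is a simplified model with hypothetical names: each component belongs to one group, an output port acts as a source only in the parent of its group, and an input port acts as a destination only in the parent of its group. A connection is allowed when both ends are visible in the same group. Ports on a group with no parent are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    kind: str   # "processor", "input_port", or "output_port"
    group: str  # name of the group that owns the component


def source_scope(component, parent_of):
    """Group in which this component can be the source of a connection."""
    if component.kind == "output_port":
        return parent_of.get(component.group)  # visible one level up
    return component.group


def destination_scope(component, parent_of):
    """Group in which this component can be the destination of a connection."""
    if component.kind == "input_port":
        return parent_of.get(component.group)  # visible one level up
    return component.group


def can_connect(src, dst, parent_of):
    scope = source_scope(src, parent_of)
    return scope is not None and scope == destination_scope(dst, parent_of)


# Hypothetical hierarchy: two child groups inside "root".
parent_of = {"extract": "root", "load": "root"}

fetch = Component("Fetch", "processor", "extract")
extract_out = Component("Extracted", "output_port", "extract")
load_in = Component("To Load", "input_port", "load")
write = Component("Write", "processor", "load")

print(can_connect(fetch, write, parent_of))          # False: crosses groups
print(can_connect(fetch, extract_out, parent_of))    # True: same group
print(can_connect(extract_out, load_in, parent_of))  # True: port to port
print(can_connect(load_in, write, parent_of))        # True: same group
```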

6. Quick Rules to Remember

Execution Engine = how the group runs (stateful vs stateless).
Concurrency = how FlowFiles enter via input ports.
Outbound Policy = when FlowFiles are released.
Single FlowFile Per Node enforces sequential, one-at-a-time processing.
Single Batch Per Node pulls all queued inputs at once.
Batch Output waits until all internal work is done before releasing.
Batch Output is ignored when Concurrency = Unbounded.
Ports are mandatory between process groups.

Documentation Reference

Execution Engine, FlowFile Concurrency, and Outbound Policy are explained in detail in the Clockspring User Guide.
You can right-click any process group and select View Documentation to open the in-tool docs panel, or view the full User Guide here:

https://www.clockspring.net/documentation/latest/user-guide.html
