When Clockspring slows down or crashes with an OutOfMemoryError, the most common reaction is to increase the JVM heap size.
Sometimes that buys time.
Most of the time, it does not fix the real problem.
What Increasing Heap Size Actually Does
Increasing heap size:
allows more objects to exist at once
delays garbage collection (GC) pressure
increases the time between GC cycles
That’s it.
It does not:
fix inefficient flow design
reduce object churn
solve blocked downstream systems
prevent unbounded queues
eliminate memory leaks in custom logic
A bigger heap changes timing, not behavior.
Why Bigger Heaps Often Make Things Worse
Longer GC Pauses
Larger heaps generally take longer to scan and compact, especially during a full collection.
Result:
fewer GC cycles
but potentially much longer stop-the-world pauses
increased risk of missed heartbeats
higher chance of node disconnects
This is why cluster instability sometimes gets worse after increasing heap.
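The link between pause length and missed heartbeats can be sketched in a few lines. This is a minimal illustration, not Clockspring's actual cluster logic: the function name, the pause values, and the 5000 ms heartbeat window are all hypothetical, and the pause durations are assumed to come from your own GC logs.

```python
from typing import List, Sequence


def pauses_risking_heartbeat(pause_ms: Sequence[float],
                             heartbeat_window_ms: float) -> List[float]:
    """Return the stop-the-world pauses long enough to cover a whole
    heartbeat window. While the JVM is paused, no application thread
    runs, so a heartbeat due inside such a pause cannot be sent on time.
    """
    return [p for p in pause_ms if p >= heartbeat_window_ms]


# Hypothetical pause durations (ms) read from a GC log.
observed_pauses = [120, 450, 8000, 300, 15000]

# Hypothetical heartbeat window (ms); check your own cluster settings.
risky = pauses_risking_heartbeat(observed_pauses, heartbeat_window_ms=5000)
print(risky)
```

If the list is non-empty after a heap increase and was empty before it, the larger heap is a likely contributor to the disconnects.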
Masking the Real Problem
A bigger heap can hide issues temporarily:
queues keep growing
memory pressure builds silently
the eventual failure is larger and harder to recover from
When it fails again, it fails harder.
Common Problems Heap Size Does Not Fix
Increasing heap does not solve:
excessive splitting of records
very large FlowFiles held in memory
large attributes carrying payload data
unbounded queues
aggressive retries
slow or unavailable downstream systems
disk I/O bottlenecks causing thread pileups
These are design and throughput problems, not heap size problems.
Why OOMs Usually Happen in Clockspring
Most OutOfMemoryErrors are caused by one or more of these:
too many FlowFiles in memory at once
large batches with no upper bounds
holding content in attributes
retry loops that never drain
downstream systems slowing while upstream keeps producing
Heap size only determines how long this takes to blow up.
When Increasing Heap Size Is Appropriate
Increasing heap can be valid when:
the flow design is sound
queues are bounded
downstream systems are healthy
memory usage stabilizes after GC
GC frequency is reasonable
In those cases, heap size tuning is optimization, not triage.
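One way to check the "memory usage stabilizes after GC" condition is to compare heap usage measured right after collections early and late in an observation window. The sketch below assumes you have already collected those post-GC readings (for example from GC logs); the function name, the sample values, and the 10% tolerance are illustrative assumptions, not recommended thresholds.

```python
from typing import Sequence


def post_gc_floor_is_stable(post_gc_used_mb: Sequence[float],
                            tolerance: float = 0.10) -> bool:
    """Return True if the average post-GC heap usage in the second half
    of the samples is no more than `tolerance` above the first half.
    A rising floor suggests retained data that GC cannot reclaim.
    """
    if len(post_gc_used_mb) < 4:
        raise ValueError("need at least 4 post-GC samples")
    mid = len(post_gc_used_mb) // 2
    early = sum(post_gc_used_mb[:mid]) / mid
    late = sum(post_gc_used_mb[mid:]) / (len(post_gc_used_mb) - mid)
    return late <= early * (1 + tolerance)


# Hypothetical readings in MB, taken after each collection.
steady = [1000, 1020, 990, 1010, 1005, 995]
climbing = [1000, 1200, 1400, 1600, 1800, 2000]

print(post_gc_floor_is_stable(steady))
print(post_gc_floor_is_stable(climbing))
```

A stable floor is a reasonable precondition for tuning heap size. A climbing floor points back to the flow design.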
Better Questions to Ask First
Before touching heap size, ask:
Are queues growing or draining?
Is backpressure activating?
Are FlowFiles accumulating?
Are retries piling up?
Is disk I/O slow?
Is GC happening more frequently over time?
If the answers point to pressure, fix that first.
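The first question, whether queues are growing or draining, can be answered from a handful of queue-depth readings taken at regular intervals. The sketch below fits a straight line through the samples and classifies the slope. The function names, the sample data, and the `flat_band` threshold are assumptions for illustration; how you collect the readings depends on your monitoring setup.

```python
from typing import Sequence


def slope_per_sample(values: Sequence[float]) -> float:
    """Least-squares slope of values against their sample index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def queue_trend(depths: Sequence[float], flat_band: float = 1.0) -> str:
    """Classify a queue as 'growing', 'draining', or 'steady'.
    `flat_band` is the slope (FlowFiles per sample) treated as noise.
    """
    if len(depths) < 2:
        raise ValueError("need at least 2 samples")
    slope = slope_per_sample(depths)
    if slope > flat_band:
        return "growing"
    if slope < -flat_band:
        return "draining"
    return "steady"


# Hypothetical queue depths sampled at a fixed interval.
print(queue_trend([100, 250, 400, 550, 700]))
print(queue_trend([700, 500, 300, 100, 0]))
print(queue_trend([50, 52, 49, 51, 50]))
```

A queue that keeps reporting "growing" across several observation windows is pressure, and more heap will not change that.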
The Right Mental Model
Think of heap like a buffer.
a small buffer overflows quickly
a large buffer overflows later
neither fixes a blocked drain
If data cannot move through the system, memory will eventually fill regardless of heap size.
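The buffer model reduces to simple arithmetic. The sketch below is a deliberate simplification: it treats the whole heap as available to in-flight data and assumes constant rates, which a real JVM does not guarantee. The function name and all numbers are hypothetical.

```python
def seconds_until_full(heap_mb: float,
                       inflow_mb_s: float,
                       outflow_mb_s: float) -> float:
    """Time until a buffer of `heap_mb` fills, given constant rates.
    Returns infinity when data leaves at least as fast as it arrives.
    """
    net = inflow_mb_s - outflow_mb_s
    if net <= 0:
        return float("inf")
    return heap_mb / net


# Blocked drain: 5 MB/s arrives, nothing leaves.
small_heap = seconds_until_full(4096, inflow_mb_s=5.0, outflow_mb_s=0.0)
large_heap = seconds_until_full(16384, inflow_mb_s=5.0, outflow_mb_s=0.0)

# Healthy drain: data leaves as fast as it arrives.
healthy = seconds_until_full(4096, inflow_mb_s=5.0, outflow_mb_s=5.0)

print(small_heap, large_heap, healthy)
```

Under these assumptions, quadrupling the heap moves the failure from roughly 14 minutes to roughly 55 minutes. Only unblocking the drain removes it.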
Summary
Increasing heap size:
delays failure
increases the risk of long GC pauses
hides design problems
rarely fixes root causes
In Clockspring, stability comes from:
bounded queues
sane batch sizes
healthy downstream systems
controlled retries
balanced flow design
Fix the pressure. Then tune the heap.