Why Clockspring Has High CPU but Low Throughput

Modified on Fri, 12 Dec, 2025 at 2:25 PM

One of the most confusing situations in Clockspring is seeing CPU near 100 percent while very little data is actually moving. 


This feels wrong, but it is a common and explainable pattern: high CPU with low throughput almost always means the system is busy dealing with pressure, not doing useful work.


This article explains where that pressure comes from.


The Key Idea

High CPU does not mean high progress.

CPU can be consumed by:

  • garbage collection

  • thread contention

  • retries

  • blocked I/O

  • coordination overhead

In these cases, the system is working hard just to stay afloat.
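
As a rough illustration, the Java sketch below pins one thread at full CPU for a second without moving a single record. Everything in it is hypothetical; tryDeliver stands in for any call to an unavailable downstream system.

    // A deliberately pathological sketch: maximum CPU, zero useful work.
    public class BusyButStuck {
        // Hypothetical stand-in for a downstream call that always fails.
        static boolean tryDeliver() { return false; }

        public static void main(String[] args) {
            long attempts = 0;
            long deadline = System.nanoTime() + 1_000_000_000L; // spin for one second
            while (System.nanoTime() < deadline) {
                if (!tryDeliver()) attempts++;                  // no sleep, no backoff
            }
            System.out.println(attempts + " attempts, 0 records moved");
        }
    }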


Most Common Causes

Disk I/O Bottlenecks

Disk is one of the most common hidden constraints.

When disks are slow or overloaded:

  • threads block waiting on reads or writes

  • context switching increases

  • CPU burns cycles coordinating stalled work

  • throughput collapses

This is common when:

  • content repositories are under pressure

  • disks are shared with other workloads

  • storage latency increases under load

CPU stays high, but data barely moves.
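
A quick way to test this suspicion is to time a synced write on the volume in question. The sketch below is a minimal Java probe; the default path is an assumption, and you would point it at the disk backing your content repository.

    import java.io.IOException;
    import java.nio.file.*;

    public class DiskProbe {
        public static void main(String[] args) throws IOException {
            // Hypothetical default path; pass the repository volume as an argument.
            Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp");
            Path probe = dir.resolve("latency-probe.tmp");
            byte[] block = new byte[1024 * 1024];               // 1 MiB test write

            long start = System.nanoTime();
            // SYNC forces the write through to the device, exposing real latency.
            Files.write(probe, block,
                StandardOpenOption.CREATE, StandardOpenOption.SYNC);
            long micros = (System.nanoTime() - start) / 1_000;

            System.out.println("1 MiB synced write took " + micros + " microseconds");
            Files.deleteIfExists(probe);
        }
    }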


Garbage Collection Dominating Runtime

GC can consume large amounts of CPU while doing no useful processing.

Symptoms:

  • CPU spikes in repeating cycles

  • throughput drops during spikes

  • heap usage does not stabilize

  • GC frequency increases over time

From the outside, it looks like the system is busy but stuck.
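
One way to quantify this is to sample how much wall-clock time the collectors consume. The Java sketch below uses the standard GarbageCollectorMXBean API; it assumes it runs inside the JVM being diagnosed, and in practice you would read the same beans over JMX or enable GC logging instead.

    import java.lang.management.GarbageCollectorMXBean;
    import java.lang.management.ManagementFactory;

    public class GcShare {
        public static void main(String[] args) throws InterruptedException {
            long windowMs = 10_000;                   // 10-second sampling window
            long before = totalGcMillis();
            Thread.sleep(windowMs);
            long after = totalGcMillis();
            double share = 100.0 * (after - before) / windowMs;
            System.out.printf("GC consumed %.1f%% of the last %d ms%n",
                share, windowMs);
        }

        // Sum collection time across all collectors in this JVM.
        static long totalGcMillis() {
            long sum = 0;
            for (GarbageCollectorMXBean gc :
                    ManagementFactory.getGarbageCollectorMXBeans()) {
                sum += Math.max(0, gc.getCollectionTime()); // -1 means unsupported
            }
            return sum;
        }
    }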


Thread Starvation and Contention

Clockspring relies on thread pools.

If threads are:

  • blocked on slow downstream systems

  • waiting on disk I/O

  • oversubscribed due to high concurrency

Then:

  • CPU is spent managing contention

  • little real work is completed

More threads can make this worse, not better.
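
A thread dump makes this visible. As a hedged sketch, the Java snippet below uses the standard ThreadMXBean API to count thread states in the current JVM; a large blocked-plus-waiting count relative to runnable threads points to contention rather than CPU-bound work. A jstack dump gives the same picture from outside the process.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ThreadStates {
        public static void main(String[] args) {
            ThreadMXBean threads = ManagementFactory.getThreadMXBean();
            int blocked = 0, waiting = 0, runnable = 0;
            for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
                if (info == null) continue;           // thread exited between calls
                switch (info.getThreadState()) {
                    case BLOCKED -> blocked++;
                    case WAITING, TIMED_WAITING -> waiting++;
                    case RUNNABLE -> runnable++;
                    default -> { }
                }
            }
            System.out.println("runnable=" + runnable
                + " blocked=" + blocked + " waiting=" + waiting);
        }
    }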


Downstream Systems Are the Bottleneck

Clockspring can only move data as fast as downstream systems allow.

If a database, API, or filesystem is slow:

  • retries increase

  • queues fill

  • processors wake up frequently

  • CPU usage climbs

Throughput drops because the bottleneck is external.
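
To rule the downstream system in or out, measure its latency directly rather than inferring it from Clockspring metrics. The Java sketch below times a single request with the JDK's built-in HttpClient; the endpoint URL is a placeholder for your real service.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;

    public class DownstreamProbe {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint; substitute the real downstream service.
            URI target = URI.create("http://downstream.example.com/health");
            HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
            HttpRequest request = HttpRequest.newBuilder(target)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();

            long start = System.nanoTime();
            HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
            long ms = (System.nanoTime() - start) / 1_000_000;

            System.out.println("status=" + response.statusCode()
                + " latency=" + ms + " ms");
        }
    }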


Tight Retry or Polling Loops

Some flows unintentionally create busy loops.

Examples:

  • aggressive retry logic with no backoff

  • short scheduling intervals on empty queues

  • rapid polling of unavailable endpoints

These patterns can consume CPU without moving data.
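
The usual fix is exponential backoff with jitter, so a failing call yields the CPU instead of spinning. The Java sketch below is illustrative only: trySend is a hypothetical stand-in for the real downstream call, and the initial delay and cap should be tuned to your flow.

    import java.util.concurrent.ThreadLocalRandom;

    public class Backoff {
        // Retry with exponential backoff and jitter instead of a busy loop.
        static boolean sendWithBackoff(int maxAttempts) throws InterruptedException {
            long delayMs = 100;                           // initial backoff (tunable)
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                if (trySend()) return true;               // success: stop retrying
                long jitter = ThreadLocalRandom.current().nextLong(delayMs / 2 + 1);
                Thread.sleep(delayMs + jitter);           // sleep instead of spinning
                delayMs = Math.min(delayMs * 2, 30_000);  // double, cap at 30 s
            }
            return false;                                 // give up; let backpressure work
        }

        // Hypothetical stand-in for the real downstream call.
        static boolean trySend() { return false; }

        public static void main(String[] args) throws InterruptedException {
            System.out.println("delivered=" + sendWithBackoff(5));
        }
    }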


Why Increasing CPU or Heap Doesn’t Help

In this situation:

  • CPU is not the limiting factor

  • memory is not the limiting factor

The limiting factor is usually:

  • disk latency

  • downstream capacity

  • flow design

Adding more CPU or heap just allows the system to struggle harder.


What to Check First

When CPU is high but throughput is low, check in this order:

  1. Disk usage and I/O latency

  2. Queue sizes and backpressure activity

  3. Garbage collection behavior

  4. Downstream system health

  5. Concurrency settings

These usually reveal the real bottleneck quickly.


Common Mistakes

  • Adding more threads to “push harder”

  • Increasing heap without reducing pressure

  • Restarting nodes repeatedly

  • Scaling out before fixing the bottleneck

These often amplify the problem.


Summary

High CPU with low throughput means Clockspring is busy managing pressure, not processing data.

The root cause is usually:

  • disk I/O limits

  • garbage collection pressure

  • thread contention

  • slow downstream systems
