Compaction Strategies, Performance, and Their Impact on Cassandra Node Density

This is the third post in my series on optimizing Apache Cassandra for maximum cost efficiency through increased node density. In the first post, I examined how streaming operations impact node density and laid the groundwork for understanding why higher node density leads to significant cost savings. In the second post, I discussed how compaction throughput is critical to node density and introduced the optimizations we implemented in CASSANDRA-15452 to improve throughput on disaggregated storage like EBS.

Now, we’ll dive into another critical aspect of Cassandra’s configuration that affects node density: compaction strategy. I’ve spent years analyzing the impact of different compaction approaches, and I’ve found, time after time, that picking the right strategy dramatically impacts performance, especially when coupled with efficient data modeling and a properly tuned system.

At a high level, these are the leading factors that impact node density:

  • Streaming Throughput
  • Compaction Throughput and Strategies (this post)
  • Repair
  • Query Throughput
  • Garbage Collection and Memory Management
  • Efficient Disk Access
  • Compression Performance and Ratio
  • Linearly Scaling Subsystems with CPU Core Count and Memory

Understanding Compaction’s Impact on Node Density

Compaction is Cassandra’s process of merging data from multiple SSTables into new SSTables. Sometimes this results in larger files, sometimes smaller files. This process is essential for maintaining read performance, reclaiming disk space from deleted data, and enforcing TTLs. However, the strategy you choose and how you configure it has profound implications for:

  1. Read Performance: How quickly your queries can retrieve data
  2. Write Amplification: How much additional I/O is generated from writes
  3. Space Amplification: How efficiently your data is stored
  4. Resource Consumption: How much CPU time and disk I/O is consumed by maintenance operations
  5. Node Scalability: The maximum amount of data a node can efficiently handle

When compaction is configured sub-optimally, it becomes a bottleneck that prevents nodes from handling more data. Even if you have plenty of disk space available, poorly configured compaction can lead to:

  • Constant background I/O consuming excessive system resources
  • Read latency spikes as queries have to check numerous SSTables
  • Excessive temporary space usage during compaction operations
  • Operational problems during repairs, streaming, and scaling operations

The Evolution of Compaction Strategies

Cassandra has historically offered three main compaction strategies, each with distinct characteristics and trade-offs:

Size Tiered Compaction Strategy (STCS)

The default strategy merges similarly-sized SSTables into larger ones. STCS initially offers excellent write performance, but will eventually lead to very large SSTables. This can result in the following issues:

  • Larger SSTables get compacted infrequently, because STCS requires multiple SSTables of similar size (see the example after this list).
  • They can’t take advantage of the streaming fast path, which I covered in detail in my previous post.
  • Higher read latency, as queries may need to check many SSTables. See the SSTables Per Read metric in nodetool tablehistograms.
  • Inefficient space reclamation for deleted data.
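
For reference, here’s what an explicit STCS definition looks like; min_threshold and max_threshold control how many similarly sized SSTables must accumulate before a compaction is triggered (the values shown are the defaults):

ALTER TABLE mykeyspace.foo 
    WITH COMPACTION = 
        {'class': 'SizeTieredCompactionStrategy', 
         'min_threshold': '4', 
         'max_threshold': '32'};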

Leveled Compaction Strategy (LCS)

Introduced to improve read performance, LCS organizes SSTables into levels. SSTables are created at a mostly fixed size, and above level 0, the strategy guarantees that no overlapping SSTables exist within a level. This means a partition exists in at most one SSTable per level, reducing the number of files needed for reads. Each successive level allows ten times as many SSTables as the previous one, and thus ten times the total space per level.

LCS provides:

  • Better read performance than STCS through reduced SSTables per query
  • More efficient space reclamation
  • Smaller, more manageable SSTable sizes

However, it comes at the cost of significantly increased read and write amplification during compaction, making it less suitable for write-heavy workloads.
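
For comparison, a typical explicit LCS definition looks like the following; sstable_size_in_mb sets the target for the mostly fixed-size SSTables described above (160MB is the default):

ALTER TABLE mykeyspace.foo 
    WITH COMPACTION = 
        {'class': 'LeveledCompactionStrategy', 
         'sstable_size_in_mb': '160'};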

TimeWindowCompactionStrategy (TWCS)

Designed specifically for time series data, TWCS groups SSTables by time windows and only compacts within those windows. I’ve written extensively and shared videos about how TWCS can help time series systems scale.
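
As a concrete example, a TWCS table bucketing data into daily windows might be defined like this (the one-day window is illustrative; size your windows for your retention period):

ALTER TABLE mykeyspace.foo 
    WITH COMPACTION = 
        {'class': 'TimeWindowCompactionStrategy', 
         'compaction_window_unit': 'DAYS', 
         'compaction_window_size': '1'};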

TWCS excels at:

  • Efficiently managing time-partitioned data
  • Reducing compaction overhead for time series workloads
  • Making TTL management more efficient

It’s possible to run pretty dense nodes with TWCS, especially if you aren’t repairing your data, as long as you aren’t adding nodes to the cluster often. This is because within a window TWCS uses STCS, so we find ourselves hitting the slow streaming path. TWCS works well if your topology is fairly static, but if you’re growing a business, that can be a dangerous assumption.

For years, choosing a compaction strategy meant accepting significant trade-offs. However, with Cassandra 5.0, we gained a powerful new option that changes everything.

UnifiedCompactionStrategy: The Game Changer

The introduction of UnifiedCompactionStrategy (UCS) in Cassandra 5.0 represents a massive paradigm shift in how we approach compaction. UCS offers the best aspects of all previous strategies while eliminating many of their downsides.

The key insight behind UCS is that it uses the density of data as the primary signal for how SSTables should be grouped, rather than relying on size (STCS) or levels (LCS). This density-based approach effectively combines the best aspects of both strategies while avoiding their respective drawbacks. By using density as our signal, we can emulate the workload pattern of STCS (tiering) or LCS (leveling) while keeping SSTable sizes manageable (1GB by default).
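
That target is configurable. As a minimal sketch, assuming Cassandra 5.0’s target_sstable_size option, you can set it explicitly:

ALTER TABLE mykeyspace.foo 
    WITH COMPACTION = 
        {'class': 'UnifiedCompactionStrategy', 
         'target_sstable_size': '1GiB'};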

Here’s why UCS is transformative for node density:

  1. Controlled SSTable Sizes: Unlike STCS, which can create extremely large SSTables, UCS gives you fine-grained control over SSTable sizing
  2. Improved Read Performance: By organizing data more efficiently, UCS reduces the number of SSTables that must be checked during reads
  3. Reduced Write Amplification: UCS minimizes the amount of rewritten data during compaction operations
  4. Better Space Management: More efficient handling of temporary space during compaction operations
  5. Improved Streaming Performance: As discussed in the previous post, smaller, controlled SSTable sizes dramatically improve streaming operations

These benefits directly translate to higher possible node density and lower operational costs.

As an additional benefit, changing UCS configuration parameters will not trigger as much compaction as you would see switching from STCS to LCS.

Migrating to UCS

If you’re currently using older compaction strategies, here’s how to migrate to UCS. These are deliberately simple examples. You can use cassandra-easy-stress (formerly easy-cass-stress, now officially part of the project) and easy-cass-lab to determine the optimal strategy for almost any workload. I’ve updated the project documentation with these examples as well.

From SizeTieredCompactionStrategy (STCS)

The scaling_parameters option allows us to configure the behavior of UCS. Using the T4 option gives us 4 SSTables per tier and emulates the behavior of STCS while avoiding massive SSTables. Here’s what that looks like:

ALTER TABLE mykeyspace.foo 
    WITH COMPACTION = 
        {'class': 'UnifiedCompactionStrategy', 
         'scaling_parameters': 'T4'};

One thing to note: while STCS is still the default strategy, it was almost always the wrong choice even prior to 5.0. Now that 5.0 is released, I can’t think of a single case where I would recommend it.

From LeveledCompactionStrategy (LCS)

As Cassandra has expanded in functionality, it’s used more and more as a general-purpose database where read performance matters, and LCS has served the project well for a solid decade. You can emulate its leveling behavior with UCS by using the L parameter. Using L10 gives us the same 10 SSTables per level behavior as LCS:

ALTER TABLE mykeyspace.foo 
    WITH COMPACTION = 
        {'class': 'UnifiedCompactionStrategy', 
         'scaling_parameters': 'L10'};

My experience with replacing LCS with UCS has been overwhelmingly positive, and I would move every workload currently using LCS to UCS immediately. Much like STCS, I can’t think of a use case where LCS would be a better choice.

From TimeWindowCompactionStrategy (TWCS)

For time series data, you’ll need a more detailed approach. Rather than using a single table with TWCS, consider a multi-table approach with UCS. I’ve covered this in my Massively Scalable Time Series video, so I won’t go into all the advantages of doing that here. The main takeaway here is that UCS can replace TWCS. Here’s one potential setting:

ALTER TABLE mykeyspace.foo WITH COMPACTION = {
  'class': 'UnifiedCompactionStrategy',
  'scaling_parameters': 'T4,T8',
  'base_shard_count': '8'
};

This is a little more complex than the previous examples. Let’s break down the new parameters:

  • 'scaling_parameters': 'T4,T8': This tells UCS to use tiering, but starting at our second level, we’ll allow 8 SSTables instead of 4. This lets us compact more aggressively in the first level while allowing the data to fan out once we hit the second. It reduces compaction overhead, which is good for tables with very high write throughput. It may not be a great fit if you have mixed reads and writes with a tight SLA; in that case, you’d be better served by multiple tables + UCS with L10 or L8 (see the sketch after this list). To get the optimal settings, you’ll still need to do some testing.
  • 'base_shard_count': '8': UCS shards the data on disk. By using more shards, we can perform more parallel compactions and scale write-heavy workloads more easily.
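
For the mixed-workload case mentioned above, here’s a sketch of the leveled alternative applied to one of those tables (the per-window table name foo_2025_06 is hypothetical):

ALTER TABLE mykeyspace.foo_2025_06 
    WITH COMPACTION = 
        {'class': 'UnifiedCompactionStrategy', 
         'scaling_parameters': 'L10', 
         'base_shard_count': '8'};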

We can also use unsafe_aggressive_sstable_expiration: true, shown below, which allows Cassandra to drop fully expired SSTables without performing any safety checks. It’s powerful, but use it with caution.
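
Building on the example above, the option slots into the same compaction map (depending on your version, a JVM flag may also be required to permit this unsafe option, as it is for TWCS):

ALTER TABLE mykeyspace.foo WITH COMPACTION = {
  'class': 'UnifiedCompactionStrategy',
  'scaling_parameters': 'T4,T8',
  'base_shard_count': '8',
  'unsafe_aggressive_sstable_expiration': 'true'
};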

Monitoring and Measuring Compaction Performance

Once you’ve changed your table’s compaction strategy, you should ensure everything is performing optimally. Monitor these key metrics:

  1. p99 Read Latency: If you have an SLO, first make sure you’re meeting your latency targets.
  2. SSTables Per Read: A direct measure of read efficiency.
  3. Pending Compactions: Should be close to zero during normal operations.
  4. Compaction Throughput: How quickly compaction tasks complete.

Regularly review these metrics and adjust your compaction configuration as needed.
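
For pending compactions, one lightweight check from cqlsh, assuming Cassandra 4.0 or later where the system_views virtual keyspace is available, is to query the active compaction tasks directly; pair it with nodetool tablehistograms for the SSTables Per Read numbers:

SELECT * FROM system_views.sstable_tasks;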

Conclusion

Compaction strategy selection and configuration is perhaps the single most important factor affecting Cassandra node density after streaming efficiency. By migrating to UnifiedCompactionStrategy and optimizing its configuration for your specific workload, you can dramatically increase the amount of data each node can efficiently handle.

As I mentioned above, I recommend UCS for all workloads. There isn’t a single use case where I would choose a different strategy. As Cassandra 5.0 becomes more widely adopted, I expect UCS to become the standard approach for most production deployments.

In the next post, we’ll take a look at repair and examine how it affects node density.

Remember, UCS is only available with Cassandra 5.0 or later. If you’re still running older versions, upgrading should be a priority on your roadmap!

If you found this post helpful, please consider sharing it with your network. I'm also available to help you be successful with your distributed systems! Please reach out if you're interested in working with me, and I'll be happy to schedule a free one-hour consultation.