UNDERSTANDING THE NUANCE OF COMPACTION IN APACHE CASSANDRA

Compaction in Apache Cassandra isn’t usually the first (or second) topic that gets discussed when it’s time to start optimizing your system. Most of the time we focus on data modeling and query patterns. An incorrect data model can turn a single query into hundreds of queries, resulting in increased latency, decreased throughput, and missed SLAs. If you’re using spinning disks the problem is magnified by time consuming disk seeks.