TWCS Part 2 - Using Before Cassandra 3.0

In our first post about TimeWindowCompactionStrategy, Alex Dejanovski discussed use cases and the reasons for its introduction in 3.0.8 as a replacement for DateTieredCompactionStrategy. In our experience switching production environments storing time series data to TWCS, we have seen the performance of many production systems improve dramatically.

The examples Alex gives for making use of TWCS work great for recent versions of Cassandra. However, a significant number of users are still using 2.0, 2.1, and 2.2. If you’re in this group, you can still use TWCS, but it’ll require a little extra work. Let’s take a look at how to achieve this.

First, you’ll need to clone Jeff Jirsa’s TWCS repo:

git clone https://github.com/jeffjirsa/twcs/
cd twcs/

Since compaction changes slightly between versions, you need to check out the branch of TWCS that corresponds to your version of Cassandra. For instance:

git checkout -t origin/cassandra-2.0
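If you build for several clusters, it can help to derive the branch from the version string rather than typing it by hand. The sketch below assumes the repo's branches follow the `cassandra-<major>.<minor>` pattern; only `cassandra-2.0` appears in this post, so confirm the others with `git branch -r` first. `twcs_branch_for` is a hypothetical helper name, not something shipped with TWCS.

```shell
# Hypothetical helper: map a Cassandra version (e.g. 2.1.16) to a TWCS branch name.
# Assumption: branches are named cassandra-<major>.<minor>; verify with `git branch -r`.
twcs_branch_for() {
    version="$1"
    case "$version" in
        2.0|2.0.*) echo "cassandra-2.0" ;;
        2.1|2.1.*) echo "cassandra-2.1" ;;
        2.2|2.2.*) echo "cassandra-2.2" ;;
        *)
            # Anything else is outside the range this post covers.
            echo "no TWCS branch known for Cassandra $version" >&2
            return 1
            ;;
    esac
}

# Example usage (run inside the cloned twcs/ directory):
#   git checkout -t "origin/$(twcs_branch_for 2.1.16)"
```

Failing loudly on an unknown version is deliberate: building against the wrong branch is harder to spot than a missing one.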

If you’re building for 2.1, you’ll need to cd one level deeper in order to build it:

cd twcs/TimeWindowCompactionStrategy

Then build the project:

mvn package

This will generate a JAR in your target directory. For example, after running the above, I’ve built TWCS for Cassandra 2.1:

jhaddad@rustyrazorblade ~/dev/twcs$ ls target/*jar
target/TimeWindowCompactionStrategy-2.1.12.jar

Move that JAR into your Cassandra lib directory. When you restart Cassandra, the JAR will automatically be added to your CLASSPATH, which makes the strategy available to use. On my laptop, it looks like this:

mv target/TimeWindowCompactionStrategy-2.1.12.jar ~/dev/apache-cassandra-2.1.16/lib
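Before restarting, it can be worth confirming that the JAR really contains the strategy class. This is a minimal sketch, assuming `unzip` is installed; `class_to_entry` and `jar_has_class` are hypothetical helper names, and the paths in the usage comment are the ones from my laptop.

```shell
# Convert a fully qualified class name into the entry path used inside a JAR.
class_to_entry() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Return 0 if the JAR ($1) lists the class ($2), non-zero otherwise.
jar_has_class() {
    # unzip -l lists archive entries; grep -F matches the path literally.
    unzip -l "$1" | grep -qF "$(class_to_entry "$2")"
}

# Example usage:
#   jar_has_class ~/dev/apache-cassandra-2.1.16/lib/TimeWindowCompactionStrategy-2.1.12.jar \
#       com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy \
#       && echo "TWCS class found"
```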

With TWCS installed, you can now switch your tables to use it via an ALTER TABLE command. To demonstrate, I’ve loaded up the killrweather dataset using cdm. The raw_weather_data table looks like a good candidate. If we want to group our data into windows of 12 hours, we can do the following:

ALTER TABLE raw_weather_data WITH compaction = {
    'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy',
    'compaction_window_unit': 'HOURS',
    'compaction_window_size': '12'
};
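If you have several time series tables to convert, generating the statement keeps the class name and quoting consistent. The following is a sketch only: `twcs_alter_cql` is a hypothetical helper, and the usage line assumes your `cqlsh` supports the `-e` option for executing a single statement.

```shell
# Hypothetical helper: print an ALTER TABLE statement that switches a table to TWCS.
# Arguments: table (optionally keyspace-qualified), window unit, window size.
twcs_alter_cql() {
    table="$1"; unit="$2"; size="$3"
    # CQL map literals use single quotes, so the format string is double quoted.
    printf "ALTER TABLE %s WITH compaction = {'class': 'com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy', 'compaction_window_unit': '%s', 'compaction_window_size': '%s'};\n" \
        "$table" "$unit" "$size"
}

# Example usage (assumes cqlsh -e is available in your version):
#   cqlsh -e "$(twcs_alter_cql killrweather.raw_weather_data HOURS 12)"
```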

If you’re on the wary side, you can change a single canary node to TWCS using JMX to see how it behaves. In general, this is a good way of testing a compaction strategy change in a production environment. Be aware that not all settings may be exposed via JMX in every version of Cassandra.

To demonstrate, I’ll be using jmxterm. From your shell, open up jmxterm:

java -jar jmxterm-1.0-alpha-4-uber.jar

You’ll get a nice prompt. Here you can change the domain and the bean we’ll be using:

Welcome to JMX terminal. Type "help" for available commands.

$>open localhost:7199
#Connection to localhost:7199 is opened

$>domain org.apache.cassandra.db
#domain is set to org.apache.cassandra.db

$>bean columnfamily=raw_weather_data,keyspace=killrweather,type=ColumnFamilies
#bean is set to org.apache.cassandra.db:columnfamily=raw_weather_data,keyspace=killrweather,type=ColumnFamilies

Next, you’ll set the compaction strategy and parameters (2.1.9 and up only). Take note of the double quotes in the JSON: compaction parameters over CQL are single quoted, but over JMX the JSON must be double quoted:

$>set CompactionParametersJson {"class":"com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy","compaction_window_unit":"HOURS","compaction_window_size":"12"}
#Value of attribute CompactionParametersJson is set to {"class":"com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy","compaction_window_unit":"HOURS","compaction_window_size":"12"}
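The interactive session above can also be replayed as a script, which is handy when you want to apply the same change to a second canary. This is a sketch under assumptions: `twcs_jmx_script` is a hypothetical helper, the host and port are the defaults used above, and the usage line relies on jmxterm's `-n` (non-interactive) option to read commands from standard input.

```shell
# Hypothetical helper: print the jmxterm commands that switch one table to TWCS.
# Arguments: keyspace, table, window unit, window size.
twcs_jmx_script() {
    keyspace="$1"; table="$2"; unit="$3"; size="$4"
    class="com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy"
    echo "open localhost:7199"
    echo "domain org.apache.cassandra.db"
    echo "bean columnfamily=$table,keyspace=$keyspace,type=ColumnFamilies"
    # JMX expects JSON here, so keys and values are double quoted (unlike CQL).
    printf 'set CompactionParametersJson {"class":"%s","compaction_window_unit":"%s","compaction_window_size":"%s"}\n' \
        "$class" "$unit" "$size"
}

# Example usage (assumes jmxterm's -n flag for non-interactive input):
#   twcs_jmx_script killrweather raw_weather_data HOURS 12 \
#       | java -jar jmxterm-1.0-alpha-4-uber.jar -n
```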

Unfortunately, if you’re using a version of Cassandra before 2.1.9, you won’t be able to set compaction parameters this way; you can only set the strategy class, which makes testing more difficult. We recommend upgrading to at least Cassandra 2.1.9 before changing compaction strategies via JMX. If you absolutely must test TWCS via JMX with an older version, you can change the default window (1 DAYS) to whatever you need by editing the defaults in the source, recompiling, and deploying the new JAR, and then switching the strategy:

$>set CompactionStrategyClass com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy
#Value of attribute CompactionStrategyClass is set to com.jeffjirsa.cassandra.db.compaction.TimeWindowCompactionStrategy

At this point, you should be comfortable with when to use TWCS and how to put it into production on versions of Cassandra that it didn’t ship with. Using JMX, you can switch individual nodes to TWCS to test its impact on your production cluster before running the ALTER TABLE statement. Running TWCS in prod already, have questions, or find this post useful? Please leave a comment!
