Cassandra 3.2 Overview
The 3.0 release of Apache Cassandra marked an important milestone. One of the biggest updates was CASSANDRA-8099, the JIRA to modernize the storage engine. It was also the first release in the new Tick Tock cycle, which lands a new release of Cassandra every month. Even .x numbers (such as 3.2) are feature releases, and odd .x numbers (such as 3.1) are bug fix releases. Cassandra 3.2, released about a week ago, is the first feature release following 3.0. This post will briefly cover the changes.
Better JBOD support
CASSANDRA-6696 improves JBOD in Cassandra by distributing data to disks based on token range rather than randomly. This should decrease the impact of disk failure by isolating failure to specific token ranges on a machine rather than all the token ranges that the machine is responsible for.
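To make the idea concrete, here is a minimal sketch of token-range-based disk placement. It is not Cassandra's actual code: it assumes the node owns a single contiguous slice of the signed 64-bit token space, and the helper names `disk_boundaries` and `disk_for_token` are made up for illustration. A real node, especially one using vnodes, owns many ranges and splits those across its data directories.

```python
from bisect import bisect_left

# Signed 64-bit token space, as used by the default Murmur3 partitioner.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def disk_boundaries(num_disks, lo=MIN_TOKEN, hi=MAX_TOKEN):
    """Return the inclusive upper token bound of each disk's slice.

    Hypothetical helper: assumes the node owns one contiguous range [lo, hi]
    and splits it into equally sized, contiguous slices, one per disk.
    """
    span = hi - lo
    return [lo + span * (i + 1) // num_disks for i in range(num_disks)]


def disk_for_token(token, boundaries):
    """Pick the disk whose slice contains the token (deterministic, not random)."""
    return bisect_left(boundaries, token)


boundaries = disk_boundaries(4)
for token in (MIN_TOKEN, -1, 0, MAX_TOKEN):
    print(token, "-> disk", disk_for_token(token, boundaries))
```

Because every token always maps to the same disk, losing one disk in this model affects only the tokens in that disk's slice, which is the isolation property described above.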
There’s a more thorough blog post on the DataStax Developer Blog about the improvements in JBOD. There’s also a follow-up JIRA, CASSANDRA-10540, that will partition data on each disk based on token range, which will hopefully improve data density, among other things.
CASSANDRA-9428 allows user-defined compression (including encryption) to work with hints. While it may seem minor at first, compression can make a big difference when writing to spinning disks, and encryption is often necessary with financial data, so this can end up being a big deal for a lot of users.
Improvements to index building
Improvements to aggregation functions
Casting has been added, making aggregations significantly more useful. Previously, taking the avg() of 1 and 2 would yield 1, since the output type matches the input type (similar to Oracle and SQL Server).
cqlsh:test> create table jon ( id int, val int, ts timestamp, primary key (id, val));
cqlsh:test> insert into jon (id, val) values (2, 1);
cqlsh:test> insert into jon (id, val) values (1, 2);
cqlsh:test> select avg(val) from jon;

 system.avg(val)
-----------------
               1
CASSANDRA-10310 adds support for CAST(), which lets us get results back in whatever type works best for us.
cqlsh:test> select avg(CAST(val as float)) from jon;

 system.avg(cast(val as float))
--------------------------------
                            1.5
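The difference between the two queries comes down to which type the arithmetic is done in. The sketch below mimics that with plain Python. The function names are hypothetical, and it illustrates the typing behavior only, not how Cassandra implements avg() internally. Python's `//` floors, so it matches integer truncation only for non-negative values.

```python
def avg_same_type(values):
    """Integer in, integer out: the fractional part is dropped."""
    # Assumption: non-negative inputs, where floor division equals truncation.
    return sum(values) // len(values)


def avg_as_float(values):
    """Cast each value to float first, mirroring avg(CAST(val as float))."""
    return sum(float(v) for v in values) / len(values)


vals = [1, 2]
print(avg_same_type(vals))
print(avg_as_float(vals))
```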