Cassandra Secondary Index Preview #1

If you’ve looked into using Cassandra at all, you probably have heard plenty of warnings about its secondary indexes. If you’ve come from a relational background, you may have been surprised when you were told to create multiple tables (materialized views) instead of relying on indexes. This is because Cassandra is a distributed database, and the impact of doing a query that hits your entire cluster is you lose your linear scalability. If you’re capped at 25K queries per second per server, it doesn’t matter if you have one or a thousand servers, you’re still only able to handle 25k queries per second, total.

That said, there’s times when you could use secondary indexes. If you really don’t know every query you’re going to execute ahead of time, or you have many permutations of the same query, they can be really beneficial. Secondary indexes are also perfectly reasonable if you know your partition key in advance, restricting the query to a single server. Lastly, these indexes can be very helpful in analytics workloads (Spark batch jobs) where you don’t have an SLA that’s measured in milliseconds.

Sadly, secondary indexes in Cassandra have been relatively inflexible. Again, if your background is with relational databases, it might surprise you to learn that indexes Cassandra can only be used for equality queries (think WHERE field = value). Meaning you can’t perform range queries such as WHERE age > 18. Lastly, there isn’t a query optimizer that can handle merging statements like WHERE age > 18 and age < 30 into a single predicate, evaluate OR conditions, or evaluate complex nested conditionals.

Some technical details before we go on…

The existing implementation of secondary indexes uses hidden tables as its underlying data structure. This is nice because it allows for code reuse but problematic in that it’s not really the right tool for the job. Reading from a secondary index on a node looks like this:

  1. Find the value in the hidden table we’re looking for
  2. Find each of the keys in the other sstables we need to satisfy query results by going through the internal read path

Sadly, going through the normal internal read path to find each row means looking at Bloom filters and partition indexes. In contrast, in other databases indexes are typically represented as tree structures with pointers to location on disk. This allows for features like efficient range queries with minimal overhead.

Maintaining indexes through hidden tables means they are going through a separate compaction process. Because of this, we can’t point directly to a locations on disk. Independently compacting sstables and indexes means the location of the data and the index information are completely decoupled. If the data is compacted, a new sstable is written, and our index is now incorrect. This means we can’t simply (and efficiently) point to a location on disk in an index because the location of the data can change.

Introducing SASI

SASI is the abbreviated name for SSTable Attached Secondary Indexes. They’re called this for a very good reason. SASI works by generating an index for each sstable, instead of managing the indexes independently. This allows for an interesting optimization - the indexes can reference offsets in the data file, rather than having to only reference keys. This means we can skip looking at bloom filters and partition indexes and go straight to our data which we know must be there. When sstables are compacted, a new index will be generated as well.

The SASI indexes are also not implemented as sstables. Instead, they are implemented as memory mapped B+Trees, which are an efficient data structure for indexes. This means we can easily get some nice features like range queries, which are often missed when coming from other databases.

A Quick Feature Comparison

I have some examples I’ve written using the Python driver. I’ve already done my imports and set up a keyspace that I’ll be using. I’m also using the Faker library to generate fake names and birth years.

I’ve created 2 tables, one with the old indexes and one with SASI.

table = """CREATE TABLE IF NOT EXISTS old_index (
            id uuid primary key,
            first_name text,
            last_name text,
            year_born int
            )"""
            
session.execute(table)

for field in ["first_name", "last_name", "year_born"]:
    session.execute("CREATE INDEX on old_index({})".format(field))

table = """CREATE TABLE IF NOT EXISTS sasi_index (
            id uuid primary key,
            first_name text,
            last_name text,
            year_born int
            )"""
            
session.execute(table)

for field in ["first_name", "last_name", "year_born"]:
    session.execute("""CREATE CUSTOM INDEX on sasi_index({}) 
            USING 'org.apache.cassandra.index.sasi.SASIIndex'
            """.format(field))

Here I insert 100 records into each table.

start = datetime.now() - timedelta(days=365*80)
end = datetime.now() - timedelta(days=365*20)

prep1 = """INSERT INTO old_index 
            (id, first_name, last_name, year_born) 
            values (?, ?, ?, ?)"""
prep2 = """INSERT INTO sasi_index 
            (id, first_name, last_name, year_born) 
            values (?, ?, ?, ?)"""

stmt1 = session.prepare(prep1)
stmt2 = session.prepare(prep2)

for i in range(100):
    data = [uuid.uuid4(), 
            faker.first_name(), 
            faker.last_name(), 
            faker.date_time_between_dates(start, end).year]
    
    session.execute(stmt1, data)
    session.execute(stmt2, data)

Let’s take a look at a simple query that will work on both tables, looking up all users born in 1981. It’s a simple equality search:

for x in session.execute("""select * from old_index 
                            where year_born = 1981 limit 3"""):
    print x.first_name, x.last_name, x.year_born

Declan Brakus 1981

The same query works with SASI, and we get the same results, as expected:

for x in session.execute("""select * from sasi_index 
                             where year_born = 1981 
                             limit 3"""):
    print x.first_name, x.last_name, x.year_born
    

Declan Brakus 1981

Above I mentioned range queries don’t work with existing indexes, let’s just be sure:

for x in session.execute("""select * from old_index 
                            where year_born > 1981 
                            limit 3"""):
    print x.first_name, x.last_name, x.year_born
InvalidRequestcode=2200 \[Invalid query\] message="No supported secondary index found for the non primary key columns restrictions"

Yikes, an exception with a stacktrace. This is kind of a bummer, we can’t use non-equality in our WHERE clauses with the old indexes. Let’s see how it works with SASI:

for x in session.execute("""select * from sasi_index 
                            where year_born > 1981 
                            limit 3"""):
    print x.first_name, x.last_name, x.year_born

Gilman Gottlieb 1995 Farrah Schowalter 1982 Janis Beahan 1985

Nice, we’ve verified SASI 2i works with inequalities.

In our RDBMS world, we usually have a LIKE clause available. LIKE normally scans entire text blocks for a string, using % as a wildcard. In Cassandra 3.4, LIKE has a slightly different behavior. It’s closer to MATCH AGAINST with MySQL, or the disgusting @@ / ts_vector / ts_query syntax in postgresql. LIKE in Cassandra allows us to search for indexed text, rather than doing some absurd full table scan across hundreds of billions of rows (hint: terrible idea).

for x in session.execute("""select * from old_index 
                            where first_name 
                            LIKE 'Jon%' 
                            limit 3"""):
    print x.first_name, x.last_name, x.year_born
InvalidRequestcode=2200 \[Invalid query\] message="first_name LIKE '<term>%' restriction is only supported on properly indexed columns"

OK, we kind of knew that would happen. You probably won’t be shocked to see SASI works with the LIKE keyword:

for x in session.execute("""select * from sasi_index 
                            where first_name LIKE 'J%' 
                            limit 3"""):
    print x.first_name, x.last_name, x.year_born

Janis Beahan 1985 Johny Schaefer 1957 Joyce McGlynn 1942

By default, the indexes that we create here are prefix indexes. There are other index types, CONTAINS and SPARSE. I’ll be covering those in a later blog post. Each index has options that can be provided to specify how it tokenizes and indexes fields, and if it is case sensitive or not.

When to use Secondary Indexes

Before you go running off throwing Secondary indexes on every field, it’s important to know that they still come at a cost. We haven’t changed the fact that querying a secondary index could mean querying almost every machine in your cluster, it’s just become a lot more efficient to do lookups. For frequently run queries, using materialized views (your own or managed by Cassandra) is a more efficient option. The same rules of Cassandra apply - model your tables to answer queries, not to satisfy some normal form.

In a later post, I’ll be examining SASI indexes in greater detail. It’s also likely some details will change along the way - this is a preview of a feature that’s about a month away from being released. I encourage you to clone the repo and build from trunk to try things out for yourself. I’m really looking forward to seeing the evolution of SASI indexes over the next few months.

Related Links

If you found this post helpful, please consider sharing to your network. I'm also available to help you be successful with your distributed systems! Please reach out if you're interested in working with me, and I'll be happy to schedule a free one-hour consultation.