Cassandra Summit Recap: Performance Tuning and Cassandra Training

Hello, friends in the Apache Cassandra community!

I recently had the pleasure of speaking at the Cassandra Summit in San Jose. Unfortunately, we ran into an issue with my screen refusing to cooperate with the projector, so my slides were pretty distorted and hard to read. While the talk is online, I think it would be better to have a version with the right slides as well as a little more time. I’ve decided to redo the entire talk via a live stream on YouTube. I’m scheduling this for 10am PST on Wednesday, January 17 on my YouTube channel. My original talk was done in a 30-minute slot; this one will be a full hour, giving plenty of time for Q&A.

My talk at the Cassandra Summit was titled “Cassandra Performance Tuning Like You’ve Been Doing It for Ten Years.” I’ve spent a lot of time fixing performance problems in a lot of different environments that you’ve probably interacted with on a regular basis. I’ve landed on a pretty useful set of observability tools I like to use, which I covered in this talk as well as in my blog posts on profilers, flame graphs and bcc-tools. I also covered two methodologies I follow (the USE Method and the OODA Loop) that help me stay focused on making progress when solving problems rather than guessing.

During the last minute I had the stage, I misheard a question from one of the attendees. For context, during the talk I had spent a bit of time highlighting a common mistake when designing dashboards: relying on metrics that cast your system in a better light than your customers are experiencing. I have seen a lot of teams rely on averages and percentiles like p75, which is a great way to miss underlying problems. I misheard the question as “what if it’s a little slow?”, when he was really asking “what about Little’s Law?”, from queuing theory. This is a great question and I plan on addressing it in a future post, as the minute or so I had to answer it wouldn’t have been enough to do the question justice.
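To make the point about averages and p75 concrete, here is a minimal Python sketch. The latency numbers are made up for illustration and don't come from any real cluster, and the `percentile` helper is a hypothetical nearest-rank implementation, not something taken from a monitoring tool.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds (illustrative only):
# 95 fast requests and 5 requests that hit a slow path.
latencies_ms = [2.0] * 95 + [500.0] * 5

mean_ms = sum(latencies_ms) / len(latencies_ms)
p75_ms = percentile(latencies_ms, 75)
p99_ms = percentile(latencies_ms, 99)

print(f"mean: {mean_ms:.1f} ms")
print(f"p75:  {p75_ms:.1f} ms")
print(f"p99:  {p99_ms:.1f} ms")
```

In this made-up data set, p75 sits at 2 ms and the mean at roughly 27 ms, while p99 is 500 ms. A dashboard showing only the first two would look healthy even though 1 in 20 requests takes half a second.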

Production Ready Cassandra Training

Now for the big news! Later this year (Q2 2024), I’m launching a training program for Apache Cassandra in partnership with AxonOps! I chose AxonOps because they offer what I consider the best monitoring solution for Apache Cassandra, a critical component for any serious deployment. Partnering with them will allow me to focus on creating high quality training material rather than reinventing an operational and monitoring platform. This program will draw on what I’ve learned working in large environments, leveraging the technical tooling of AxonOps. It’s designed to accelerate your learning curve, whether you’re just starting or looking to deepen your Cassandra expertise. This will be a hands-on course, teaching you the best practices for operating Cassandra and how to solve problems you’ve never seen before. I’m really excited to deliver this. If you’re interested in this program, I’ll be opening up early registration in the near future to folks on my mailing list, so you’ll want to sign up if this sounds like something you want to be a part of.

2024 is going to be a great year for the Cassandra community! With the upcoming 5.0 release delivering features such as SAI, Vector Search, better operational tools and improved performance, Apache Cassandra continues to be the best choice for always-on distributed databases with low latency requirements. I’m looking forward to meeting more of you and helping you on your Cassandra journey.

If you found this post helpful, please consider sharing it with your network. I'm also available to help you be successful with your distributed systems! Please reach out if you're interested in working with me, and I'll be happy to schedule a free one-hour consultation.