After almost five years of using Pelican as my static site generator, I’ve migrated to the Hugo tool. While I enjoyed Pelican and it’s flexibility, it’s performance started to bother me when building a site from scratch. Depending on what else was running on my laptop, a full build could take 15-20 seconds. This isn’t the end of the world, but in comparison Hugo takes less than 100 milliseconds. If it was simply a matter of build time, I may not have really cared that much, but I’ve been using Hugo to build the site and documentation for Reaper, the open source repair tool we maintain at The Last Pickle.
- Edit: The source code for this post is located on GitHub Sometimes when I travel I end up trying to learn something completely new. For a while I was playing with Rust, Capn Proto, Scala, or I’d start a throwaway project at an airport and just tinker. My passion is and has always been databases. I’ve maintained this blog for roughly a decade, starting with MySQL for the first part of my career but moving to Apache Cassandra several years ago, and am now a committer and member of the PMC.
- If you were to take a look at my blog, you’d think I’d flipped a table and left the tech industry. Not the case at all. I’m still writing, but less frequently, and on the TLP blog. I intend to start writing here again, but the material will likely focus around topics other than Cassandra, since I’m already writing about it elsewhere. Here are the posts I’ve authored in the last 6 months or so:
- Instacluster announced on the Apache Cassandra user list that they are making their supported branch of the Cassandra 3.7 tick tock release publicly available (see GitHub repo). Bug fixes that go into 3.8, 3.9, etc will be back ported to the Instacluster LTS. You can read the blog post about the decision. Some people I’ve talked to are concerned about having different commercial entities doing long term supported releases, and this concern is understandable.
- I haven’t blogged in a while, which is a bummer because I was determined to write an article a week for the entire year. I haven’t even come remotely close to that goal. I’ve recently switched jobs from DataStax to Consulting with The Last Pickle, which has been pretty hectic. Add to that 3 presentations at the Cassandra Summit and the end result is very little time for personal projects.
- I’ve spent the last 4 years working in the big data world with Cassandra because it’s the only practical solution if you have a requirement to scale out, uptime is a priority, and you need predictable performance. I’ve heard different ways of describing where Cassandra fits in your architecture, but I think the best way to think of it is close to your customer. Think of the servers your mobile apps communicate with or what holds your product inventory.
- One of the problems of learning a new database is getting used to a new way of data modeling. PostgreSQL looks different from Redis, which is different from a graph, and is different from Cassandra. Cassandra Dataset Manager aims to reduce the time spent in a frustrating trial and error process trying to learn proper data modeling techniques for Apache Cassandra and Datastax Enterprise by providing curated data models which have been designed by professionals with years of experience.
- I posted a short preview showing off some of the work I’ve been doing recently on Cassandra Dataset Manager, a tool to help new Cassandra users learn how to create proper data models. There’s documentation, but it’s still under heavy development.
- Apache Cassandra 3.3 was released last week. As per the Tick Tock release schedule, this release is focused on bug fixes and no new features were introduced. For practical purposes, consider this a bug fix release to Cassandra 3.2. All told there were almost 50 bugs fixed in this release. Many of the bugs fixed in this version also applied to Cassandra 3.0.3, which was also released last week. With any Cassandra release, it’s a good idea to read the Changelog and News before upgrading.
- If you’ve looked into using Cassandra at all, you probably have heard plenty of warnings about its secondary indexes. If you’ve come from a relational background, you may have been surprised when you were told to create multiple tables (materialized views) instead of relying on indexes. This is because Cassandra is a distributed database, and the impact of doing a query that hits your entire cluster is you lose your linear scalability.