MySQL is a popular choice for new projects. It’s a flexible database that’s easy to set up and start querying. There’s loads of documentation, examples and frameworks it works with, such as Wordpress, Pandas, Ruby on Rails, and Django. From the above paragraph it reads like a pretty fantastic database, and at small scale it can be great. The problem arises when you need to scale past a single server or have high availability needs.
- I have grown increasingly frustrated with the world as people have become more and more convinced that “schema-less” is actually a feature to be proud of (or even exists). For over ten years I’ve worked with close to a dozen different databases in production and have not once seen “schemaless” truly manifest. What’s extremely frustrating is seeing this from vendors, who should really know better. At best, we should be using the description “provides little to no help in enforcing a schema” or “you’re on your own, good luck.
- Amazon announced high I/O instances today. This is huge for anyone with a database larger than available memory, as it’s been a complete nightmare dealing with EBS up till now. Now your Cassandra, MongoDB, MySQL, or whatever your using should be able to perform well without requiring keeping your entire dataset in memory. With each instance you get 2x1TB of disk. In this tutorial I’ll be setting it up as a RAID0 to get a single 2TB disk which should deliver excellent performance.
- I decided to take a look at Drizzle today and was encouraged by what I saw. Here’s my favorite part: There is no UNSIGNED (as per the standard). * There are no spatial data types GEOMETRY, POINT, LINESTRING & POLYGON (go use Postgres). * No YEAR field type. * There are no FULLTEXT indexes for the MyISAM storage engine (the only engine FULLTEXT was supported in). Look at either Lucene, Sphinx, or Solr.
- Splitmytab.net is finally for the public to check out. Splitmytab is a bill splitting and IOU system for friends. It uses facebook’s login, so you won’t need to put in anyone’s emails, names, or get people to sign up for an account. It’ll automatically keep balances of who owes who, so you can keep a running tab with friends and always know who’s buying the next case of beer. Please note: I’m not a designer, so there’s a few rough corners, but what’s there is simple and it works.
- I ran into an issue installing the MySQLdb module. >>> import MySQLdb /Library/Python/2.7/site-packages/MySQL_python-1.2.3-py2.7-macosx-10.7-intel.egg/_mysql.py:3: UserWarning: Module _mysql was already imported from /Library/Python/2.7/site-packages/MySQL_python-1.2.3-py2.7-macosx-10.7-intel.egg/_mysql.pyc, but /Users/jhaddad/Downloads/MySQL-python-1.2.3 is being added to sys.path Traceback (most recent call last): File "", line 1, in File "MySQLdb/__init__.py", line 19, in import _mysql File "build/bdist.macosx-10.7-intel/egg/_mysql.py", line 7, in File "build/bdist.macosx-10.7-intel/egg/_mysql.py", line 6, in __bootstrap__ ImportError: dlopen(/var/root/.python-eggs/MySQL_python-1.2.3-py2.7-macosx-10.7-intel.egg-tmp/_mysql.so, 2): Library not loaded: libmysqlclient.18.dylib Referenced from: /var/root/.python-eggs/MySQL_python-1.2.3-py2.7-macosx-10.7-intel.egg-tmp/_mysql.so Reason: image not found I fixed it by doing the following:
- Over three years ago I wrote about how you cannot use a stored procedure in a subquery. Well, it’s 2010, and I’m still annoyed by this and a handful of other things. I was just working today on a report consisting of a series of queries, taking about a minute to generate. Some of the data would be created in a temporary table and queried against multiple times for performance reasons, and ultimately spit out into a CSV file for someone to examine later.
- Looking for instances particular column in a large schema can be a pain. Fortunately the information schema makes this pretty easy, if your columns have a consistent naming convention. SELECT table_schema, table_name, column_name FROM information_schema.columns WHERE column_name LIKE '%some_name%'; Now, if we want to wrap this up into an easy to use stored procedure, we can do something like this: drop procedure find_column; delimiter // CREATE PROCEDURE find_column(c varchar(255)) begin SET @a = CONCAT("%", c, "%"); SELECT table_schema, table_name, column_name, column_type FROM information_schema.
- I don’t remember having to do this before, but I had compiled MySQL 5.1.34 recently on my dev box (os x) and I saw a warning on a create table statement. It turns out InnoDB was not enabled (or even listed in the list of storage engines. Before: mysql> show engines; +————+———+———————————————————–+————–+——+————+ | Engine | Support | Comment | Transactions | XA | Savepoints | +————+———+———————————————————–+————–+——+————+ | CSV | YES | CSV storage engine | NO | NO | NO | | MRG_MYISAM | YES | Collection of identical MyISAM tables | NO | NO | NO | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | NO | NO | NO | | MyISAM | DEFAULT | Default engine as of MySQL 3.
- Peeping into memcached. Really interesting read about how to examine what’s stored in memcached. Peep uses ptrace to freeze a running memcached server, dump the internal key metadata, and return the server to a running state. If you have a good host ejection mechanism in your client, such as in the Twitter libmemcached builds, you won’t even have to change the production server pool. The instance is not restarted, and no data is lost.