Replication Issues - Duplicate Key Errors (1062)

We were using replication to deal with certain queries that were producing table scans. I realize this is not a great long term solution but we were migrating a web site that was set up this way, so it wasn’t really a choice.

We had a database that was a mix of InnoDB and a few MyISAM tables. The MyISAM tables were used for fulltext searches.

To get a database dump, we were using the command mysqldump --all-databases --single-transaction --master-data=1 > dumpfile.sql

We’d then import the dump into a DB slave. When we’d bring a slave up and reply the log, we’d get a duplicate key error. After a few times, we noticed it was always on the MyISAM table. This is because MyISAM does not use transactions, and –single-transaction does not place a read lock on tables. Data can be inserted into the table during the backup but before that table is dumped.

In conclusion, if you’re using –single-transaction and you’ve got MyISAM tables, you could be inserting the row while you’re doing the dump, when when the log is replayed you’ll get an error, because the row is already there.

From the manual:

–single-transaction

This option issues a BEGIN SQL statement before dumping data from the server. It is useful only with transactional tables such as InnoDB and BDB, because then it dumps the consistent state of the database at the time when BEGIN was issued without blocking any applications.

When using this option, you should keep in mind that only InnoDB tables are dumped in a consistent state. For example, any MyISAM or MEMORY tables dumped while using this option may still change state.

The –single-transaction option and the –lock-tables option are mutually exclusive, because LOCK TABLES causes any pending transactions to be committed implicitly.

While a –single-transaction dump is in process, to ensure a valid dump file (correct table contents and binary log position), no other connection should use the following statements: ALTER TABLE, DROP TABLE, RENAME TABLE, TRUNCATE TABLE. A consistent read is not isolated from those statements, so use of them on a table to be dumped can cause the SELECT performed by mysqldump to retrieve the table contents to obtain incorrect contents or fail.

This option is not supported for MySQL Cluster tables; the results cannot be guaranteed to be consistent due to the fact that the NDBCluster storage engine supports only the READ_COMMITTED transaction isolation level. You should always use NDB backup and restore instead.

To dump large tables, you should combine this option with –quick.

This error was brought up in this bug report with mysql.

If you found this post helpful, please consider sharing to your network. I'm also available to help you be successful with your distributed systems! Please reach out if you're interested in working with me, and I'll be happy to schedule a free one-hour consultation.