MySQL Forks

Forks:
1. Oracle
2. Drizzle
3. MariaDB
4. Percona
5. Webscale
(these are the main ones at least)

MySQL is open-source.  People, usually former MySQL employees, have branched off of the open source MySQL codebase with their own versions, usually offering some degree of compatibility with the original product.  From my experience, MariaDB and PerconaDB are forks that can be considered drop in replacements for MySQL while adding functionality and performance.

MySQL (Oracle)

This is the original MySQL database as well as new development since Oracle’s purchase of the product. There is still a “community” edition however Oracle seems committed to differentiating community(free for non-embeded uses) and enterprise(the one you always pay for) licenses by concentrating its development of advanced features included in the enterprise version. The community edition is just as good if not better than its ever been however do not expect modern “big-data” style enhancements to come from Oracle for free in the future.  Basically, if you are dealing with 20 million rows or less in your largest tables and don’t expect to need advanced features such as clustering or raw backups, the original community edition of this product as Oracle offers it is a fine way to go.  Anything beyond this will require either an investment in Oracle Enterprise technology or consideration of another fork.

Drizzle

Drizzle is open source and was forked from the  6.0 development branch of the MySQL.  It has a client/server architecture and works with standard SQL.  Drizzle is distributed under the the GNU General Public License with certain drivers and such issued under the BSD license.

MariaDB

MariaDB is a fork designed to offer a higher performance drop in replacement for the MySQL product offered by Oracle. I would say it is best know for its advanced query optimizer which really speeds things up on certain types of queries without the user really having to worry about what has changed.  It was forked by some of the original developers of MySQL who were worried about how the Oracle acquisition would turn out.  The plan for Maria is to strive for as high of a level of compatibility as possible including all commands and functions.  While you can use InnoDB with Maria, the default storage engine is Percona’s XtraDB which offers quite a few performance enhancements and fixes for particular types of queries that InnoDB did not seem to like.

Percona Server

Percona Server is another fork aiming for high compatibility with official MySQL releases.  In addition, they have focused on performance enhancements and monitoring tools to aid in administration and query optimization.  Like MariaDB, Percona server uses XtraDB.  Percona generously offers for free lots of scaling, monitoring and backup tools rivalling Oracle’s enterprise level products which are, of course, not free.   My usual product choice as of late for a 1-node product is to install MariaDB with XtraDB tables and using Percona enterprise tools against it when needed for optimization and replication.

WebScaleSQL

WebScaleSQL is a branch developed in a partnership between Facebook, Google, LinkedIn and Twitter for the purpose of handling very large scale data sets horizontally scaled across ridiculous amounts of commodity hardware.   All of these companies being faced with similar challenges were wise to pool their efforts in the development of this branch.   WebScale is also GNU and you can access the source on github.

 

MySQL Engines

Engines:
1. MyISAM
2. InnoDB and XtraDB
3. Memory
4. Merge
5. Archive
6. Federated
7. NDB

MySQL supports several different types of Table Engines also known as “Table Types”. A database can have its tables being a mix of different table engine types or all of the same type.
Most common by far are MyISAM vs InnoDB
MyISAM is older and less sophisticated of the two.  In some ways its actually down right primitive, however, you I still run in to MyISAM tables a lot so it’s good to be familiar with them.
InnoDB improves on MyISAM by offering modern features such as:
1. It is fully transactional
2. Has automated crash recovery using logged transactions which we will discuss later
3. And, most importantly, (at least to me) row level locking. MyISAM is limited to table level locking.

 

MySQL uses a pluggable storage engine architecture that enables storage engines to be loaded into and unloaded from a running MySQL server.

The default storage engine for MySQL as of 5.5 was InnoDB.  It is ACID compliant which basically means that it is transactionally safe.  It has configurable rollback as well as crash recovery capability.  It functions much better than earlier architectures such as MyISAM in dw and reporting uses due to it’s non-locking reads.  It supports foreign key constraints which is of course important (just don’t over-do them because they are also a pain) and uses clustered indexes speed up queries based on primary keys.

Percona XtraDB is an enhanced version of the InnoDB storage engine, designed to better scale on modern hardware, and including a variety of other features useful in high performance environments. It is completely backwards compatible with InnoDB.  I have yet to run in to an issue in this regard and would swap engines in production without even running a bunch of tests.  It just works.  Like I mentioned before, XtraDB is the default in Percona and Maria.   However, you can still use InnoDB if you want to in Maria (not sure about Percona)

To skip XtraDB in MariaDB add this to my.cnf

[mysqld]

ignore_builtin_innodb

plugin_load=innodb=ha_innodb.so

# The following should not be needed if you are using a mariadb package:

plugin_dir=/usr/local/mysql/lib/mysql/plugin

MyISAM: The default storage engine prior to MySQL 5.5.  If you run in to MyISAM tables, you might want to consider upgrading them.  You can do this a table at a time as you can run MyISAM and InnoDB/XtraDB tables in the same database.  In some cases, MyISAM actually seems to perform better, however this is not usually the case (table level locking yuck) and furthermore, when things go wrong with MyISAM, they can really go wrong.  Minor corruption/failure events which more modern engines handle in stride will send you searching for backups with MyISAM.

Memory Engine: This used to be called the Heap engine.  Basically, as you would expect, it stores everything in RAM similar to something you would write yourself using hashmaps in java.  This can really really speed things up in certain situations, however, limits the amount of data you can deal with in one table quite a bit.

Merge: Allows grouping of identical MyISAM tables so they can be referenced at the same time.  I haven’t really run in to a need for this (some uses in datawarehousing and I guess transactional safety is less of an issue in reporting)

Archive: Designed for large data sets that do not change often.

Federated: Allows for databases spread across many machines to be viewed virtually as one.  I’ve never tried it this way.

NDB (NDBCLUSTER):  Another engine designed to spread a database across multiple machines.

Understanding MySQL Performance Bottlenecks and Logs

Bottlenecks
1. Hardware
2. Software
a. Configuration
b. Tools and Procedures

Informational Logs
1. Slow query log
2. General log
3. Error log

In deciding the best approach to scaling your system, one of the most important concepts is that of bottlenecks.  Bottlenecks are the limitations of your current system that have been reached first and prevent the maximization of other resources as the load on your system increases.  So, obviously, the identification of bottlenecks is critical to knowing what improvements, if any, will allow your system to perform faster and handle more data.  In this section we will discuss how to analyse your system to identify these bottlenecks and the types of solutions that certain bottlenecks suggest to us.

Informational Logs

MySQL has a few different logs that can be utilized to identify bottlenecks and monitor server status.  Each of these is enabled and managed through the MySQL configuration file, my.cnf.  This file in linux will most likely be located in /etc/mysql or just /etc.  On windows, look for my.ini rather in the windows directory, c:, or the MySQL instalation directory.  MySQL actually looks for the file in this order so make sure you modify the one that it finds first.  The first type of log we will discuss is the slow query log.  To enable this log in your my.cnf file add this in the [mysqld] section.log_slow_queries = 1;

slow_query_log_file = <path to slow log>/mysql-slow.log

Additionally you can specify long_query_time=<some number of seconds> to tell it how long a query has to run for you to consider it slow.  Other options are available in the mysql documentation.  One of the most helpful is log-queries-not-using-indexes.  Monitor this log over a period of time to identify indexes that need to be added and queries being submitted that need to be optimized.  MySQL is fast so long as the amount of data being operated upon at any given time stays within memory.  Once swapping to disk begins, it’s performance declines dramatically.  The size of your available ram in relation to the amount of data being processed is a common source of bottlenecks.  Using this log to identify the queries that are taking a lot of time will give you hints as to whether this type of paging to disk is going on.  Increasing ram and adjusting queries to minimize paging will allow you to process much larger tables in a reasonable amount of time.

The general log is enabled by adding general_log_file=<path to general log/mysql.log or whatever file name you wish.  This contains information about all queries that have been run including selects and who, that is, which database user, attempted to execute them.  Whatever directory you chose for your logs, it is usually a good practice to put them all in the same place make sure that mysql has the correct permissions necessary to write to that directory.  Otherwise, it is not critical to name or place these files in any particular manner.  Just choose a strategy that works for you.

Finally, you can log errors with the error log by adding log-error=<path to error log>/mysql-error.log  This, as can be expected, logs errors and if configured to, warnings generated by the database.  Otherwise, errors will be logged to /var/syslog in linux.  The error log helps in diagnosing startup and shutdown issues as well as failures and warnings that impact performance and stability.  When dealing with large tables, disk space limitations and slow queries due to memory issues, failures resulting in logged errors will be common until your system is properly tuned.

Use these logs to identify the source of trouble and once issues have been resolved and everything is running smoothly, you can turn them off if you like.  After all, logging is an expensive operation too.  With MySQL, disk space issues related to operational and informational logs are more common than those due to actual data storage.  Only enable logging that you need. Due to the wide range of errors and warnings that can be logged in these files, I will not go into detail regarding the interpretation of any particular one.  When looking for the cause of an error or performance problem, check the logs and search on google for more information.  MySQL is open source and the internet will have answers to most any problem you encounter.