Friday, February 19, 2010

MySQL Replication Explained

One of the many features of MySQL to keep your database live and productive is MySQL Replication.

Note: this entry will cover asynchronous replication

So whats this technology all about anyways?

Replication is data that is reproduced in one or more locations. It consists of two or more servers, normally, in its simplest form, one master server and its slave server(s).

Whats so good about replicating my data?

Replication can be used to solve data availability problems during backup practices, high traffic on servers, data corruption problems, and geographic distances between data.

Backup Scenario:

Ever had to backup from a table that is currently in use. You have to lock the table so no reads occur (InnoDB storage engine also blocks COMMIT statements) at the time you try to backup. So you have to figure when your traffic load has decreased picking a time when its most convenient for your users. When you have figured this out only then can you place a lock on your tables that you are going to backup and remember to unlock them thats always important to some lol. Although you could automate this with a script, another way to do it is run your backup on your slave server.

Scaling Out Scenario:

This is useful when your server is receiving large amounts of traffic daily. Typically found on a dynamic web application that received alot of hits concurrently and relies on data intensely. When data traffic is on high demand, the performance on the backend usually struggles with lots of reads and writes to the disk. In this instance replication could split the reads and write up. The DML statements like INSERT, UPDATE, DELETE commands would be handled on the master server, while the SELECT statements would be handled entirely by the slave server(s). This gives off the affect of more performance because you are using not one but multiple CPUs to cut the load (depending on how many slaves you have up and running). This makes for efficient use of your servers. Dedicating one to write to the disk while the other(s) read from there local copy of that data.

Corruption Scenario:

Ever deal with corruption? It can leave you debilitated. If you dont have an up to date backup that could mean data loss for that period in time. It can mean loss of productivity, user-base, and revenue until the database is restored.

If you have a slave server you could use that database as a fail over to alleviate the problem. Changing the slave to the master and vice versa. The beauty of asynchronous connections.

Geographic Location Scenario:

This scenario is with the thought that your networks are continuously congested or have a connection that isn't ideal for the data transmitted between the two locations. For this reason replication would be best; rather than pulling data from that location you would have a slave with the same data to do your queries on.

How does it work?

Its all about logging. MySQL Replication uses logs to keep track of changes on the master before they are ported to the slave(s). The master's log is called the binlog. It holds the changes to the database as the name suggests in a binary format. The slave server also contains a log called the relay log. It contains a local copy of the changes the slave needs to keep up speed with the master. Both logs have the position so the slave could tell where it left off and begin its recording of events.

More in depth the process includes 3 threads, the I/O thread, Binlog Dump and the SQL thread.

  • I/O Thread - the START SLAVE statement is issued on the slave server. Connects to the master and requests an updated binary log.
  • Binlog Dump - Sends the binary log contents to the slave. 
  • SQL Thread - Slave begins to read and execute the updates present in the relay logs. 

Slave I/O thread reads the updates that the master Binlog Dump sends and copies them to a local file called the relay logs in the slaves data directory then executes them in the SQL thread.

The slave also creates 2 additional files called the status files which contain information like the output of SHOW SLAVE STATUS statement and can survive after shutdown. Upon start up, the slave reads these two files to determine where it left off reading the binlog files and the process begins again.

Initially you want your your master and slave database servers to be identical, so you place a lock on the master and take a copy of the database you want to replicate.

Good practice:

A word of thumb you might want to setup a user with replication privileges rather then giving him ALL from the master. If you want to add some more security restrict where this user can connect, specify a host. Ive tried to restrict the databases the user has access to however this doesn't work as of yet.

No comments:

Post a Comment

Please leave me a few lines and tell me what you think.