What to consider when you back up a remote MySQL server with ZRM over the Internet
By Dmitri Joukovski and Pavel Pragin
The MySQL database has become the world's most popular open source database because of its consistent fast performance, high reliability and ease of use. You might be using MySQL database for your online forums and wiki located at managed hosting provider, or you might be using it in a remote office for tracking bugs with Bugzilla, or you might just be thinking about developing some new Web 2.0 application that uses MySQL. Either way, if you value information stored in your MySQL database, you will need to ensure successful, secure and consistent backups of MySQL with minimal impact on the database application. Make sure that your backup solution provides the most efficient use of network, server and storage resources.
If you are looking for a solution that simplifies your life by providing an easy to use yet flexible and robust backup and recovery for MySQL, Zmanda Recovery Manager (ZRM) might be the right choice for you. Details about ZRM functionality are available here.
For any backup of a database the primary considerations are the consistency of the backup and impact on users and applications. A backup of the remote MySQL, however, has extra challenges related to:
- network usage
- security, and
- flexibility of recoverability of MySQL data to a different host.
The last point could be important when you do not have full control of your MySQL environment and you want to have an option to recover your data to a different managed hosting provider with a different version of MySQL server or a different operating system.
Incremental backups significantly reduce backup window and network usage because only the changes since the last full or the last incremental backup are moved over the wire. ZRM makes it easy for you to recover your data from incremental backups even if you have to use multiple incremental backup images to get your data back to a particular point in time. The incremental backups require MySQL binary logs to be enabled, but according to MySQL documentation, enabling of binary logs will result in performance hit of less than 1%.
Logical backups provide more flexibility for recovery because the backup file is a text file that contains all MySQL statements to re-create both the database schema and contents. Logical backup works for all storage engines except the NDB engine used for MySQL clustering. The biggest advantage of logical backup is flexibility for a database recovery. You can restore logical backup of MySQL to other architecture and even to another database. Transportability of logical ZRM backup images makes ZRM a convenient tool for migration. For example, you can move your MySQL data:
- From MySQL on Solaris to MySQL on Linux
- From one storage engine to another
- From a 32-bit server to a 64-bit server
- From one managed hosting provider to your data center or another provider with different MySQL configuration
Of course, there is price to pay for such flexible recovery. Restoring data from the logical backups could take a long time since you have to read and replay all MySQL statements. Another downside is that it could be difficult to predict the size of your logical backup. Depending on the type of data and your database schema, the size of the logical backup could be larger that the database itself. One remedy is that since logical backup is basically a text file you can usually get a decent compression.
Raw backup provides a consistent copy of a database with your backup being a binary file. The advantages of raw backups over logical backups are:
- Backups and especially recovers are much faster. For example, it not unusual to see that for the very same database with size of 4-5 GB, the raw backup is 5 times faster than the logical backup, and recovery of the raw backup image is 20 times faster than recovery of the logical image.
- You will always know the exact size of your backup since it is just a copy of a database.
- It provides better scalability which could be important if your MySQL database is rather large, e.g. 10-20 GB or more.
Raw backups can be recovered ONLY to the same version of MySQL server on the same operating system as the original data. It means that your chances to recover raw backup images of MySQL to another managed hosting provider are not very high and you should take that into consideration when choosing raw vs. logical backup.
Both raw and logical backups provide warm backup, meaning you don’t have to shut down MySQL server for backup, but all tables are locked during backup and users can’t enter their data. That is why you should consider using ZRM scheduling plug-in that allows delaying backups based on thresholds you define. For example, you can postpone backup for an hour if more than 50 users access the database.
One of the important considerations for remote backup of MySQL is the decision about what type of connection to establish between ZRM and remote MySQL server. ZRM provides a plug-in for socket based connection and another plug-in for SSH-based connection. Flexible architecture of ZRM allows users to write their own plug-ins.
As the name suggests, the socket copy plug-in establishes a socket that provides communication between ZRM and MySQL over IP-based network. The socket copy plug-in requires the xinetd service to be running on the MySQL server and the default port 25300 to be open. If needed, a backup administrator can change the port. The socket copy plug-in is not secure and should be used only when security is not of concern or when security is established by other means, for example, when you have a VPN connection between ZRM and your remote MySQL server.
The SSH copy plug-in provides a secure channel between ZRM and the remote MySQL server. It uses public-key cryptography to authenticate the remote MySQL server and the backup user running ZRM. The SSH plug-in requires the standard TCP port 22 to be open and SSH daemon to be running. The SSH copy plug-in is best suited when ensuring security of backup data is important. Since SSH connection takes extra CPU cycles for encryption, there might be a hit on backup performance comparing to backup with socket connection.
The following table summarizes considerations when choosing socket vs. SSH copy plug-in for remote backup of MySQL:
|Remote connection |
|Port used||Security||Relative performance||Installation comments|
|SSH copy||22 (fixed)||Provides strong authentication and encryption for moving backup data over the wire||Lower performance and depends on MySQL server memory, CPU resources and available bandwidth.||Often times you might have SSH connection to a remote MySQL server already established for reasons others than backup and recovery. Otherwise you have to establish SSH connection between ZRM and MySQL server.|
|Socket copy||25300 (could be changed)||Backup data over the wire is not secure||Higher performance||Additional software should be installed on MySQL server. For example, for Enterprise version of ZRM 1.1 use MySQL-zrm-enterprise-socket-server-1.1-1.noarch.rpm. For Community version, see downloads page for exact package.|
For more technical details register free of charge to Zmanda Network and download the full version of this white paper. You will learn how to use ZRM for several common scenarios. For example, we will provide technical details how to perform backup with data being secure on the wire and at rest using the SSH copy plug-in for this use case:
Additionally, we will provide technical details about more efficient logical and raw backups using the socket copy plug-in.
With all its rich functionality, ZRM for MySQL is just a tool for implementing backup and recovery strategy that is optimal for your specific data protection needs. ZRM is robust and easy to use, but depending on your own implementation of the remote MySQL server and your specific requirements for backup and recovery, you should consider all trade-offs associated with each operational options provided by ZRM for MySQL.