Data Protection for the LAMP Economy

The value of data stored in LAMP applications is increasing at a rapid pace. Indeed, the LAMP stack fuels an economy of its own - with its own currency, lingo and players. While e-commerce is the most visible evidence of the LAMP-powered economy, the currency of this economy is by no means just monetary. Value shows up in many forms other than financial gain: personal reputation and legacy, karma points, creativity and more. The LAMP stack fires up innovation by enabling new ideas - you can quickly and cost-effectively prototype a concept which others may find bizarre.

User generated content (UGC) is one key currency of the LAMP stack. UGC, and even votes (ok, diggs) on others’ UGC, stores tangible and lasting value. In naming “You”, a proxy for UGC, its Person of the Year for 2006, Time magazine said: “It’s a story about community and collaboration on a scale never seen before. It’s about the many wresting power from the few and helping one another for nothing and how that will not only change the world, but also change the way the world changes.” If current trends hold, UGC (most of which is stored in LAMP stacks) will continue to pack in increasing value for companies and communities around the world.

Since the cost of deploying applications on the LAMP stack tends to be very low, their importance to the enterprise is sometimes underestimated. The LAMP stack moves the value of IT infrastructure to business data and applications, which is exactly where it should be (instead of costly underlying technology). The importance of protecting LAMP data can easily be gauged by looking at all the LAMP-based applications that you rely on. Whether you are using a vBulletin- or phpBB-based forum for your users, SugarCRM for your sales force, or MediaWiki for your corporate wiki - loss of data in any of these instances will result in at least lost productivity, if not lost revenue and reputation.

The LAMP economy comes with its own set of challenges and hazards, e.g. crowdhacking, comment storms etc. Dealing with them is especially hard for IT managers, since it is extremely difficult to get scheduled downtime on LAMP applications which power a busy website. In addition, such an environment has its own requirements for the point in time to which data should be recovered (aka Recovery Point Objective). For example, the owner of a web-based forum may want to recover their data to a point just before a rogue user created a login and started vandalizing the forum.
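To make the forum example concrete, here is a minimal Python sketch of picking the recovery point. It assumes you already have a list of timestamps to which your backups can restore; the function name, the hourly schedule and the incident time are all made up for illustration and do not come from any particular backup product.

```python
from bisect import bisect_left
from datetime import datetime
from typing import Optional, Sequence


def latest_recovery_point(
    recovery_points: Sequence[datetime], incident: datetime
) -> Optional[datetime]:
    """Return the newest recovery point strictly before the incident, or None."""
    ordered = sorted(recovery_points)
    # bisect_left gives the index of the first point >= incident, so the
    # element just before it is the last point that predates the incident.
    index = bisect_left(ordered, incident)
    return ordered[index - 1] if index > 0 else None


# Hypothetical data: hourly recovery points and the time the rogue login was created.
points = [datetime(2007, 1, 15, hour) for hour in (9, 10, 11, 12)]
rogue_signup = datetime(2007, 1, 15, 11, 20)
print(latest_recovery_point(points, rogue_signup))  # 2007-01-15 11:00:00
```

With hourly recovery points, the forum in this sketch would lose the 20 minutes of legitimate posts made between 11:00 and the rogue signup. The closer together the recovery points are, the smaller that gap becomes.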

Several dynamics make data protection for LAMP-based applications a more challenging problem than in traditional environments. For one, the data stored within LAMP stacks in many cases does not have any physical record; nobody keeps a printout of all threads in a forum, for example. So, if LAMP data is lost and cannot be recovered, you simply have to live without that data - there is no way to recreate it from physical records.

Data in LAMP applications is stored both in MySQL databases and in filesystems (typically configuration data). LAMP applications tend to scale out instead of scaling up. One application may be spread across multiple servers (either in the form of MySQL Cluster, or simply as independent aspects of the application distributed across independent LAMP stacks). The application administrator has to take into account multiple servers and locations of their LAMP data while putting together a backup strategy. In such an environment, creating a point-in-time consistent backup is a challenging task.
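One small piece of this problem can be sketched in Python: recording which database dumps and configuration files belong to the same backup run, so that a restore does not silently mix pieces from different runs. This does not create a consistent backup by itself; it only helps detect an inconsistent one at restore time. The manifest layout, function names and component names below are assumptions made for illustration.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, List


def checksum(payload: bytes) -> str:
    """SHA-256 digest of one backed-up component."""
    return hashlib.sha256(payload).hexdigest()


def build_manifest(components: Dict[str, bytes], taken_at: datetime) -> dict:
    """Describe one backup set: a shared timestamp plus a checksum per component."""
    return {
        "taken_at": taken_at.isoformat(),
        "components": {
            name: checksum(payload) for name, payload in sorted(components.items())
        },
    }


def find_mismatches(manifest: dict, components: Dict[str, bytes]) -> List[str]:
    """Return names of components that are missing or differ from the manifest."""
    problems = []
    for name, expected in manifest["components"].items():
        payload = components.get(name)
        if payload is None or checksum(payload) != expected:
            problems.append(name)
    return problems


# Hypothetical component names and contents, standing in for real backup files.
backup_set = {
    "db1/forum.sql": b"-- dump of the forum database",
    "web1/config.php": b"<?php $db_host = 'db1';",
}
manifest = build_manifest(
    backup_set, datetime(2007, 1, 15, 11, 0, tzinfo=timezone.utc)
)

# Simulate a restore that picks up a config file from a different backup run.
mixed_set = {**backup_set, "web1/config.php": b"<?php $db_host = 'db2';"}
print(find_mismatches(manifest, backup_set))  # []
print(find_mismatches(manifest, mixed_set))   # ['web1/config.php']
```

Storing one manifest per backup run, alongside the archives, gives the administrator a single place to check that every server's piece of the application is present before starting a restore.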

Frequently, LAMP-based applications are hosted at a service provider’s site instead of a captive data center. This creates additional challenges (and opportunities) for data protection. You will need to carefully plan how recovery of the whole stack and application will take place at a location other than your hosting provider’s (e.g. in a Hurricane Katrina-like situation). In many cases, administrators will need to back up their LAMP data remotely using a secure communication protocol. An interesting alternative here is to use a remote storage grid (e.g. the Amazon S3 service) to back up the LAMP applications. Why bother with local tape hardware (and all the idiosyncrasies of tapes) when your data is remote anyway?

It is imperative that today’s IT managers assess the value of the data stored in their LAMP-based applications. They need to architect a backup solution for their LAMP applications based on the impact on application performance, application availability, the types of failures to recover from, and the cost of implementing the solution. Administrators need to pay attention to the data in all layers of the LAMP application in order to get a consistent backup of the whole LAMP application stack.

Zmanda is at an interesting place when it comes to the LAMP stack. We use the LAMP stack in our own products - the new Zmanda Management Console is developed on it - and we are focused on making it simple to protect the value of LAMP application data. Our open source projects make extensive use of wikis and forums for community collaboration and communication. Our products provide data protection for the entire LAMP stack. Amanda is the leading backup solution for a network of Linux filesystems. Our Zmanda Recovery Manager for MySQL product is one of the most popular solutions for backing up MySQL databases.

One extensive user of the LAMP stack is Adspace Networks - the largest in-mall digital audio/visual network in the United States. LAMP data is not just displayed in a small browser window - Adspace shows its customers’ advertising artwork on sixty-inch plasma displays mounted in 8-foot-tall enclosures! Using Adspace’s LAMP-based applications, retailers can create new campaigns and upload advertising artwork, which then shows up on the huge screens. The LAMP stack enabled Adspace to create and deploy this application in a very aggressive timeframe. The wide availability of LAMP consultants and hosting providers was seen as a big plus when deciding on the application framework. Adspace needed a solution which could back up their data more frequently than the traditional nightly backup (due to the high value of the customer data). While their applications are deployed at a remote hosting site, they wanted to keep the backup data on their own site. Adspace deployed Zmanda Recovery Manager for MySQL to back up their LAMP data. This enabled them to create a consolidated backup solution with point-in-time recovery capability, without having to spend time and expertise on architecting and building a LAMP backup solution themselves.

While Google and Yahoo are the most likely destinations if you want to search for something, one of the coolest destinations if you want to discover stuff is StumbleUpon. StumbleUpon is helping more than 2.1 million users discover and share interesting websites. “Collaborative Opinions” is the currency that StumbleUpon trades in. The LAMP stack stores this extremely valuable data about users, their preferences and friends, and all the websites they discover. StumbleUpon stumbled upon Zmanda’s LAMP backup solutions when they were looking to reduce the time and complexity of backing up this data, which is increasing by the minute. The rapid growth of their data mandated incremental backups of their database - a full backup every time was just too time-consuming. Today they use Zmanda Recovery Manager for MySQL to back up their database and manage the backed-up archives. StumbleUpon chose Zmanda’s backup solution for its simplicity and its effectiveness for their exact needs.

