To CDM or not to CDM?
One area we see becoming increasingly relevant within the enterprises we work with is Copy Data Management (CDM). It's a very simple concept but an extremely powerful one: the control and management of all those copies you make on your storage systems. It makes logical (pun intended) sense as the next area of management to help control "sprawl". We've seen many initiatives of this type over the past 15-20 years.
First, of course, came centralized storage, to bring your storage together, (in theory) use less of it, and provide a more central way to manage it. Next, physical servers were consolidated into denser environments through blades and highly scalable servers. Remember those racks full of x86 pizza box 1U servers? We had lots of them, and consolidating them made sense: servers in smaller form factors saved on power and heat/cooling costs and took up less space in racks. That was the first initiative to stop, or at least get a handle on, physical server sprawl.
Next came virtualization, which made things even more efficient. The initial selling point was to use the whole server you bought. We all remember that physical servers generally ran at about 5-10% of their capacity. Virtualization allowed that to be brought up to 60-80%, which was a giant leap forward in efficiency. But what did it create? Virtual server sprawl. VMs and virtual instances were so easy to create, so what did we do? We created a lot of them. Then we needed tools (starting with vCenter, but many others too) to help us manage our virtual environments and control our virtual server sprawl. The tools emerged because we couldn't keep a good handle on all of the VMs we created.
Then, thanks to the rich functionality our storage systems gave us for creating copies (pick your vendor here: IBM, NetApp, EMC, Pure Storage, etc.), we found it really easy to create copies for operational purposes (BU, DR, BC, etc.) or business-driving purposes (DevOps, Test/Dev, etc.) to help drive additional business opportunities. Along with that, what did it create? The ability to give people unfettered (not really, it's all about permissions, but...) access to make all the copies/snaps that they needed. And then what did we get? Many more snaps/copies than we needed. Gartner says that as much as 20 times as many copies of data are created and used versus production data. That's a lot of copies potentially wasting a lot of space. And because of that, what do we need now? A way to manage our copy/flash sprawl. That's what CDM is and where it comes into play.
So as I wax nostalgic here, where am I going? CDM is a natural evolution from where we've been. We need a way to better manage our copies. Yes, the integrated tools within our storage platforms can perform some of that function. But can they do it in as robust a way as specific CDM tools? The answer today is no, they can't. Integrated tools generally can't manage copies across heterogeneous environments (even within the same storage vendor) from a single management console. That's where the power of these CDM tools comes into play.
As I mentioned above, the value of CDM generally falls into two categories: operational efficiencies and driving business value.
On the operational side, CDM centers on backups (BU) and DR and, in some cases, if designed and architected correctly, can help replace traditional BU tools and environments. It can also help in application-specific instances, such as quiescing a database before the copy/snap is taken. Multiply this by the many copies that need to be taken for hourly/daily/weekly/monthly backups, disaster recovery, and business continuity, plus the testing of each, and you can see how the number of copies can grow quickly and spin out of control.
For driving business value, the ability to manage copies for Test/Dev, reporting, analytics, and modeling can immediately add to the bottom line. This is where CDM can excel. Quicker deployment of apps and more efficient DevOps and app development environments mean quicker time to value. And that saves money or helps make it faster. The management of copies in these instances can create substantial efficiencies towards that goal.
The additional value of CDM comes from being able to automate and create policies for how and when copies are taken, how they are used, and when they expire. Can you do this with the integrated storage tools? Yes, to some degree, but certainly not to the granular level you can with CDM solutions.
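To make the idea of a copy policy a bit more concrete, here is a minimal sketch of what "when copies are taken and when they expire" could look like as data plus two small checks. This is purely illustrative: the names (`CopyPolicy`, `CopyRecord`, `copy_due`, `expired_copies`) and the example intervals are my own assumptions and do not come from any particular CDM product or storage vendor API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class CopyPolicy:
    """Hypothetical policy: how often to copy and how long to keep copies."""
    purpose: str          # e.g. "backup" or "test-dev" (illustrative labels)
    interval: timedelta   # how often a new copy should be taken
    retention: timedelta  # how long a copy is kept before it expires


@dataclass(frozen=True)
class CopyRecord:
    """Hypothetical record of one copy/snap that already exists."""
    purpose: str
    created: datetime


def copy_due(last_copy: Optional[datetime], policy: CopyPolicy, now: datetime) -> bool:
    """A new copy is due if none exists yet or the interval has elapsed."""
    return last_copy is None or now - last_copy >= policy.interval


def expired_copies(copies, policies, now):
    """Return the copies that are older than their policy's retention period."""
    by_purpose = {p.purpose: p for p in policies}
    return [
        c for c in copies
        if c.purpose in by_purpose
        and now - c.created > by_purpose[c.purpose].retention
    ]


# Example values are assumptions chosen only for illustration.
policies = [
    CopyPolicy("backup", interval=timedelta(hours=1), retention=timedelta(days=30)),
    CopyPolicy("test-dev", interval=timedelta(days=1), retention=timedelta(days=7)),
]
now = datetime(2024, 1, 31, 12, 0)
copies = [
    CopyRecord("backup", datetime(2023, 12, 25, 12, 0)),   # 37 days old
    CopyRecord("backup", datetime(2024, 1, 30, 12, 0)),    # 1 day old
    CopyRecord("test-dev", datetime(2024, 1, 20, 12, 0)),  # 11 days old
]
to_expire = expired_copies(copies, policies, now)
```

A real CDM tool layers far more on top of this (application consistency, access control, cross-array catalogs), but the core of a policy is roughly this kind of rule applied automatically rather than by hand.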
Also, as backup windows continue to shrink, CDM solutions can help create accurate and validated copies for backups or for DR testing and rollback, in a much more automated fashion than other types of tools and with much less impact on production systems.
There are different types of vendors that provide CDM solutions, and the right solution for you, as always, depends on your specific needs, but I believe that they bring value to today's storage environments. If implemented correctly, the value realized can be almost immediate: according to Gartner, you can save up to 80-90% of copy data storage. That is storage that can be used for other purposes, and we all know that storage isn't cheap. We are all going to use more storage (whether we like it or not) as time goes by, so let's make it as efficient to use as possible. CDM can help with that.
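To put that 80-90% range in perspective, here is a quick back-of-the-envelope sketch. The 500 TB copy data footprint is a made-up figure for illustration only, and `reclaimed_tb` is a hypothetical helper; only the 80-90% range comes from the Gartner claim above.

```python
def reclaimed_tb(copy_data_tb: float, reduction_percent: float) -> float:
    """Storage freed if copy data shrinks by the given percentage."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    return copy_data_tb * reduction_percent / 100


copy_data_tb = 500  # assumed footprint of copies/snaps, not a measured value

low = reclaimed_tb(copy_data_tb, 80)   # lower end of the quoted range
high = reclaimed_tb(copy_data_tb, 90)  # upper end of the quoted range
print(f"Reclaimed: {low:.0f} to {high:.0f} TB out of {copy_data_tb} TB of copies")
```

Under those assumptions, that is 400 to 450 TB handed back for other uses. Your own numbers will differ, so it's worth plugging in your actual copy footprint before making the case internally.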
So what of CDM? I believe it's well worth evaluating to see if it can make a difference in your environment. As I mentioned above, it appears to be the next evolution of an area that needs management help. If it can bring efficiencies and is as promising as Gartner says, it can pay for itself very quickly. But it's also a new way of doing things that might clash a bit with existing policies and procedures. All of that needs to be taken into account as you do your evaluation. Also, as a fairly new concept, time will tell whether it remains a relevant standalone product or whether storage vendors create their own robust CDM solutions (or buy a CDM vendor) and integrate them into their storage platforms to compete with standalone CDM solutions.
So to answer the question posed in my own title: To CDM or not to CDM? If you make lots of copies of your data, it's looking like yeah, you probably should. Let me know your experience and thoughts on CDM and how it might complement your environment.