Energy and cooling, maintenance and support, physical space, ongoing licensing, and hardware renewal cycles: these are just a few of the variables that make up the total cost of a data center infrastructure.

These and other cost items need to be forecast and properly sized by the companies that maintain and manage the storage for the data their business generates.

Although running your own data center infrastructure is costly, there are some alternatives that can be adopted to reduce the financial impact. One of them is data de-duplication.

Let's look at how data de-duplication can make better use of storage space and improve performance.

What is data duplication?

Let’s say you have 100 virtual machines. If each one runs its own operating system, there are at least 100 nearly identical operating system installations, and possibly 100 stored copies.

There are other highly redundant operations like this one, that is, operations that store a lot of identical data. Periodic backups are an example: much of what is being stored already has a matching copy at the destination. The result is that space in the data center may be occupied by repeated information.
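To make the 100-virtual-machine scenario concrete, here is a minimal sketch that counts how many stored segments are actually unique. Everything in it is an assumption made for illustration: the `redundancy` function, the 4096-byte segment size, and the synthetic "images" (four shared operating system segments plus one unique segment per machine) are hypothetical and do not come from any real product.

```python
import hashlib

def redundancy(images, segment_size=4096):
    """Return (total_segments, unique_segments) across all images."""
    seen = set()   # fingerprints of segments encountered so far
    total = 0
    for image in images:
        for offset in range(0, len(image), segment_size):
            segment = image[offset:offset + segment_size]
            seen.add(hashlib.sha256(segment).digest())
            total += 1
    return total, len(seen)

# Hypothetical data: a shared "operating system" made of 4 distinct segments.
os_image = b"".join(bytes([i]) * 4096 for i in range(1, 5))

# Each of the 100 "virtual machines" adds one segment of its own data.
images = [os_image + vm_id.to_bytes(2, "big") * 2048 for vm_id in range(100)]

total, unique = redundancy(images)
print(total, unique)  # 500 segments in total, 104 of them unique
```

In this toy case, roughly four out of every five stored segments are repeats of something already on disk.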

How does data de-duplication work?

Every data segment inside a storage array has a kind of fingerprint, that is, a unique ID. Whenever new data enters the array, its segments are analyzed to see whether matching segments are already stored.

If the segment is unique, it receives its fingerprint and the full copy of the data is stored. But if an identical segment already exists, the data entering the array receives only a small reference that points to the existing one, which prevents the data from being stored twice.

Therefore, instead of copying all the data again, the de-duplication system saves only a reference, which occupies far less space than the original data.
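The process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular storage array implements it: the `DedupStore` class and its method names are hypothetical, segments have a fixed size, and the fingerprint is assumed to be a SHA-256 hash of the segment contents. Real systems often use variable-size segments and need to deal with persistence and hash collisions.

```python
import hashlib

class DedupStore:
    """Toy in-memory store that keeps each distinct segment only once."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size
        self.segments = {}  # fingerprint -> segment bytes (stored once)
        self.files = {}     # file name -> list of fingerprints (references)

    def write(self, name, data):
        refs = []
        for offset in range(0, len(data), self.segment_size):
            segment = data[offset:offset + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            # Store the full segment only if this fingerprint is new.
            if fingerprint not in self.segments:
                self.segments[fingerprint] = segment
            # New or not, the file keeps only a reference to the segment.
            refs.append(fingerprint)
        self.files[name] = refs

    def read(self, name):
        # Rebuild the original data by following the references.
        return b"".join(self.segments[fp] for fp in self.files[name])

    def stored_bytes(self):
        return sum(len(s) for s in self.segments.values())

store = DedupStore(segment_size=4)
store.write("backup_monday", b"AAAABBBBCCCC")
store.write("backup_tuesday", b"AAAABBBBDDDD")  # only DDDD is new

print(store.stored_bytes())          # 16 bytes kept for 24 bytes written
print(store.read("backup_tuesday"))  # b'AAAABBBBDDDD'
```

The second write adds only one new segment; the two segments it shares with the first write are recorded as references, and both files can still be read back in full.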


Making the most of disk space is one of the biggest advantages of de-duplication. Besides reducing storage costs, since for highly redundant data the same space can hold up to dozens of times the original volume, efficient use of disk space can also extend the retention period. In other words, in addition to lowering hardware investment, existing disks remain useful for longer.
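How much space is saved depends entirely on how redundant the data is. As a rough worked example, the sketch below computes a de-duplication ratio for a purely hypothetical backup schedule: 30 daily full backups of 1000 GB each, with an assumed 20 GB of new data per day. The function names and all of the figures are assumptions chosen for illustration, not measurements.

```python
def dedup_ratio(logical_gb, physical_gb):
    """Data written by clients divided by data actually kept on disk."""
    return logical_gb / physical_gb

def space_saved(logical_gb, physical_gb):
    """Fraction of the logical volume that never had to be stored."""
    return 1 - physical_gb / logical_gb

# Hypothetical schedule: 30 full backups of 1000 GB, 20 GB of changes per day.
backups = 30
full_backup_gb = 1000
daily_change_gb = 20

logical_gb = backups * full_backup_gb                          # 30000 GB written
physical_gb = full_backup_gb + (backups - 1) * daily_change_gb  # 1580 GB stored

ratio = dedup_ratio(logical_gb, physical_gb)
saved = space_saved(logical_gb, physical_gb)
print(f"ratio {ratio:.2f}:1, space saved {saved:.1%}")
```

Under these assumptions the same disks could retain many more backup generations before filling up. With a higher daily change rate the ratio drops accordingly.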

Eliminating redundant data also streamlines data recovery in the event of a disaster.