SO MUCH DATA - SO WHAT DO WE DO WITH IT?

In today’s digital world we're immersed in a sea of data that just keeps coming.

Most of the data we create is rarely accessed, but much of it must be stored for analysis if we are to stay competitive in global commerce. How do we deal with petabytes or even exabytes of data economically, securely and accessibly? The answer just might be an active archive with help from LTO Technology. Read on for the details!

WHAT IS AN ACTIVE ARCHIVE?

Some readers may already be using an active archive or considering one. Others may think they are using an active archive but are really just saving successive backup copies of their original data. Let’s distinguish between the two.

Backup Copy: A backup is a copy of data that is kept for a limited period of time so that the original data can be restored if it is damaged or compromised in some manner (e.g. held for ransom).

Archive Copy: An archive is designed to hold a collection of data that is not modified and is intended for long-term retention and reference (e.g. analytical mining).

So, as you can see, a backup is temporary and is created to protect the original copy, whereas an archive is a collection of original data kept intact for historical and analytical purposes. There may be multiple copies of a backup file, whereas an archive is ideally the only one of its kind. Moving original data to an archive can also free up backup space and help conserve storage resources.

But back to the question: what is an active archive?

To answer that, we first need to talk about digital archives in general, and in particular the deep archive, which may hold hundreds of terabytes, many petabytes or even exabytes of data.

A deep archive is a collection of data that is unlikely to be accessed but must be kept for compliance or other legal reasons.

An active archive, on the other hand, is a data collection that likely will be accessed, albeit infrequently, for a variety of reasons including competitive analysis and customer inquiries. It is a data store that is too valuable to discard. As SearchStorage on TechTarget notes: “An active archive may include software for moving data seamlessly between storage tiers so it can be directly accessed by applications or users. Active archives are usually protected with replication to another archive system, not just by traditional backup.”

When an active archive is software-defined, cold, warm and hot data can be moved between tiers of flash, disk and tape according to criteria such as who is using the data, how long they need frequent access and which applications they are using. This also means that the distinction between the different storage platforms becomes less important. So long as data is indexed effectively, and there is sufficient metadata to identify individual pieces of content so they can always be located, the exact nature of the storage platform becomes less relevant. The user simply sees one all-encompassing storage system (sometimes referred to as a namespace) that contains all of the archived data.

If content is frequently requested, it may reside on a higher-performance tier. But if it is older and accessed less often, it might reside in the cloud or on nearline or offline LTO tape. The tradeoff between slower access time and dramatically reduced costs is one of the cost/benefit calculations that an active archive allows users to make. And tape is an ideal enabler for this kind of active archive topology.
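To make the tiering idea concrete, here is a minimal sketch in Python of how a catalog might place content on flash, disk or tape based on how recently it was accessed. Everything in it is an assumption for illustration only: the names `CatalogEntry`, `choose_tier` and `rebalance`, the 30-day and 365-day thresholds, and the use of last-access date as the sole criterion. Real active archive software applies richer policies and is not described by this code.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical thresholds, chosen only for illustration.
HOT_DAYS = 30     # accessed within a month: keep on flash
WARM_DAYS = 365   # accessed within a year: keep on disk


@dataclass
class CatalogEntry:
    path: str          # logical name the user sees in the single namespace
    last_access: date  # metadata used to decide placement
    tier: str = "flash"


def choose_tier(last_access: date, today: date) -> str:
    """Pick a storage tier from the number of days since last access."""
    age_days = (today - last_access).days
    if age_days <= HOT_DAYS:
        return "flash"
    if age_days <= WARM_DAYS:
        return "disk"
    return "tape"


def rebalance(catalog: dict, today: date) -> list:
    """Update each entry's tier in place and return the moves performed.

    The logical path never changes, so users keep addressing content
    by the same name regardless of where it physically resides.
    """
    moves = []
    for entry in catalog.values():
        new_tier = choose_tier(entry.last_access, today)
        if new_tier != entry.tier:
            moves.append((entry.path, entry.tier, new_tier))
            entry.tier = new_tier
    return moves
```

In this sketch the catalog plays the role of the index: because every item is looked up by its logical path, moving it to a cheaper tier changes only the `tier` field, not the name the user sees.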

Let’s see how one user put an active archive into action.

ACTIVE ARCHIVE IN ACTION 

GPM-ETV is a Moscow-based entertainment television broadcaster that has a number of very popular channels. As you can imagine, making these broadcasts generates a large amount of data that needs to be stored for editing and final production, as well as long term for posterity and occasional future reference. GPM-ETV wanted several key attributes for the storage of its production data:

  • A reliable digital content archive
  • Easy access to objects
  • Long-term data preservation
  • Freed-up expensive disk storage space

The broadcaster implemented two LTO tape libraries with up to 4 petabytes of storage. Replicas of the data are stored on each library for added protection. Rarely used material is stored on LTO tapes outside of the library, which helps conserve library space. According to GPM-ETV, storage costs were cut in half by using LTO tape instead of disk. See the case study here.

Implementing an active archive can help deal with the never-ending surge of data. It can conserve backup storage space, preserve collections of information and allow for the mining of data to remain competitive, while LTO technology can help protect the data and save costs. See more about active archiving in a two-minute video here.