Retrieve or Rollback: Cloud Object Storage Data Versioning to the Rescue


Ghost of the Lost Data

Significant files were accidentally deleted at a university in Wellington. A “huge volume” of IT calls baffled the IT admins as well as the post-grad students, some of whom were just a couple of weeks from handing in their theses. A few corrupted files held up more than $130 million in transfers for JPMorgan Chase, which the bank blamed on an “Oracle bug.” News like this often scares business leaders into rethinking their data storage infrastructure. RAID errors, disk errors, and, of course, human errors can lead to devastating data loss beyond recovery. At a time when businesses are increasingly dependent on SaaS tools and cloud-native development, how do you ensure that your data backups are ready for unforeseen calamities? There are several known answers to this question. But the two most popular approaches that fit all kinds of cloud object storage infrastructures are:

  • Data Redundancy and
  • Versioning

While we have already discussed Data Redundancy in another article, this one focuses specifically on Versioning. We will begin by explaining the concept and the need for it, and then move on to more technical details.

Versioning, the friendly neighbour

In addition to keeping redundant geo-replicas of data objects, you can protect against data loss by archiving the variants of those objects over time. Versioning simply denotes a process where these data object variants are stored in the same bucket. This makes data restore and retrieval faster. Thus, while data redundancy is more of a globally spread ally, versioning is like a next-door neighbour that can immediately come to help. It is a powerful tool for data protection, especially at a time when remote work and global access have lowered the guards for cloud object storage. Any instance of data corruption can be dealt with by rolling back to the most recent reliable version. Be it human error or an application failure, maintaining versions of data objects provides more immediate damage control and, hence, much shorter outages for the end customer.

Versioning is a very cloud-savvy solution for data durability. Object storage in cloud infrastructure is not only scalable but also makes data retrieval easier thanks to the shared storage pool. So how exactly does Versioning help, and how do you implement it for your cloud storage infrastructure? Let’s find out.

Different Versions of Aid

Fool-Proof Process

When you enable versioning for data objects, new variants of these objects don’t overwrite their predecessors. The general practice is to store each object version under a different version ID. This makes it easy for Cloud Storage to serve only the latest version for transactions, as long as there is no need to roll back to a previous version. Moreover, having different version IDs reduces the impact of human errors while working with the data objects.
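To make the idea concrete, here is a minimal in-memory sketch of a versioned bucket. It is only an illustration: the class name `VersionedBucket`, the `v1`, `v2` style IDs, and the method names are assumptions made for this example and do not correspond to any vendor's API.

```python
import itertools


class VersionedBucket:
    """Toy bucket: every put appends a new version instead of overwriting."""

    def __init__(self):
        # key -> list of (version_id, data) tuples, oldest first
        self._versions = {}
        self._counter = itertools.count(1)

    def put(self, key, data):
        """Store data under key and return the newly assigned version ID."""
        version_id = f"v{next(self._counter)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if version_id is given."""
        history = self._versions.get(key)
        if not history:
            raise KeyError(key)
        if version_id is None:
            return history[-1][1]
        for vid, data in history:
            if vid == version_id:
                return data
        raise KeyError(f"{key} has no version {version_id}")


bucket = VersionedBucket()
first = bucket.put("thesis.docx", b"draft 1")
second = bucket.put("thesis.docx", b"draft 2")  # does not overwrite draft 1
latest = bucket.get("thesis.docx")              # served by default
older = bucket.get("thesis.docx", first)        # still retrievable by ID
```

A plain read returns the newest variant, while the older one stays reachable through its version ID, which is the property the rest of this article relies on.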

Operational Security and Authorization

Accidental permanent deletes cause more severe data loss than accidental overwrites. Therefore, Versioning is also subject to authorization and security controls. Take the Amazon S3 approach, for instance. Here, as initial protection (first aid, if you will), a deleted object isn’t permanently removed; it simply has a delete marker placed on it. Thus, when the user tries to GET the data object, the system returns a 404 Not Found error. This means that the latest version of the data object is practically “deleted.” However, to permanently delete these data

Faster Data Recovery

While retrieving the redundant geo-replicas might take a little time, Versioning allows faster data recovery, especially in the face of minor accidental overwrites and deletions. All one has to do is roll back to the previous version ID and continue the work without interruption. This also makes object storage versioning an effective weapon against data outages.
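The sketch below models this recovery path in memory, including the delete marker described earlier. It is a toy model under stated assumptions: the names `Bucket`, `NotFound`, `DELETE_MARKER`, and `rollback` are hypothetical, and list indexes stand in for version IDs. Here a rollback copies the chosen older version on top of the history so that it becomes the latest one again.

```python
class NotFound(Exception):
    """Stands in for an HTTP 404 Not Found response."""


DELETE_MARKER = object()  # sentinel placed on top of the history on delete


class Bucket:
    def __init__(self):
        self._history = {}  # key -> list of versions, oldest first

    def put(self, key, data):
        versions = self._history.setdefault(key, [])
        versions.append(data)
        return len(versions) - 1  # the list index serves as the version ID

    def delete(self, key):
        # Soft delete: nothing is erased, a marker becomes the latest version.
        return self.put(key, DELETE_MARKER)

    def get(self, key, version_id=None):
        versions = self._history.get(key, [])
        if not versions:
            raise NotFound(key)
        if version_id is None:
            data = versions[-1]
        elif 0 <= version_id < len(versions):
            data = versions[version_id]
        else:
            raise NotFound(f"{key}@{version_id}")
        if data is DELETE_MARKER:
            raise NotFound(key)
        return data

    def rollback(self, key, version_id):
        # Re-publish an older version as the new latest version.
        return self.put(key, self.get(key, version_id))


bucket = Bucket()
good = bucket.put("report.csv", "clean data")
bucket.put("report.csv", "corrupted data")  # accidental overwrite
bucket.delete("report.csv")                 # accidental delete
bucket.rollback("report.csv", good)         # recover the last good version
restored = bucket.get("report.csv")
```

Because neither the overwrite nor the delete destroyed anything, recovery is a single local operation rather than a fetch from a remote replica.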

Additional Security Measures

Apart from the initial authorization and the delete marker, Versioning also works with measures like multi-factor authentication, where the user has to be verified through more than one factor before being allowed to make any permanent changes to the data object versions. Such additional measures also make these versions more secure against external attacks such as data breaches and unauthorized access. Add to this the cloud security measures against phishing, DDoS, and ransomware, and cloud object storage becomes far more resilient.

Let us now look at how you can enable versioning for your cloud object storage.

How to version?

Here are the steps to enable versioning for your bucket.

STEP 1: Uploading the File onto storage

The file is uploaded onto the storage to get a unique ID (known as version ID, generation number, or something else, depending on the cloud storage vendor of your choice). The version ID, as discussed above, will hold the current version of the data object for future reference.

STEP 2: Request Versioning

Next, you request to enable versioning for your data object. By default, most vendors keep versioning disabled to avoid unnecessary replication. The XML request carries a status field, which asks for versioning to be enabled or disabled. The versioning request is approved by the authorized bucket owner.
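As an illustration of such a request body, the sketch below builds the XML document that Amazon S3 accepts for its bucket versioning call. The helper name `versioning_request_body` is made up for this example. Note that S3 uses the status values `Enabled` and `Suspended`; other vendors may use different formats and values.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Amazon S3 request bodies.
S3_NAMESPACE = "http://s3.amazonaws.com/doc/2006-03-01/"


def versioning_request_body(status):
    """Return the XML body that asks S3 to set a bucket's versioning status."""
    if status not in ("Enabled", "Suspended"):
        raise ValueError("status must be 'Enabled' or 'Suspended'")
    root = ET.Element("VersioningConfiguration", xmlns=S3_NAMESPACE)
    ET.SubElement(root, "Status").text = status
    return ET.tostring(root, encoding="unicode")


body = versioning_request_body("Enabled")
print(body)
```

In practice the console, CLI, or SDK builds and signs this request for you, so writing the XML by hand is rarely necessary.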

STEP 3: Enable/Disable Versioning

The bucket owner can enable versioning for their object storage buckets in several ways, depending on what the cloud storage vendor offers. Some of the popular ones are:

  • Storage Management Console – A GUI based console where the owner can sign in, select the intended bucket and enable or disable versioning
  • Storage Management CLI – The command line interface or CLI works with textual commands to respond to the versioning request
  • Storage Management SDKs – For users working with Java, .Net, or similar languages, SDKs work best for the purpose of dealing with versioning requests.
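For the SDK route, here is a hedged sketch that uses the AWS SDK for Python (boto3) as one example vendor. The function name `set_bucket_versioning` and the bucket name are placeholders chosen for this article. The client is passed in as a parameter, so the function itself does not import boto3.

```python
def set_bucket_versioning(s3_client, bucket, enabled=True):
    """Enable or suspend versioning on a bucket and return the reported status.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    status = "Enabled" if enabled else "Suspended"
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": status},
    )
    # Read the setting back; the "Status" key is absent on buckets
    # that have never had versioning configured.
    return s3_client.get_bucket_versioning(Bucket=bucket).get("Status")


# Example usage (requires boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   s3 = boto3.client("s3")
#   set_bucket_versioning(s3, "my-example-bucket")
```

The caller needs permission to change the bucket's versioning configuration, which ties back to the authorization step described above.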

Conclusion

IBM suggests that millions of USD can go into putting out the fires caused by data corruption and data loss. Be it a big university responsible for encouraging the research and innovation of its students or a multinational IT firm aiding the world with its simple solutions, the value of data cannot be compromised for performance, speed, and intelligence. Versioning can be the defense you need against accidental data corruption caused by an overlooked security flaw or an internal human error. Enable versioning for your critical data objects and leave the rest to the scalability, access security, and speed of cloud object storage.
