Our Client is one of the world’s leading biotechnology companies. It is a value-based company, deeply rooted in science and innovation, that transforms new ideas and discoveries into medicines for patients with serious illnesses.
The biotechnology company’s Enterprise Data Lake consisted of multiple Hadoop clusters hosted on-premises in its own network. These clusters held over 300 terabytes of processed and unprocessed datasets critical to business operations.
The company needed a clean and efficient disaster recovery solution that would do the following:
- Back up datasets that are updated on a daily or weekly basis to AWS S3
- Back up cluster configurations regularly to AWS S3
- Back up all cluster metadata and logs stored in Oracle tables to AWS S3
- Restore the on-premises cluster from the backed-up configurations
- Restore all data backed up in AWS S3
- Restore all metadata and logs backed up in AWS S3
MLens from Knowledge Lens was chosen because it satisfied every one of these backup and recovery requirements:
- Back up datasets that are updated on a daily or weekly basis to AWS S3 – MLens Data Migration HDFS backup synchronized HDFS directories to AWS S3 buckets and ran scheduled incremental backups, which detected changes in the datasets and transferred only the updated files.
- Back up cluster configurations regularly to AWS S3 – MLens Platform Migration Backup connected to Cloudera Manager and backed up the latest configurations at the defined frequency.
- Back up all cluster metadata and logs stored in Oracle tables to AWS S3 – The MLens Data Migration RDBMS backup feature backed up tables from the Oracle database using connection details, table names and specific queries.
- Restore the on-premises cluster from the backed-up configurations – For a disaster recovery scenario, MLens Platform Migration Restore created a new Hadoop cluster from the configurations backed up in AWS S3, mapping new hosts to the old ones.
- Fast recovery of all data backed up in AWS S3 – The MLens Data Migration Restore feature recovered the directories that had been backed up by the MLens Data Migration job. Using its distributed framework, it kept recovery fast and business impact minimal.
- Restore all metadata and logs backed up in AWS S3 – MLens Data Migration Restore restored the RDBMS tables and records that had been scheduled for backup by the MLens Data Migration RDBMS job.
- Zero downtime was achieved through a unique MLens feature: support for running queries directly on live backups.
- MLens delivered high-speed parallel data processing without landing zones.
- It supported compression and format conversion during backup.
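The incremental backup described in the first item above transfers only the files that changed since the previous run. How MLens detects those changes is not described here. The following is a minimal Python sketch of one common approach, assuming a manifest of file sizes and modification times is kept from the last backup. The names `FileState` and `files_to_transfer` and the sample paths are hypothetical and are not part of MLens.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FileState:
    """Hypothetical per-file record kept in a backup manifest."""
    size: int   # file length in bytes
    mtime: int  # last-modified time, epoch seconds


def files_to_transfer(previous: Dict[str, FileState],
                      current: Dict[str, FileState]) -> List[str]:
    """Return paths that are new or changed since the last backup."""
    changed = []
    for path, state in current.items():
        # A path missing from the old manifest, or one whose size or
        # modification time differs, must be copied again.
        if previous.get(path) != state:
            changed.append(path)
    return sorted(changed)


# Illustrative manifests (made-up paths and values).
last_backup = {
    "/data/sales/part-0000": FileState(1024, 1700000000),
    "/data/sales/part-0001": FileState(2048, 1700000000),
}
today = {
    "/data/sales/part-0000": FileState(1024, 1700000000),  # unchanged
    "/data/sales/part-0001": FileState(4096, 1700086400),  # modified
    "/data/sales/part-0002": FileState(512, 1700086400),   # new
}

print(files_to_transfer(last_backup, today))
# ['/data/sales/part-0001', '/data/sales/part-0002']
```

Comparing size and modification time is cheap but can miss edits that preserve both. A checksum comparison would be stricter, at the cost of reading every file.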
While many data ingestion, backup and recovery tools are available, no other solution provides as wide a range of features as MLens, which not only ensures accurate data migration but also recovers both the cluster and its data.