MLens is Knowledge Lens’ one-stop solution for enterprises’ big data requirements. With Big Data Backup, Automated Disaster Recovery, Data Ingestion, Compression, Encryption, and Archival capabilities, our solution has proven to deliver absolute control over enterprise data, its storage, analysis and recovery.
This seven-part blog series will take you through the different scenarios where MLens was deployed in our clients’ organizations, to solve their business challenges, and add value to all on board. In this post, we will look into our client’s requirement for the live replication of data across secured HBase clusters.
HBase is a non-relational database that stores its data in the Hadoop Distributed File System (HDFS) and provides random read and write access to that data. Unlike batch-oriented Hadoop processing, it can be used for quick storage and querying of real-time data. However, transferring data across multiple HBase clusters is not without complications.
Our client’s HBase clusters did not have direct connectivity between the region servers, and had different Kerberos realms, versions and distributions.
Existing tools such as CopyTable, Import/Export, and Snapshots couldn’t address all of these complications. Moreover, these tools required manual effort and risked degrading HBase cluster performance.
The client required a fast and secure means of transferring data across multiple HBase clusters, without the use of any local disk storage.
The Knowledge Lens Solution:
After reviewing the client’s architecture and topology, the MLens team developed an HBase replication tool that addressed all of the client’s concerns. Instead of querying the HBase servers, the tool read the data directly from HFiles, which kept the impact on the HBase region servers’ performance to a minimum. A highly efficient HFile parser was designed to read data from the source HBase cluster and copy it securely to the target HBase cluster.
HBase stores data in the form of key-value pairs, and each key-value pair has a timestamp associated with it. The MLens HBase replication tool supported incremental data migration using these timestamps, which are part of the key-value pairs.
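To make the idea of timestamp-based incremental migration concrete, here is a minimal sketch in Python. It is not the MLens implementation: the `Cell` type, the `cells_since` and `next_watermark` helpers, and the in-memory list standing in for parsed HFile contents are all hypothetical names introduced for illustration. It assumes each run remembers the highest timestamp it has already copied and only picks up cells newer than that.

```python
from typing import Iterable, List, NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one HBase key-value pair."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int  # milliseconds since the epoch
    value: bytes


def cells_since(cells: Iterable[Cell], last_sync_ts: int) -> List[Cell]:
    """Return only the cells written after the previous sync point."""
    return [c for c in cells if c.timestamp > last_sync_ts]


def next_watermark(cells: Iterable[Cell], last_sync_ts: int) -> int:
    """Return the sync point to store for the next incremental run."""
    return max((c.timestamp for c in cells), default=last_sync_ts)


# Example: cells as they might come out of a parsed source file (assumed data).
source_cells = [
    Cell(b"row1", b"cf", b"name", 1000, b"alice"),
    Cell(b"row1", b"cf", b"city", 2000, b"pune"),
    Cell(b"row2", b"cf", b"name", 3000, b"bob"),
]

to_copy = cells_since(source_cells, last_sync_ts=1500)
new_watermark = next_watermark(to_copy, last_sync_ts=1500)
```

In this sketch a run with nothing new to copy leaves the sync point unchanged, so the next run starts from the same place.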
Why Knowledge Lens?
By using a set of algorithms to compare HBase columns, the HBase replication tool provided seamless replication between two live HBase clusters. As a result, our client achieved near real-time replication across multiple live HBase clusters.
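The post does not describe the comparison algorithms themselves, so the following Python sketch only illustrates the general idea of comparing columns between two clusters. Everything in it is an assumption made for illustration: the dictionaries standing in for the source and target tables, the `columns_to_replicate` helper, and the rule that a column is copied when it is missing on the target or older there than on the source.

```python
from typing import Dict, Tuple

# (row, column family, qualifier) identifies one column of one row.
ColumnKey = Tuple[bytes, bytes, bytes]
# (timestamp in ms, value) is the latest version held for that column.
Versioned = Tuple[int, bytes]


def columns_to_replicate(
    source: Dict[ColumnKey, Versioned],
    target: Dict[ColumnKey, Versioned],
) -> Dict[ColumnKey, Versioned]:
    """Return source columns that are missing or stale on the target."""
    pending: Dict[ColumnKey, Versioned] = {}
    for key, (ts, value) in source.items():
        existing = target.get(key)
        # Copy when the target has no such column or holds an older version.
        if existing is None or existing[0] < ts:
            pending[key] = (ts, value)
    return pending


# Assumed sample data for the two clusters.
source_table = {
    (b"row1", b"cf", b"name"): (1000, b"alice"),
    (b"row1", b"cf", b"city"): (2000, b"pune"),
    (b"row2", b"cf", b"name"): (3000, b"bob"),
}
target_table = {
    (b"row1", b"cf", b"name"): (1000, b"alice"),
    (b"row1", b"cf", b"city"): (1500, b"mumbai"),
}

pending = columns_to_replicate(source_table, target_table)
```

A real tool would also have to handle deletes and multiple versions per column, which this sketch leaves out.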
Looking for more resources? Read our previous posts in the series here:
Read our customer success stories here:
At Knowledge Lens, we constantly work towards improving our Lenses, so your business can do more for you. Visit us here to learn how you can grow your business operations through data-driven decision making, starting today.
Contributors: Rupak Das, Technology Lead, Knowledge Lens.