Sunday, September 14, 2014

Hadoop Cluster Disaster Recovery Solution 1/2

Hi folks, whenever we think about cluster setup and design, we also think about disaster recovery (DR): how can we protect our data from a cluster crash? Today we discuss the disaster recovery plan for a Hadoop cluster, the steps we can take, and how far we can go in saving our data.

Types of cluster design across data centers

1. Synchronous Data Replication between clusters
2. Asynchronous Data Replication between clusters

Let's talk about synchronous data writing between clusters. Here is the pictorial view of the data center design.



[Figure: data center design for synchronous data replication between the primary and mirror clusters]

  1. When a client writes an HDFS file, after the file is created it starts to request a new block. The Active NameNode of the primary cluster will allocate a new block and select a list of DataNodes for the client to write to. By using the new mirror block placement policy, the Active NameNode can guarantee that one or more remote DataNodes from the mirror cluster are selected at the end of the pipeline.
  2. The primary cluster Active NameNode knows the available DataNodes of the mirror cluster via heartbeats from the mirror cluster's Active NameNode carrying the MIRROR_DATANODE_AVAILABLE command. The most recently reported DataNodes are considered for the mirror cluster pipeline, which is appended to the primary cluster pipeline.
  3. As usual, upon a successful block allocation, the client writes the block data to the first DataNode in the pipeline and passes along the list of remaining DataNodes.
  4. As usual, the first DataNode continues to write to the following DataNode in the pipeline.
  5. The last local DataNode in the pipeline continues to write to the remote DataNode that follows it.
  6. If more than one remote DataNode is selected, the first remote DataNode continues to write to the following DataNode, which is local to it in the mirror cluster. Users also get the flexibility to configure the mirror cluster replication factor; mirror nodes are selected based on the configured replication. A client-side sketch of this write path is shown below.
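
To make the client's part concrete, here is a minimal sketch of an HDFS write using the standard FileSystem API (the NameNode address hdfs://primary-nn:8020 and the file path are assumptions for illustration). The key point is that the mirror block placement policy lives entirely on the NameNode side, so the client code does not change; it simply receives a pipeline that ends with DataNodes from the mirror cluster.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        // Client-side configuration; normally picked up from core-site.xml /
        // hdfs-site.xml on the classpath. The address below is an assumption.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://primary-nn:8020");

        FileSystem fs = FileSystem.get(conf);
        Path file = new Path("/data/events/part-0001.log");

        // create() asks the Active NameNode for the file; each block request
        // then returns a pipeline of DataNodes. With the mirror block placement
        // policy, that pipeline would also contain one or more DataNodes from
        // the mirror cluster -- the client itself is unchanged.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.writeBytes("sample record\n");
        }
        fs.close();
    }
}
```

A packet is acknowledged only after every DataNode in the pipeline, including the remote ones, has received it, which is where the synchronous consistency guarantee (and the extra latency discussed below) comes from.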


Synchronous Namespace Journaling 



  1. As usual, the primary cluster Active NameNode writes the edit logs to the Shared Journal of the primary cluster.
  2. The primary cluster Active NameNode also writes the edit logs to the mirror cluster Active NameNode by using a new JournalManager.
  3. As usual, the primary cluster Standby NameNode tails the edit logs from the Shared Journal of the primary cluster.
  4. The mirror cluster Active NameNode writes the edit logs to the Shared Journal of the mirror cluster after applying the edit logs received from the primary cluster.
  5. As usual, the mirror cluster Standby NameNode tails the edit logs from the Shared Journal of the mirror cluster. A configuration sketch follows this list.
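
As a rough sketch of how this could be wired up, the configuration below combines the standard HA shared-edits setting with mirror-related properties. Note that dfs.namenode.shared.edits.dir is a real HDFS HA property, while the dfs.namenode.mirror.* names, the JournalNode hosts, and the mirror NameNode address are purely hypothetical placeholders for whatever the new mirror JournalManager would expose.

```java
import org.apache.hadoop.conf.Configuration;

public class MirrorJournalConfigSketch {
    public static Configuration build() {
        Configuration conf = new Configuration();

        // Standard HDFS HA setting: the Active NameNode of the primary cluster
        // writes edits to its own shared journal (a QJM quorum in this example).
        conf.set("dfs.namenode.shared.edits.dir",
                 "qjournal://jn1:8485;jn2:8485;jn3:8485/primarycluster");

        // Hypothetical settings for the mirror JournalManager described above.
        // These property names and hosts are illustrative only; they are not
        // part of stock HDFS.
        conf.set("dfs.namenode.mirror.journal.enabled", "true");
        conf.set("dfs.namenode.mirror.namenode.rpc-address", "mirror-nn:8020");

        return conf;
    }
}
```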

Points to Remember

  1. Synchronous data writing is a good choice when the data is critical and we cannot afford to lose consistency at any point in time.
  2. It increases the latency of HDFS writes, which impacts the performance of the Hadoop cluster.
  3. It requires more network bandwidth and a stable link between the data centers to cope with synchronous replication.


