Hadoop 2.0 Name Node, Secondary Node and Checkpoint node for High Availability

Question

After reading the Apache Hadoop documentation, I am a little confused about the responsibilities of the Secondary NameNode and the Checkpoint node.

I am clear on Namenode role and responsibilities:

The NameNode stores modifications to the file system as a log appended to a native file system file, edits. When a NameNode starts up, it reads HDFS state from an image file, fsimage, and then applies edits from the edits log file. It then writes new HDFS state to the fsimage and starts normal operation with an empty edits file. Since NameNode merges fsimage and edits files only during start up, the edits log file could get very large over time on a busy cluster. Another side effect of a larger edits file is that next restart of NameNode takes longer.

But I am confused about how the responsibilities of the Secondary NameNode and the Checkpoint node differ.

Secondary NameNode:

The secondary NameNode merges the fsimage and the edits log files periodically and keeps edits log size within a limit. It is usually run on a different machine than the primary NameNode since its memory requirements are on the same order as the primary NameNode.

Check point node:

The Checkpoint node periodically creates checkpoints of the namespace. It downloads fsimage and edits from the active NameNode, merges them locally, and uploads the new image back to the active NameNode. The Checkpoint node usually runs on a different machine than the NameNode since its memory requirements are on the same order as the NameNode. The Checkpoint node is started by bin/hdfs namenode -checkpoint on the node specified in the configuration file.

The division of responsibility between the Secondary NameNode and the Checkpoint node is not clear to me. Both work on edits. So which one makes the final modification?

On a different note, I have created two bugs in JIRA to remove the ambiguity in these concepts.

issues.apache.org/jira/browse/HDFS-8913 
issues.apache.org/jira/browse/HDFS-8914

score 13 · Accepted Answer · answered Aug 17 '15 at 16:35

NameNode(Primary)

The NameNode stores the metadata of HDFS. The state of HDFS is stored in a file called fsimage, which is the base of the metadata. At runtime, modifications are only written to a log file called edits. On the next start-up of the NameNode, the state is read from fsimage, the changes from edits are applied to it, and the new state is written back to fsimage. After this, edits is cleared and is ready for new log entries.
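To make the start-up sequence concrete, here is a minimal toy sketch in Python. It is not how HDFS is implemented: the real fsimage and edits are binary files, and the names `apply_edit` and `restart` and the tuple format of the edits are assumptions made up for this illustration.

```python
# Toy model only: real fsimage/edits are binary files managed by HDFS,
# not Python dicts and lists. All names here are hypothetical.

def apply_edit(namespace, edit):
    """Apply one logged modification to an in-memory namespace dict."""
    op, path = edit[0], edit[1]
    if op == "create":
        namespace[path] = edit[2]                  # edit[2] is the file's metadata
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        namespace[edit[2]] = namespace.pop(path)   # edit[2] is the new path
    else:
        raise ValueError(f"unknown operation: {op}")


def restart(fsimage, edits):
    """Replay edits over the last image; return (new_fsimage, empty_edits)."""
    namespace = dict(fsimage)        # 1. load the state from fsimage
    for edit in edits:               # 2. replay every logged change
        apply_edit(namespace, edit)
    return namespace, []             # 3. new image, cleared edits log


fsimage = {"/a.txt": {"replication": 3}}
edits = [
    ("create", "/b.txt", {"replication": 2}),
    ("rename", "/a.txt", "/c.txt"),
    ("delete", "/b.txt"),
]
new_fsimage, new_edits = restart(fsimage, edits)
print(new_fsimage)   # {'/c.txt': {'replication': 3}}
print(new_edits)     # []
```

The replay loop is the part whose cost grows with the size of edits, which is why a long-running NameNode with no checkpointing restarts slowly.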

Checkpoint Node

The Checkpoint Node was introduced to solve a drawback of the NameNode: changes are only written to edits and are not merged into fsimage at runtime. If the NameNode runs for a while, edits gets huge, and the next start-up takes even longer because more changes have to be applied to determine the latest state of the metadata.

The Checkpoint Node periodically fetches fsimage and edits from the NameNode and merges them. The resulting state is called a checkpoint. After this, it uploads the result to the NameNode.
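The fetch, merge and upload cycle can be sketched roughly as follows. This is a toy in-process model, not the HDFS implementation: `ToyNameNode`, `ToyCheckpointNode` and `accept_checkpoint` are hypothetical names, and the "download" and "upload" steps are plain Python copies rather than network transfers.

```python
# Toy model only; class and method names are hypothetical, not HDFS APIs.

class ToyNameNode:
    def __init__(self):
        self.fsimage = {}    # last checkpointed namespace: path -> metadata
        self.edits = []      # changes logged since that checkpoint

    def create(self, path, meta):
        self.edits.append(("create", path, meta))   # logged, not merged

    def delete(self, path):
        self.edits.append(("delete", path))

    def accept_checkpoint(self, image, merged_count):
        self.fsimage = image
        # Drop only the edits already folded into the uploaded image.
        self.edits = self.edits[merged_count:]


class ToyCheckpointNode:
    def checkpoint(self, namenode):
        image = dict(namenode.fsimage)     # "download" fsimage
        edits = list(namenode.edits)       # "download" edits
        for edit in edits:                 # merge locally
            if edit[0] == "create":
                image[edit[1]] = edit[2]
            elif edit[0] == "delete":
                image.pop(edit[1], None)
        namenode.accept_checkpoint(image, len(edits))   # "upload" the result


nn = ToyNameNode()
nn.create("/logs/app.log", {"replication": 3})
nn.create("/tmp/scratch", {"replication": 1})
nn.delete("/tmp/scratch")
print(len(nn.edits))   # 3

ToyCheckpointNode().checkpoint(nn)
print(nn.fsimage)      # {'/logs/app.log': {'replication': 3}}
print(nn.edits)        # []
```

In this sketch the merge happens on the checkpoint side, so the NameNode's log stays short without the NameNode having to restart.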

There was also a similar type of node called “Secondary NameNode”, but it doesn’t have the “upload to NameNode” feature, so the NameNode needs to fetch the state from the Secondary NameNode. The name was also confusing because it suggests that the Secondary NameNode takes over requests if the NameNode fails, which isn’t the case.

Backup Node

The Backup Node provides the same functionality as the Checkpoint Node, but is synchronized with the NameNode. It doesn’t need to fetch the changes periodically because it receives a stream of file system edits from the NameNode. It holds the current state in memory and just needs to save this to an image file to create a new checkpoint.
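The difference from the Checkpoint Node can be illustrated with another toy sketch. Again, this is an assumption-laden model rather than HDFS code: `ToyBackupNode`, `on_edit`, `save_checkpoint` and `StreamingNameNode` are made-up names, and the edit stream is a direct method call.

```python
# Toy model only; class and method names are hypothetical, not HDFS APIs.

class ToyBackupNode:
    def __init__(self, initial_image):
        self.namespace = dict(initial_image)   # in-memory copy of the namespace

    def on_edit(self, edit):
        """Apply an edit as soon as the NameNode streams it."""
        if edit[0] == "create":
            self.namespace[edit[1]] = edit[2]
        elif edit[0] == "delete":
            self.namespace.pop(edit[1], None)

    def save_checkpoint(self):
        """No download or merge needed: just persist the current state."""
        return dict(self.namespace)


class StreamingNameNode:
    def __init__(self, backup):
        self.edits = []
        self.backup = backup

    def log_edit(self, edit):
        self.edits.append(edit)     # write to the local edits log
        self.backup.on_edit(edit)   # and stream the same edit to the backup


backup = ToyBackupNode({"/a.txt": {"replication": 3}})
nn = StreamingNameNode(backup)
nn.log_edit(("create", "/b.txt", {"replication": 2}))
nn.log_edit(("delete", "/a.txt"))

print(backup.save_checkpoint())   # {'/b.txt': {'replication': 2}}
```

Because every edit is applied on arrival, creating a checkpoint in this model is just a save of the state the backup already holds.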

Looks like Apache did not document the feature properly. "The Communication Protocols: All HDFS communication protocols are layered on top of the TCP/IP protocol. A client establishes a connection to a configurable TCP port on the NameNode machine. It talks the ClientProtocol with the NameNode. The DataNodes talk to the NameNode using the DataNode Protocol. A Remote Procedure Call (RPC) abstraction wraps both the Client Protocol and the DataNode Protocol. By design, the NameNode never initiates any RPCs. Instead, it only responds to RPC requests issued by DataNodes or clients." — Ravindra babu, Aug 17 '15 at 17:12
Created two bugs in JIRA: https://issues.apache.org/jira/browse/HDFS-8914 and https://issues.apache.org/jira/browse/HDFS-8913. Hope to get better content in the documentation — Ravindra babu, Aug 20 '15 at 15:54
Content sourced from http://morrisjobke.de/2013/12/11/Hadoop-NameNode-and-siblings/ — Andrew, May 11 '16 at 06:48

score 0 · Answer 2 · answered Jan 27 '19 at 05:18

NameNode- It is also known as Master node. Namenode stores meta-data i.e. number of blocks, their location, replicas and other details. This meta-data is available in memory in the master for faster retrieval of data. NameNode maintains and manages the slave nodes, and assigns tasks to them. It should be deployed on reliable hardware as it is the centerpiece of HDFS. Namenode holds its namespace using two files which are as follows:

FsImage: FsImage is an “image file”. It contains the entire filesystem namespace and is stored as a file in the NameNode’s local file system.

EditLogs: It contains all the modifications made to the file system since the most recent FsImage.

Checkpoint node- The Checkpoint node periodically creates checkpoints of the namespace. It first downloads fsimage and edits from the active NameNode, then merges them locally, and finally uploads the new image back to the active NameNode. The Checkpoint node stores the latest checkpoint in a directory that is structured in the same way as the NameNode’s directory. This allows the checkpointed image to always be available for reading by the NameNode.

Backup node- The Backup node provides the same checkpointing functionality as the Checkpoint node. In addition, it keeps an in-memory, up-to-date copy of the file system namespace, which is always synchronized with the active NameNode state. The Backup node does not need to download the fsimage and edits files from the active NameNode in order to create a checkpoint, as would be required with a Checkpoint node or Secondary NameNode, since it already has an up-to-date copy of the namespace in memory. The Backup node checkpoint process is more efficient because it only needs to save the namespace into the local fsimage file and reset edits. The NameNode supports one Backup node at a time. No Checkpoint nodes may be registered if a Backup node is in use.
