
While running "hadoop namenode -format", the following message appears:

Re-format filesystem in Storage Directory /opt/data/temp/dfs/name ? (Y or N)

What should one enter here, "Y" or "N"?

If I enter Y, will data be lost from HDFS?

earl

1 Answer


This prompt appears only when dfs.namenode.name.dir already exists, i.e., the directory has either been formatted before or an existing directory is mapped to dfs.namenode.name.dir.
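To see which directory is mapped before answering, you can query the configuration; a quick check, assuming Hadoop 2.x (where hdfs getconf and the dfs.namenode.name.dir key are available), and using the /opt/data/temp path from the question:

    # Print the directory (or directories) backing the namenode metadata
    hdfs getconf -confKey dfs.namenode.name.dir

    # Inspect it; if current/ holds an fsimage, it has been formatted before
    ls /opt/data/temp/dfs/name/current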

If you wish to re-format it, enter Y; otherwise enter N.

On entering Y, the directory will be formatted, deleting all the existing metadata (the fsimage and edit logs). Note that the re-format removes only the metadata; the dfs.datanode.data.dir directories on the datanodes must be removed manually. A safe sequence is sketched below.
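A minimal sketch of that sequence, assuming a Hadoop 2.x cluster managed with the stock sbin scripts, and with /opt/data/temp/dfs/data standing in for whatever dfs.datanode.data.dir is actually set to on your datanodes:

    # 1. Stop HDFS so nothing is writing to the metadata or data directories
    stop-dfs.sh

    # 2. Re-format the namenode; answer Y at the prompt
    hdfs namenode -format

    # 3. On EVERY datanode, remove the old block data, which carries the old clusterID
    #    (path is an assumption; use your dfs.datanode.data.dir value)
    rm -rf /opt/data/temp/dfs/data

    # 4. Start HDFS again; datanodes will register with the new clusterID
    start-dfs.sh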

franklinsijo
  • Time and again, I am getting a "Could not obtain block length" issue. There is a source writing records to HDFS, and in the logs I repeatedly see this "Could not obtain block length" error. Also, when I open a file on HDFS, it does not open and instead throws this block exception. When I run hdfs fsck /, it reports / as 'HEALTHY', yet the block length exception still occurs. When I restart the Hadoop daemons, I am able to open the files, but the HDFS writer from that source still throws the block exception – earl Mar 15 '17 at 10:07
  • When I ran "hadoop namenode -format" and selected "Y" at the prompt, the datanode daemons did not start on any of the slaves; only the nodemanager was running there. I then had to delete the directories under hadoop.tmp.dir. Formatting the namenode and restarting the daemons this time started the datanodes, but I lost all my data since I deleted the directories under hadoop.tmp.dir. I am not able to figure out why that block length exception keeps recurring, and what the best sequence of steps is to format the namenode and start the daemons – earl Mar 15 '17 at 10:11
  • What is the source? I suspect the source is not closing the files properly. – franklinsijo Mar 15 '17 at 10:12
  • @jumbo, I have mentioned in my answer that after a format you have to manually delete the data directories before starting the cluster again. Formatting deletes the metadata and generates a new ID for the namenode. The datanodes must have the same clusterID to be part of that cluster, so removal of the data directories is required (see the clusterID check sketched below). If you cannot afford to lose data, you should not format in the first place. – franklinsijo Mar 15 '17 at 10:16
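To verify the clusterID mismatch described in the last comment, compare the VERSION files on the namenode and on a datanode; after a format, a differing clusterID is what keeps the datanodes from joining. The paths below are assumptions based on the directories in this question; substitute your own dfs.namenode.name.dir and dfs.datanode.data.dir values:

    # clusterID recorded by the (re-formatted) namenode
    grep clusterID /opt/data/temp/dfs/name/current/VERSION

    # clusterID recorded by a datanode; if it differs, the datanode will not join
    grep clusterID /opt/data/temp/dfs/data/current/VERSION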