I think it would help me understand Hadoop in detail if I could debug the actual working mechanism of the NameNode and DataNode in either pseudo-distributed or fully distributed mode.

I assume the Apache Hadoop team, or any major developer of Hadoop, HDFS, or MapReduce, can do this, but I cannot find any document that describes the method.

Every document I found explains how to debug MapReduce with Eclipse in a local environment. That means LocalJobRunner is running, so we only see how LocalJobRunner works.

Thanks.

YoonMin Nam

2 Answers

You can find detailed information on this on the Developing Hadoop wiki. It covers topics such as How To Setup Your Development Environment and How To Develop Unit Tests.

HTH

Tariq

To debug the Hadoop daemons themselves (as opposed to MapReduce jobs, which is covered in "How to debug hadoop mapreduce jobs from eclipse?"), you can add Java debug options to the /etc/default/hadoop-daemon-name file for that daemon.

For example, to debug the NameNode, add the following to /etc/default/hadoop-hdfs-namenode:

export HADOOP_OPTS="$HADOOP_OPTS -agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=8000"

Then you can connect remotely to your NameNode on port 8000 from Eclipse. Obviously, remove this option afterwards, as it opens your NameNode to potential abuse from anywhere in the world!
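
As a rough sketch of how the option string above could be parameterised, the snippet below builds it from a port and a suspend flag. `jdwp_opts` is a hypothetical helper, not something Hadoop ships. Setting `suspend=y` makes the JVM wait for a debugger before it runs any code, which may be useful if you want to step through daemon startup.

```shell
#!/bin/sh
# Sketch only: jdwp_opts is a hypothetical helper, not part of Hadoop.
#   $1 = debugger port (defaults to 8000, the port used above)
#   $2 = suspend flag: "n" lets the daemon start immediately,
#        "y" makes the JVM wait until a debugger attaches
jdwp_opts() {
  port="${1:-8000}"
  suspend="${2:-n}"
  printf '%s' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=${suspend},address=${port}"
}

# Append the debug options to whatever HADOOP_OPTS already holds.
HADOOP_OPTS="${HADOOP_OPTS:-} $(jdwp_opts 8000 y)"
export HADOOP_OPTS
echo "$HADOOP_OPTS"
```

If you would rather not reach the debug port directly, one option is to forward it over SSH, for example `ssh -L 8000:localhost:8000 user@namenode-host` (user and host are placeholders), and point Eclipse's Remote Java Application configuration at localhost:8000. On a plain Apache tarball install without /etc/default files, the same string can presumably go into hadoop-env.sh instead; check which OPTS variables your Hadoop version reads.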

byeo