Questions tagged [hadoop3]

Use for questions specific to Apache Hadoop 3.0 features (e.g., Erasure Coding, YARN Timeline Service v2, Opportunistic Containers, 3+ NameNode fault tolerance). For general questions about Apache Hadoop, use the tag [hadoop].

99 questions
26
votes
5 answers

HDFS_NAMENODE_USER, HDFS_DATANODE_USER & HDFS_SECONDARYNAMENODE_USER not defined

I am new to Hadoop. I'm trying to install Hadoop on my laptop in pseudo-distributed mode. I am running it as the root user, but I'm getting the error below. root@debdutta-Lenovo-G50-80:~# $HADOOP_PREFIX/sbin/start-dfs.sh WARNING: HADOOP_PREFIX has…
Sujata Roy
  • 307
  • 1
  • 5
  • 8
8
votes
4 answers

Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster

Contents of mapred-site.xml: mapreduce.framework.name = yarn, yarn.app.mapreduce.am.env…
CuriousCoder
  • 181
  • 1
  • 4
  • 11
6
votes
1 answer

java.net.ConnectException: Your endpoint configuration is wrong;

I am running a word count program from my Windows machine on a Hadoop cluster that is set up on a remote Linux machine. The program runs successfully and I get output, but I also get the following exception and my waitForCompletion(true) is not…
CuriousCoder
  • 181
  • 1
  • 4
  • 11
6
votes
0 answers

How to integrate Spark 2.2 with Hadoop 3.1 manually?

I want to use Spark 2.2 with the latest Hadoop version, 3.1. Can I integrate Spark and Hadoop manually? I have already installed Spark 2.2 with Hadoop 2.6 or later, but I want to update Hadoop. Is it possible to find Hadoop directory in Spark with…
griez007
  • 420
  • 5
  • 9
6
votes
7 answers

Hadoop : start-dfs.sh Connection refused

I have a Vagrant box running debian/stretch64. I am trying to install Hadoop 3 following the documentation (http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.htm). When I run start-dfs.sh I get this message: vagrant@stretch:/opt/hadoop$…
Bob's Jellyfish
  • 313
  • 3
  • 8
5
votes
1 answer

Spark and Hive in Hadoop 3: Difference between metastore.catalog.default and spark.sql.catalogImplementation

I'm working on a Hadoop cluster (HDP) with Hadoop 3. Spark and Hive are also installed. Since the Spark and Hive catalogs are separate, it is sometimes a bit confusing to know how and where to save data in a Spark application. I know that the property…
D. Müller
  • 2,956
  • 2
  • 26
  • 70
5
votes
1 answer

YARN FairScheduler configuration

The resource model in Hadoop 3 allows us to define custom resource types. I did some googling but couldn't find anything explaining how the YARN FairScheduler can be configured to distribute or isolate these resources among its pools.
mazaneicha
  • 6,760
  • 4
  • 26
  • 42
4
votes
0 answers

Run Docker container through Oozie

I'm trying to build an Oozie workflow that runs a Python script every day; the script needs specific libraries to run. So far I have created a Python virtual environment (using venv) on one node of my cluster (which consists of 11 nodes). Through Oozie I saw…
AGL
  • 106
  • 1
  • 6
4
votes
1 answer

Pig is not running in mapreduce mode (hadoop 3.1.1 + pig 0.17.0)

I am very new to Hadoop. My Hadoop version is 3.1.1 and my Pig version is 0.17.0. Everything works as expected when I run this script in local mode: pig -x local grunt> student = LOAD '/home/ubuntu/sharif_data/student.txt' USING PigStorage(',') as…
sharif2008
  • 2,470
  • 1
  • 17
  • 32
4
votes
1 answer

"start-all.sh" and "start-dfs.sh" from master node do not start the slave node services?

I have updated the /conf/slaves file on the Hadoop master node with the hostnames of my slave nodes, but I'm not able to start the slaves from the master. I have to individually start the slaves, and then my 5-node cluster is up and running. How can…
ingmid
  • 51
  • 3
4
votes
2 answers

If I already have Hadoop installed, should I download Apache Spark WITH Hadoop or WITHOUT Hadoop?

I already have Hadoop 3.0.0 installed. Should I now install the with-hadoop or without-hadoop version of Apache Spark from this page? I am following this guide to get started with Apache Spark. It says Download the latest version of Apache Spark…
JBel
  • 189
  • 1
  • 1
  • 16
4
votes
1 answer

Error applying authorization policy on hive configuration: Couldn't create directory ${system:java.io.tmpdir}\${hive.session.id}_resources

I run Hadoop 3.0.0-alpha1 on Windows and added Hive 2.1.1 to it. When I try to open Hive Beeline with the hive command I get an error: Error applying authorization policy on hive configuration: Couldn't create directory…
Benvorth
  • 6,363
  • 7
  • 40
  • 64
3
votes
1 answer

Where would namenode and datanode be installed if not defined in hdfs-site.xml?

My hdfs-site.xml has ONLY the following: dfs.replication = 1. Question: where would the NameNode and DataNode be installed? I am using…
Gautam De
  • 41
  • 3
3
votes
3 answers

Hadoop-3.1.2: Datanode and Nodemanager shut down

I am trying to install Hadoop (3.1.2) on Windows 10, but the DataNode and NodeManager shut down. I have tried downloading winutils.exe and hadoop.dll and placing them under the bin directory. I have also tried changing the permissions of the files…
Stuxen
  • 615
  • 6
  • 17
3
votes
1 answer

Hadoop/HDFS 3.1.1 (on Java 11) Web UI crash when loading the file explorer?

After start-dfs.sh, I can navigate to http://localhost:9870. The NameNode seems to be running just fine. Then I click on "Utilities -> Browse the file system" and I get this message in the web browser: Failed to retrieve data from…
Martin Andersson
  • 14,356
  • 8
  • 77
  • 106