Use for questions specific to Apache Hadoop 3.0 features (e.g. Erasure Coding, YARN Timeline Service v2, Opportunistic Containers, 3+ NameNode fault tolerance). For general questions related to Apache Hadoop, use the tag [hadoop].
Questions tagged [hadoop3]
99 questions
0 votes, 0 answers
Hadoop trying to use JDK install directory as executable command
I am new to Hadoop, and attempting to get the first simple 'Word count' example to run.
I have the same problem that was reported here (but the responses there don't resolve the problem):
Could not run jar file in hadoop3.1.3
Java is installed to…
0 votes, 0 answers
Apache Spark 2.4 compatibility with Hadoop 3.2 environment
We are trying to run Spark 2.4.x on Hadoop 3.2,
but we have noticed that Databricks only has prebuilt versions for Hadoop 2.7/2.6 for Spark 2.4.x.
Here is the download page link from the official Databricks site: https://spark.apache.org/downloads.html
So…
linehrr
0 votes, 0 answers
How to build Hive from source against Hadoop 3.x with a custom JDBC version that supports connecting to HiveServer 2.x
I have looked at:
https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-Building
As…
Cod_enthu
0 votes, 1 answer
Hadoop3 balancer vs disk balancer
I read the Hadoop 3 documentation about the disk balancer, and it says:
"Diskbalancer is a command line tool that distributes data evenly on all disks of a datanode.
This tool is different from Balancer which takes care of cluster-wide data balancing."
I…
eyeballs
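The distinction quoted in this question can be illustrated with a toy model. This is only a sketch, not Hadoop code: the node names, the per-disk percentages, and the helper functions are all hypothetical, and disks are assumed to be equal in size. It shows how a cluster can look perfectly balanced across DataNodes (the cluster-wide Balancer's concern, run with `hdfs balancer`) while the disks inside one DataNode are badly skewed (the disk balancer's concern, run with `hdfs diskbalancer`).

```python
# Toy model, not Hadoop code: each DataNode maps to the percentage used
# on each of its disks. All names and numbers here are hypothetical.
cluster = {
    "dn1": [90, 10],  # node average is 50%, but the two disks are skewed
    "dn2": [50, 50],  # node average is 50%, and the disks are even
}


def node_utilization(disks):
    """Average utilization of one DataNode (assumes equal-sized disks)."""
    return sum(disks) / len(disks)


def cluster_spread(nodes):
    """Gap between the most and least utilized DataNodes.

    This is the kind of imbalance a cluster-wide balancer looks at.
    """
    averages = [node_utilization(disks) for disks in nodes.values()]
    return max(averages) - min(averages)


def disk_spread(disks):
    """Gap between the fullest and emptiest disk inside one DataNode.

    This is the kind of imbalance an intra-node disk balancer looks at.
    """
    return max(disks) - min(disks)


print("cluster-wide spread:", cluster_spread(cluster))
for name, disks in cluster.items():
    print(name, "intra-node disk spread:", disk_spread(disks))
```

In this model the cluster-wide spread is zero, so a cluster-level balancer would see nothing to do, yet dn1 still has one nearly full disk and one nearly empty disk.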
0 votes, 0 answers
Hadoop web interface is not working even though the nodes are starting
I'm trying to install Hadoop v3.1.3 in pseudo-distributed mode in my Ubuntu 18.04 environment.
After following the documentation word for word, my web interface is still not working, i.e. localhost:9870 yields no result. Log files are getting created…
Utsav
0 votes, 0 answers
Hadoop Distcp CopyListing$DuplicateFileException issue
Running a DistCp job with the DistCpOptions below:
optionsDistCpOptions{atomicCommit=false, syncFolder=true, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=true, useRdiff=false, fromSnapshot=snap1, toSnapshot=snap2,…
0 votes, 1 answer
Hadoop 3.2.1 localhost: ERROR: You must be a privileged user in order to run a secure service
I am trying to install a simple Hadoop setup on Ubuntu 20 on Windows WSL. I am able to get the NameNode and YARN running, but the DataNode is failing.
I get the following error when running start-dfs.sh:
hadoopuser@mycompu:~/hadoop$…
virtuvious
0 votes, 1 answer
Not able to create Hive table using Sqoop
I am trying the command below to import the MySQL table stocks into Hive (v3.1.2) on Ubuntu 18.04 and Hadoop 3 using Sqoop (v1.4.7):
sqoop import --connect jdbc:mysql://localhost/myhadoop --username hiveuser --password xxx --table stocks --bindir…
suneesh
0 votes, 0 answers
Hadoop YARN resource manager not able to start due to error
I am trying to run Hadoop (HDFS and YARN) in a multi-node cluster (2 nodes), but the resource manager fails to start on the slave node. Basically, it fails due to the exception below: it is not able to find a class called javax.activation.DataSource (which is…
Learner
0 votes, 1 answer
Sqoop is not working on my Ubuntu 18.04 with Hadoop 3.1.3
I am getting the error below on my Ubuntu (18.04) machine while launching Sqoop (1.4.7, Hadoop 3.1.3).
Command used:
sqoop import --connect jdbc:mysql://localhost/myhadoop --username hiveuser --password xxxx --table employee --split-by --target-dir…
suneesh
0 votes, 1 answer
Hadoop Client unable to connect to datanode
I have a single-node Hadoop cluster on EC2. I tried all possible combinations in the slaves file.
May 01 2020 08:16:25.227 DEBUG org.apache.hadoop.hdfs.DFSClient - pipeline = 172.31.45.114:9866
May 01 2020 08:16:25.227 DEBUG…
Vishnu Gopal Singhal
0 votes, 0 answers
Could not find or load main class hdfs problem
I am trying to use Apache Rya for some tests (https://rya.apache.org/).
For those who are familiar with Rya and RDF stores, I am trying to do a bulk loading which is explained here:…
M.Taki_Eddine
0 votes, 0 answers
ERROR: datanode can only be executed by harry
I want to start everything (NameNode and DataNode), but when I ran start-all.sh it returned:
ERROR: datanode can only be executed by harry
How to fix this?
Key Jun
0 votes, 1 answer
Secondary Name Node in Hadoop
Suppose the default checkpoint interval is 1 hour.
If the NameNode goes down 55 minutes after the last checkpoint,
do we lose the last 55 minutes of data (since the edit log data has not yet been merged into the fsimage)?
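One way to reason about this question is with a toy model of checkpointing. This is a sketch, not HDFS code: the `replay` function, the paths, and the operation names are hypothetical. It assumes that every namespace change is written to the edit log as it happens, that a checkpoint only merges that log into a new fsimage, and that the edit log on the NameNode's disk survives the crash. Under those assumptions, the state after a restart is the last fsimage plus the replayed edits.

```python
def replay(fsimage, edits):
    """Return the namespace obtained by applying logged edits to a checkpoint.

    fsimage: set of paths that existed at the last checkpoint.
    edits:   ordered list of (operation, path) tuples logged since then.
    """
    namespace = set(fsimage)
    for op, path in edits:
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
    return namespace


# State at the last checkpoint (hypothetical paths).
fsimage = {"/data/a"}

# Changes logged during the 55 minutes after that checkpoint.
edits = [
    ("create", "/data/b"),
    ("delete", "/data/a"),
    ("create", "/data/c"),
]

# On restart, the changes are recovered from the edit log even though
# no checkpoint covered them.
recovered = replay(fsimage, edits)
print(sorted(recovered))
```

In this model the 55 minutes of changes would only be lost if the edit log itself were lost, which is a separate failure from missing a checkpoint.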
0 votes, 1 answer
Dask - trying to read hdfs data getting error ArrowIOError: HDFS file does not exist
I tried creating a dataframe from a CSV file stored in HDFS. The connection is successful, but when I try to get the output of the len function, I get an error.
Code:
from dask_yarn import YarnCluster
from dask.distributed import Client, LocalCluster
import…
Nirmal Ram