13

I'm going completely crazy:

I installed Hadoop/HBase and everything is running:

/opt/jdk1.6.0_24/bin/jps
23261 ThriftServer
22582 QuorumPeerMain
21969 NameNode
23500 Jps
23021 HRegionServer
22211 TaskTracker
22891 HMaster
22117 SecondaryNameNode
21779 DataNode
22370 Main
22704 JobTracker

This is a pseudo-distributed environment.

hbase shell

works and returns correct results for 'list' and 'status':

hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.90.1-cdh3u0, r, Fri Mar 25 16:10:51 PDT 2011

hbase(main):001:0> status
1 servers, 0 dead, 8.0000 average load

When connecting via Ruby and Thrift, everything works: we add data, it gets into the system, and we can query and scan it.

However, when connecting with Java:

groovy> import org.apache.hadoop.hbase.HBaseConfiguration 
groovy> import org.apache.hadoop.hbase.client.HBaseAdmin 
groovy> conf = HBaseConfiguration.create() 
groovy> conf.set("hbase.master","127.0.0.1:60000"); 
groovy> hbase = new HBaseAdmin(conf); 

Exception thrown

org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:1000)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:303)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:294)
    at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:156)
    at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:84)

I've been trying to find the cause, but I really have no clue at all. Everything seems to be correctly installed.

netstat -lnp|grep 60000
tcp6       0      0 :::60000                :::*                    LISTEN      22891/java  

Looks fine as well.

# telnet localhost 60000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

It connects, then the connection dies if you type anything and press Enter (not sure if that's expected; Thrift on 9090 does the same).

Can anyone help me?

CharlesS

5 Answers

9

This is a ZooKeeper (ZK) error. The HBase client tries to get the /hbase node from ZooKeeper and fails.

You can get a ZK dump from the HBase master web interface. You should see all the connections to ZK and figure out if something is exhausting them.

Before diving into anything else you could try restarting your ZK cluster and see if it fixes your problem. (It's strange that you see that with a single client).
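As a complement to the ZK dump in the master web interface, ZooKeeper itself answers short "four-letter word" commands on its client port. The sketch below is not part of the original answer: it assumes ZooKeeper listens on 127.0.0.1:2181 (the usual client port, not stated in the question) and that the commands are enabled, which newer ZooKeeper releases only do for whitelisted commands. The class name `ZkFourLetterProbe` is made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class ZkFourLetterProbe {
    /**
     * Sends a four-letter command (for example "ruok" or "cons") to a
     * ZooKeeper server and returns the raw reply, or null if the server
     * could not be reached or did not answer within the timeout.
     */
    static String send(String host, int port, String command, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            socket.setSoTimeout(timeoutMs);
            socket.getOutputStream().write(command.getBytes(StandardCharsets.US_ASCII));
            socket.getOutputStream().flush();
            // The server closes the connection after replying, so read until EOF.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream reply = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int n;
            while ((n = in.read(buffer)) != -1) {
                reply.write(buffer, 0, n);
            }
            return new String(reply.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            return null; // unreachable, refused, or timed out
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass host and port as arguments to override them.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 2181;
        System.out.println("ruok -> " + send(host, port, "ruok", 3000)); // health check
        System.out.println("cons -> " + send(host, port, "cons", 3000)); // open connections
    }
}
```

A null reply from the same JVM and machine that runs the failing client suggests the problem is reaching ZooKeeper at all, rather than exhausted connections.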

HBase has a setting that raises the maximum number of concurrent connections a single host may open to ZK:

hbase.zookeeper.property.maxClientCnxns

There have been a few recent changes (see below) to the default number of connections (the hbase-default.xml file lists all the default settings). You can override the default in your hbase-site.xml file (in the HBase conf directory) and raise it to 100 or more. But make sure you are not masking the real problem this way; you shouldn't see this error with a single client.

We've had a similar situation, but it was happening during heavy operations from map-reduce jobs, after upgrading to HBase-0.90.

Here are a couple of issues related to your problem:

If you still can't figure it out, send an email to the hbase-users list or join the #hbase channel on freenode and ask live questions.

Nevermore
Cosmin Lehene
3

This happens when the client uses a wrong value for "zookeeper.znode.parent", either in the hbase-site.xml read on the client side or, with custom code written against the API, because the property was set to the wrong location. For example, if the cluster is set up with "zookeeper.znode.parent" set to "/hbase-unsecure" but the client uses "/hbase" (the stock HBase default), the client looks under a znode that does not match the cluster, and we get this exception when trying to connect to the HBase cluster.
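To compare what the client is actually configured with against the cluster, it can help to print the ZooKeeper-related properties from the client's hbase-site.xml. The following is a minimal sketch using only the JDK's XML parser; the class name `HBaseSiteReader` is hypothetical, and the path to hbase-site.xml is passed in because it differs between installations.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HBaseSiteReader {
    /** Returns every <property> name/value pair found in a Hadoop-style configuration document. */
    static Map<String, String> readProperties(String xml) {
        Map<String, String> result = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = textOf(property, "name");
                String value = textOf(property, "value");
                if (name != null && value != null) {
                    result.put(name, value);
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("not a readable configuration document", e);
        }
        return result;
    }

    private static String textOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to the hbase-site.xml that the client reads
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Map<String, String> props = readProperties(xml);
        String[] keys = {
            "zookeeper.znode.parent",
            "hbase.zookeeper.quorum",
            "hbase.zookeeper.property.clientPort"
        };
        for (String key : keys) {
            // A missing key means the built-in default from hbase-default.xml applies.
            System.out.println(key + " = " + props.getOrDefault(key, "(not set, default applies)"));
        }
    }
}
```

If the printed "zookeeper.znode.parent" differs from the value on the cluster side, that mismatch is the first thing to fix.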

3

Step 1: First, check whether the HBase Master is running by using the "jps" command.

Step 2: Use the "stop-all.sh" command to stop all running services on the Hadoop cluster.

Step 3: Use the "start-all.sh" command to start all services again.

Step 4: Use the "jps" command to check the services. If it shows the HBase Master (HMaster) running, you are done; otherwise continue with the steps below.

Step 5: Switch to the root user with "sudo su".

Step 6: Go to the HBase bin directory and start HBase: "cd /usr/lib/hbase-1.2.6-hadoop/bin", then "./start-hbase.sh".

Step 7: Open the HBase shell with the "hbase shell" command.

Step 8: Run the "list" command.

For more information about this issue:

http://commandstech.com/hbase-error-keeperrorcode-connectionloss-for-hbase-in-cluster/

Spandana r
3

The problem was actually that (for some reason; I don't really understand it in detail) the firewall was blocking one of the ports required to talk to ZooKeeper: from the command line it worked, from my app it didn't. Once I disabled the firewall, everything suddenly worked.
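A quick way to spot this kind of blocking is to try plain TCP connections from the same JVM that runs the application. The sketch below is an illustration, not something from the original setup: the class name `PortProbe` is made up, port 60000 is the master port from the question, and 2181 (ZooKeeper client port) and 60020 (region server) are assumed defaults for this HBase generation that may differ on your cluster.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    /** Returns true if a TCP connection to host:port can be opened within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused connections and timeouts (typical for dropped packets) both end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        // Assumed ports: ZooKeeper client, HBase master, HBase region server.
        int[] ports = {2181, 60000, 60020};
        for (int port : ports) {
            System.out.println(host + ":" + port + " reachable = " + canConnect(host, port, 2000));
        }
    }
}
```

Running it once from the shell and once from inside the application's environment shows whether the two really see the same ports.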

Thank you for your help!

CharlesS
1

I had the same issue connecting to my HBase DB.

It turned out I had a wrong address for the DB machine in my /etc/hosts.
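To see what a host name resolves to from Java's point of view (the JVM uses the system resolver, which normally consults /etc/hosts), a small lookup like the following can help. It is only a sketch; the class name `HostLookup` is hypothetical, and the host names to check are whatever your HBase and ZooKeeper machines are called.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

class HostLookup {
    /** Returns all addresses the system resolver reports for host, or an empty list if it is unknown. */
    static List<String> resolve(String host) {
        List<String> addresses = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                addresses.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Unknown host: return the empty list.
        }
        return addresses;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Show how this machine's own name resolves, then any names given as arguments.
        String local = InetAddress.getLocalHost().getHostName();
        System.out.println(local + " -> " + resolve(local));
        for (String host : args) {
            System.out.println(host + " -> " + resolve(host));
        }
    }
}
```

An address in the output that is not the one the DB machine actually listens on points to a stale or wrong /etc/hosts entry.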

Blorgbeard