17

I have installed Zeppelin 0.7.1. When I tried to run the example Spark program (the one that ships with the Zeppelin Tutorial notebook), I got the following error:

java.lang.NullPointerException
    at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
    at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_2(SparkInterpreter.java:391)
    at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:380)
    at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
    at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:828)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:483)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
    at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I have also set up the config file (zeppelin-env.sh) to point to my Spark installation and Hadoop configuration directory:

export SPARK_HOME="/${homedir}/sk"
export HADOOP_CONF_DIR="/${homedir}/hp/etc/hadoop"

I am using Spark 2.1.0 and Hadoop 2.7.3.

I am also using the default Spark interpreter configuration (so Spark is set to run in local mode).

Am I missing something here?

PS: I am able to connect to Spark from the terminal using spark-shell.

Raj

9 Answers

12

I just found a solution to this issue for Zeppelin 0.7.2.

Root cause: Spark tries to set up a Hive context, but the HDFS services are not running. The HiveContext therefore ends up null, which throws the NullPointerException.

Solution:
1. Set up Spark home [optional] and HDFS.
2. Start the HDFS service.
3. Restart the Zeppelin server.
OR
1. Go to Zeppelin's Interpreter settings.
2. Select the Spark interpreter.
3. Set zeppelin.spark.useHiveContext = false

Vega
Rajeev Rathor
  • The HiveContext did it for me as well! – Benjamin Baron Jan 30 '18 at 14:20
  • 1
    Dear @RajeevRathor and @BenjaminBaron, I don't intend to be rude but I'm sure upvoting the answer serves the same purpose as writing those comments. When you hover over `add a comment`, the popup says `.. Avoid comments like "+1" or "thanks".` FYI, this solution didn't work for me. – y2k-shubham Jan 31 '18 at 06:51
9

I finally found the cause. When I checked the logs in the ZL_HOME/logs directory, the problem turned out to be a Spark driver binding error. I added the following property in the Spark Interpreter Binding and it works fine now.

[screenshot of the added property; image not available]

PS: This issue seems to come up mainly when you are connected to a VPN, and I do connect to one.
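
For anyone who wants to do the same log check, here is a minimal sketch. The function name zeppelin_root_causes is made up for illustration, and it assumes the log files end in .log and use the "ERROR [timestamp]" line format seen in Zeppelin logs. It only filters the logs; it does not fix anything.

```shell
# Hypothetical helper (not part of Zeppelin): print the log lines that
# usually name the real failure behind the NullPointerException.
# $1 is the log directory, e.g. the logs folder of your Zeppelin install.
zeppelin_root_causes() {
    log_dir="$1"
    # -h drops file names; -E allows alternation.
    # Matches lines starting with ERROR and any "Caused by:" line.
    grep -hE '^ *ERROR|Caused by:' "$log_dir"/*.log 2>/dev/null
}
```

Call it as zeppelin_root_causes "$ZEPPELIN_HOME/logs" (adjust the path to your install) and read the last few "Caused by:" lines first.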

Raj
  • I too saw this problem appear on Zeppelin 0.8 running on a VM; everything was working fine until one of the VM's (unrelated) network adapters changed address, and only a restart got Zeppelin back to work... – jmng May 27 '19 at 15:19
2

Did you set the right SPARK_HOME? I was just wondering what sk is in your export SPARK_HOME="/${homedir}/sk".

(I just wanted to comment below your question but couldn't, due to my lack of reputation)
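
A quick way to sanity-check the directory is sketched below. The function name looks_like_spark_home is hypothetical, and the only assumption is that a standard Spark distribution has an executable bin/spark-submit.

```shell
# Hypothetical helper: succeed only if the directory looks like a Spark install.
# It just tests for an executable bin/spark-submit under the given path.
looks_like_spark_home() {
    [ -n "$1" ] && [ -x "$1/bin/spark-submit" ]
}
```

For example: looks_like_spark_home "$SPARK_HOME" || echo "SPARK_HOME does not point at a Spark install"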

  • yes, that is where I have installed Spark :). that is the Home directory of Spark installation – Raj Apr 09 '17 at 02:51
0

I solved it by opening common.sh in the bin directory of zeppelin-0.6.1 and adding this line at the top of the file:

unset CLASSPATH

Mmagdy
0
    Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
        ... 74 more
)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:466)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        ... 71 more
 INFO [2017-11-20 17:51:55,288] ({pool-2-thread-4} SparkInterpreter.java[createSparkSession]:369) - Created Spark session with Hive support
ERROR [2017-11-20 17:51:55,290] ({pool-2-thread-4} Job.java[run]:181) - Job failed

It looks like the Hive Metastore service has not been started. Start the Metastore service and try again:

hive --service metastore
Joshua D. Boyd
0

I was getting exactly the same exception with Zeppelin 0.7.2 on Windows 7. I had to make several configuration changes to get it working.

First, rename zeppelin-env.cmd.template to zeppelin-env.cmd; the file is in the %ZEPPELIN_HOME%/conf folder. Then add the PYTHONPATH environment variable to it:

set PYTHONPATH=%SPARK_HOME%\python;%SPARK_HOME%\python\lib\py4j-0.10.4-src.zip;%SPARK_HOME%\python\lib\pyspark.zip

Open zeppelin.cmd in %ZEPPELIN_HOME%/bin and set %SPARK_HOME% and %ZEPPELIN_HOME% as the first lines of the script. I left %SPARK_HOME% blank because I was using the embedded Spark library. I added %ZEPPELIN_HOME% to make sure it is set at the very start of startup.

set SPARK_HOME=
set ZEPPELIN_HOME=<PATH to zeppelin installed folder>

Next, copy all the jars and the pyspark directory from %SPARK_HOME% into the Zeppelin folder:

cp %SPARK_HOME%/jar/*.jar %ZEPPELIN_HOME%/interpreter/spark
cp %SPARK_HOME%/python/pyspark %ZEPPELIN_HOME%/interpreter/spark/pyspark

I had not been starting interpreter.cmd before accessing the notebook, and that was causing the NullPointerException. I opened two command prompts and started zeppelin.cmd in one and interpreter.cmd in the other.

interpreter.cmd needs two additional command-line arguments: a port and the path to Zeppelin's local_repo. You can find the local_repo path on Zeppelin's Spark interpreter page; use exactly that path when starting interpreter.cmd.

interpreter.cmd  -d %ZEPPELIN_HOME%\interpreter\spark\ -p 5050  -l %ZEPPELIN_HOME%\local-repo\2D64VMYZE

The host and port must then be specified on the Spark interpreter page in the Zeppelin UI. Select the Connect to external Process option:

HOST : localhost
PORT : 5050

Once all of this configuration is in place, save and restart the Spark interpreter. Create a new notebook and type sc.version; it should print the Spark version. Note that Zeppelin 0.7.2 doesn't support Spark 2.2.1.

Soumyajit Swain
0

On AWS EMR the issue was memory. I had to manually set a lower value for spark.executor.memory in the Spark interpreter settings using the Zeppelin UI.

The right value depends on your instance size. The best approach is to check the logs in the /mnt/var/log/zeppelin/ folder.

In my case the underlying error was:

Error initializing SparkContext.
java.lang.IllegalArgumentException: Required executor memory (6144+614 MB) is above the max threshold (6144 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.

That helped me understand why it was failing and what I can do to fix it.

Note:

This happened because I was starting an instance with HBase which limits the available memory. See the defaults for instance size here.
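
The arithmetic behind that error can be sketched in shell. This is only an illustration with made-up function names, and it assumes Spark's default YARN overhead rule (10% of executor memory, with a 384 MB minimum), which matches the 6144+614 in the log above. If you override the overhead setting, the numbers will differ.

```shell
# Overhead Spark adds on YARN, under the assumed default rule:
# 10% of executor memory (integer MB), but never less than 384 MB.
executor_overhead_mb() {
    overhead=$(( $1 / 10 ))
    if [ "$overhead" -lt 384 ]; then
        overhead=384
    fi
    echo "$overhead"
}

# Succeeds when executor memory plus overhead fits under the YARN maximum.
# $1 = spark.executor.memory in MB
# $2 = yarn.scheduler.maximum-allocation-mb
executor_fits() {
    total=$(( $1 + $(executor_overhead_mb "$1") ))
    [ "$total" -le "$2" ]
}
```

With the values from my log, executor_fits 6144 6144 fails, while something like executor_fits 5120 6144 passes, which is why lowering spark.executor.memory helped.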

Dusan Vasiljevic
  • I downvoted because while this may be useful, this error is not part of the stacktrace and it is not part of the question. – Marc-Olivier Titeux Jan 14 '19 at 17:21
  • @Marc-OlivierTiteux I beg to differ. The `NullPointerException` does happen when you run the application on Zeppelin. Namely, that is the exception you get on the front end. When you dig into logs, as I did, you can see that `NullPointerException` was masking the `IllegalArgumentException` I mentioned above. If you are having the same issue as a poster, you should check the logs in the path above. – Dusan Vasiljevic Jan 14 '19 at 21:14
  • very different in my case. A custom package was installed on the cluster and triggered the error. I am not saying you have the same pattern. I am saying that the answer does not match the stacktrace: the OP did not have this. – Marc-Olivier Titeux Jan 16 '19 at 08:26
  • 1
    @Marc-OlivierTiteux It did match. The `NullPointerException` on the frontend can be caused by *multiple* other issues, one of which is the one I had. To reiterate: I had the **exact same stacktrace as the poster**, but when you dig into logs, you can see that `NullPointerException` was caused by other service failing with `IllegalArgumentException` where the `Zeppelin` expected the result to not be `null`. You are basically penalising me because your `NullPointerException` wasn't solved by my answer. The OP didn't accept any of the answers, so should they all get negative points? – Dusan Vasiljevic Jan 18 '19 at 00:20
0

Check whether your NameNode has gone into safe mode.

You can check with the command below:

sudo -u hdfs hdfs dfsadmin -safemode get

To leave safe mode, use this command:

sudo -u hdfs hdfs dfsadmin -safemode leave
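
If you want to script the check, a minimal sketch follows. The function name safemode_is_on is made up, and it assumes the get command prints a line containing "Safe mode is ON" when safe mode is active.

```shell
# Hypothetical helper: reads the output of `hdfs dfsadmin -safemode get`
# on stdin and succeeds if it reports that safe mode is ON.
safemode_is_on() {
    grep -q 'Safe mode is ON'
}
```

Usage: sudo -u hdfs hdfs dfsadmin -safemode get | safemode_is_on && echo "NameNode is in safe mode"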
-2

This seems to be a bug in Zeppelin 0.7.1. It works fine in 0.7.2.

Walker Rowe