
I'm using emr-5.4.0 with Spark 2.1.0. I understand what a NullPointerException is; this question is about why one was thrown in this particular case.

I can't figure out why I got a NullPointerException in the driver thread.

I got this weird job failing with this error:

18/03/29 20:07:52 INFO ApplicationMaster: Starting the user application in a separate Thread
18/03/29 20:07:52 INFO ApplicationMaster: Waiting for spark context initialization...
Exception in thread "Driver" java.lang.NullPointerException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:637)
18/03/29 20:07:52 ERROR ApplicationMaster: Uncaught exception:
java.lang.IllegalStateException: SparkContext is null but app is still running!
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:415)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:254)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:766)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:764)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
18/03/29 20:07:52 INFO ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: java.lang.IllegalStateException: SparkContext is null but app is still running!)
18/03/29 20:07:52 INFO ApplicationMaster: Deleting staging directory hdfs://<ip-address>.ec2.internal:8020/user/hadoop/.sparkStaging/application_1522348295743_0010
18/03/29 20:07:52 INFO ShutdownHookManager: Shutdown hook called
End of LogType:stderr

I submitted the job like this:

spark-submit \
  --deploy-mode cluster \
  --master yarn \
  --num-executors 40 \
  --executor-cores 16 \
  --executor-memory 100g \
  --driver-cores 8 \
  --driver-memory 100g \
  --class <package.class_name> \
  --jars <s3://s3_path/some_lib.jar> \
  <s3://s3_path/class.jar>

And my class looks like this:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

class MyClass {

  def main(args: Array[String]): Unit = {
    val c = new MyClass()
    c.process()
  }

  def process(): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    ....
  }

  ...
}
Suanmeiguo
  • Possible duplicate of [What is a NullPointerException, and how do I fix it?](https://stackoverflow.com/questions/218384/what-is-a-nullpointerexception-and-how-do-i-fix-it) – Johannes Kuhn Mar 30 '18 at 08:07
  • This could be a dupe, but since @JacekLaskowski gave a new approach very specific to [apache-spark], I'm not closing it. – eliasah Apr 17 '18 at 06:59

1 Answer


Change class MyClass to object MyClass and you're done.

While we're at it, I'd also change it to object MyClass extends App and remove def main(args: Array[String]): Unit entirely, since extends App provides it.
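A minimal sketch of the fixed entry point, reusing the names and setup from the question (the body beyond that is illustrative):

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// An `object` makes the Scala compiler emit a static `main` forwarder,
// which is what the reflective lookup in ApplicationMaster expects.
// (Alternatively: `object MyClass extends App` and drop `main` entirely.)
object MyClass {

  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName("my-test")
    val sparkSession: SparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
    import sparkSession.implicits._
    // ... processing logic from the question ...
  }
}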

I've reported an improvement for Spark 2.3.0 - [SPARK-23830] Spark on YARN in cluster deploy mode fail with NullPointerException when a Spark application is a Scala class not object - so the error is reported nicely to the end user.


Digging deeper into how Spark on YARN works, the following message is printed when the ApplicationMaster of a Spark application starts the driver (you used --deploy-mode cluster --master yarn with spark-submit):

ApplicationMaster: Starting the user application in a separate Thread

Right after the INFO message you should see another:

ApplicationMaster: Waiting for spark context initialization...

This is part of the driver initialization when the ApplicationMaster runs.

The reason for the exception Exception in thread "Driver" java.lang.NullPointerException is due to the following code:

val mainMethod = userClassLoader.loadClass(args.userClass)
  .getMethod("main", classOf[Array[String]])

getMethod cannot return null here (it would throw NoSuchMethodException instead), and the stack trace shows the NPE coming from inside Method.invoke. Because main is defined in a Scala class, it compiles to an instance method with no static forwarder, so the following line throws NullPointerException when it tries to invoke that instance method with a null receiver:

mainMethod.invoke(null, userArgs.toArray)
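You can reproduce the same failure mode outside Spark with a few lines of Scala (a standalone sketch; Greeter and Repro are hypothetical names):

import java.lang.reflect.Method

// `main` on a class (not an object) is an instance method,
// so there is no static forwarder to invoke.
class Greeter {
  def main(args: Array[String]): Unit = println("hello")
}

object Repro {
  def main(args: Array[String]): Unit = {
    val m: Method = classOf[Greeter].getMethod("main", classOf[Array[String]])
    // Throws java.lang.NullPointerException: invoking an instance
    // method with a null receiver is illegal. With `object Greeter`
    // the compiler would emit a static `main` and this would succeed.
    m.invoke(null, Array.empty[String])
  }
}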

The thread is indeed called Driver (as in Exception in thread "Driver" java.lang.NullPointerException), as set in these lines:

userThread.setContextClassLoader(userClassLoader)
userThread.setName("Driver")
userThread.start()
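That name is exactly what shows up in the exception header. The same happens with any renamed thread (an illustrative sketch, not Spark code):

object ThreadNameDemo {
  def main(args: Array[String]): Unit = {
    val t = new Thread(new Runnable {
      // Any uncaught exception is reported under the thread's name.
      override def run(): Unit = throw new NullPointerException()
    })
    t.setName("Driver")
    t.start()
    // Prints: Exception in thread "Driver" java.lang.NullPointerException
  }
}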

The line numbers may differ since I referenced the Spark 2.3.0 sources, while you use emr-5.4.0 with Spark 2.1.0.

Jacek Laskowski
  • Thanks. There are several pitfalls when launching the process that can lead to such NPEs, like not including the assembly jar of the task you are launching. – fracca Jun 18 '19 at 11:59