
I have the following configuration set for Spark, but every time I read from or write to a Cassandra table I get an IOException:

    val conf = new SparkConf()
      .setMaster(sparkIp)
      .set("spark.cassandra.connection.host", cassandraIp)
      .set("spark.sql.crossJoin.enabled", "true")
      .set("spark.executor.memory", sparkExecutorMemory)       // 26 GB
      .set("spark.executor.cores", sparkExecutorCore)          // varied from 4 to 8
      .set("spark.executor.instances", sparkExecutorInstances) // 1
      .set("spark.cassandra.output.batch.size.bytes", "2048")
      .set("spark.sql.broadcastTimeout", "2000")
      .set("spark.sql.shuffle.partitions", "1000")
      .set("spark.network.timeout", "80s")
      .set("spark.executor.extraJavaOptions", "-verbose:gc -XX:+UseG1GC")

    sc.cassandraTable[MyCaseClass]("myDatabase", "mytable") // reading code

    dataRDD.saveToCassandra("myDatabase", "mytable") // writing code
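
For completeness, the imports and context setup around this follow the connector's standard usage; the fields of MyCaseClass below are placeholders, not my real schema:

    import org.apache.spark.{SparkConf, SparkContext}
    import com.datastax.spark.connector._ // brings cassandraTable and saveToCassandra into scope

    // Placeholder row type; the real table has more columns
    case class MyCaseClass(id: Int, value: String)

    val sc = new SparkContext(conf) // conf is the SparkConf shown above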

The data volume in these tables is large, and the operations on them are complex as well.

I am using a Spark master with 28 GB of memory and 8 cores, and 10 Spark workers with the same specification; on each worker I give the executor 26 GB of memory and between 4 and 8 cores. Sometimes I also get an ExecutorLostFailure.
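
Concretely, the placeholder values in the conf above come out to roughly this (the cores value is the one I vary between runs):

    // Values behind the conf placeholders shown above
    val sparkExecutorMemory    = "26g" // per executor
    val sparkExecutorCore      = "8"   // varied from 4 to 8
    val sparkExecutorInstances = "1"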

Latest stack trace, from writing data to the Cassandra table:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 145 in stage 6.0 failed 4 times, most recent failure: Lost task 145.6 in stage 6.0 (TID 3268, 10.178.149.48): ExecutorLostFailure (executor 157 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 118434 ms
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1450)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1438)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1437)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1437)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:811)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:811)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1659)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1618)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1607)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Thank you in advance

Akash Sethi
  • What is the trace you are getting? Also, what is the error in the executor logs? – RussS Feb 22 '17 at 16:25
  • @RussS I just updated the stack trace. In the executor logs it's "Connection reset by peer"; I didn't get more info, as I am also printing the GC stack. – Akash Sethi Feb 23 '17 at 04:47
