I am trying to run spark streaming job on EMR with Kinesis. Spark 1.6.1 with Kinesis ASL 1.6.1. Writing a plain sample wordcount example.
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-client</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-kinesis-producer</artifactId>
<version>0.10.2</version>
</dependency>
This throws following exception
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: com/google/protobuf/ProtocolStringList
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.checkAndSubmitNextTask(ShardConsumer.java:157)
at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardConsumer.consumeShard(ShardConsumer.java:126)
Upgrading to 2.0.0-preview
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kinesis-asl_2.10</artifactId>
<version>2.0.0-preview</version>
</dependency>
gives following exception
java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging
at org.apache.spark.streaming.kinesis.KinesisUtils$$anonfun$createStream$1.apply(KinesisUtils.scala:74)