Questions tagged [spark-cassandra-connector]

Connects Apache Spark and Cassandra for clustered queries

Summary

Get lightning-fast cluster computing with Spark and Cassandra. This library lets you expose Cassandra tables as Spark RDDs, write Spark RDDs to Cassandra tables, and execute arbitrary CQL queries in your Spark applications.

839 questions
-1 votes, 1 answer

Spark find previous value on each iteration of RDD

I have the following code: val rdd = sc.cassandraTable("db", "table").select("id", "date", "gpsdt").where("id=? and date=? and gpsdt>? and gpsdt<?", …).map(row => row.get[String]("gpsdt"),…
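
A connector read of this shape typically looks as follows; a minimal sketch assuming the db keyspace and table from the excerpt, hypothetical bind values, and one common (not necessarily the asker's) way to pair each gpsdt with its predecessor:

    import com.datastax.spark.connector._

    // Push the column selection and key predicates down to Cassandra
    val gpsdts = sc.cassandraTable("db", "table")
      .select("id", "date", "gpsdt")
      .where("id = ? and date = ?", "device-1", "2018-03-01") // hypothetical bind values
      .map(row => row.get[String]("gpsdt"))

    // Pair each value with the previous one; fine for small results only,
    // since this collects to the driver
    val previousPairs = gpsdts.collect().sorted.sliding(2).toList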
jAi
-1 votes, 1 answer

java.lang.ClassNotFoundException: com.datastax.spark.connector.japi.rdd.CassandraTableScanJavaRDD

I am using Kafka Spark Streaming along with Cassandra. When I run my Java class from Eclipse it works fine, but when I build it with Maven and execute it on the Spark shell it throws the exception below. java.lang.NoClassDefFoundError:…
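
This pattern (works in Eclipse, fails on the cluster) usually means the connector jar is on the driver's classpath but never shipped to the executors. A sketch of one common fix, assuming Spark 2.x on Scala 2.11 and a hypothetical main class; building a fat jar with the maven-shade-plugin is the other standard route:

    # --packages resolves the connector and distributes it to every executor
    spark-submit \
      --class com.example.MyStreamingJob \
      --packages com.datastax.spark:spark-cassandra-connector_2.11:2.3.0 \
      target/my-streaming-job.jar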
-1 votes, 2 answers

Read blob types from cassandra in spark with spark-cassandra-connector

I need to read Cassandra blob types in Spark with spark-cassandra-connector and compare two datasets based on a blob field. For example, the following code shows what I mean: // Cassandra Table CREATE TABLE keyspace.test ( id bigint, info blob, PRIMARY…
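
The connector's type converters expose a blob column as Array[Byte]; the catch when comparing two datasets is that array equality must be element-wise. A minimal sketch against the keyspace.test table from the excerpt (the self-join is purely illustrative):

    import com.datastax.spark.connector._

    // blob maps to Array[Byte] through the connector's converters
    val byId = sc.cassandraTable("keyspace", "test")
      .select("id", "info")
      .map(row => (row.get[Long]("id"), row.get[Array[Byte]]("info")))

    // == on arrays compares references; use Arrays.equals for contents
    val sameInfo = byId.join(byId)
      .filter { case (_, (a, b)) => java.util.Arrays.equals(a, b) }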
Moein Hosseini
-1 votes, 2 answers

can not saveToCassandra in Spark

I want to read files from HDFS and save them to Cassandra. import org.apache.spark.{SparkConf, SparkContext} import com.datastax.spark.connector._ val conf = new SparkConf().setMaster("local[2]").setAppName("test") .set("spark.cassandra.connection.host",…
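
A minimal end-to-end sketch of that flow, assuming a hypothetical CSV of id,value lines on HDFS and a matching table keyspace.kv(id text, value text):

    import org.apache.spark.{SparkConf, SparkContext}
    import com.datastax.spark.connector._

    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("test")
      .set("spark.cassandra.connection.host", "127.0.0.1")
    val sc = new SparkContext(conf)

    // Parse each HDFS line into a tuple whose arity matches the target columns
    val rows = sc.textFile("hdfs:///data/input.csv")
      .map(_.split(","))
      .map(fields => (fields(0), fields(1)))

    // Column names must match the table schema exactly
    rows.saveToCassandra("keyspace", "kv", SomeColumns("id", "value"))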
-1 votes, 1 answer

Cassandra Batch InvalidQueryException - Batch too large

I am inserting data into Cassandra using a batch, and I get the exception below when I run the job. Caused by: com.datastax.driver.core.exceptions.InvalidQueryException: Batch too large at…
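
The server rejects any batch above batch_size_fail_threshold_in_kb (cassandra.yaml), so the usual fix is smaller batches, ideally scoped to a single partition. If the writes go through the connector rather than a hand-built driver Batch, the batch size is configurable; the values below are illustrative, not recommendations:

    val conf = new SparkConf()
      .set("spark.cassandra.connection.host", "127.0.0.1")
      // Cap the unlogged batches the connector builds internally
      .set("spark.cassandra.output.batch.size.bytes", "1024")
      // How many batches may be in flight per task
      .set("spark.cassandra.output.concurrent.writes", "5")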
-1 votes, 1 answer

Spark Cassandra Write Performance

I am trying to load around 2 million records into Cassandra through Spark. Spark has 4 executors and Cassandra has 4 nodes in the cluster, but it takes around 20 minutes to save all the data to Cassandra. Can anyone please help me make this a bit…
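
Write throughput is mostly governed by the connector's output settings and by how evenly rows spread across partitions; a hedged sketch of the knobs most often tuned (the numbers are starting points, not recommendations):

    val conf = new SparkConf()
      // More asynchronous writes in flight per task
      .set("spark.cassandra.output.concurrent.writes", "10")
      // Per-core throttle; raise it only if the cluster keeps up
      .set("spark.cassandra.output.throughput_mb_per_sec", "5")
      // Group rows of the same partition into each batch
      .set("spark.cassandra.output.batch.grouping.key", "partition")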
-2 votes, 2 answers

Running multiple cassandra queries and getting result in one RDD in Spark 2.3

I have a Sequence of Strings which I want to use in the where clause of my Cassandra queries. So there would be one query for each String in the Sequence. idSeq.foreach(id => { val rdd1 = sc.cassandraTable("keyspace", "columnfamily"). where("id…
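
Instead of one cassandraTable call per id (the foreach above builds a separate RDD on every iteration, and the results never leave the closure), the connector can push the whole sequence down as a join and return a single RDD; a sketch assuming id is the partition key:

    import com.datastax.spark.connector._

    val idSeq: Seq[String] = Seq("id1", "id2", "id3") // hypothetical ids

    // One RDD covering every id; each Tuple1 is matched against the partition key
    val joined = sc.parallelize(idSeq)
      .map(Tuple1(_))
      .joinWithCassandraTable("keyspace", "columnfamily")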
-2 votes, 1 answer

Scala task not serializable

I have the following code: case class event(imei: String, date: String, gpsdt: String, entrygpsdt: String, lastgpsdt: String) object recalculate extends Serializable { def main(args: Array[String]) { val conf = new SparkConf() …
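
Task not serializable almost always means a closure captured an object (often via an enclosing class's this) that cannot be shipped to executors. A minimal sketch of the usual cause and fix, with hypothetical names:

    import org.apache.spark.rdd.RDD

    class Helper {                        // note: not Serializable
      def tag(s: String): String = s + "!"
    }

    class Recalculate {                   // hypothetical driver-side class
      val helper = new Helper()

      // Fails: the closure captures `this.helper`, dragging the whole
      // non-serializable Recalculate instance into the task
      // def bad(rdd: RDD[String]) = rdd.map(s => helper.tag(s))

      // Works: build the non-serializable object on the executor instead
      def good(rdd: RDD[String]): RDD[String] =
        rdd.mapPartitions { iter =>
          val h = new Helper() // created per partition, never serialized
          iter.map(h.tag)
        }
    }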
-2 votes, 1 answer

How to insert DataSet ds into cassandra with Java API

Sample code needed for Spark Cassandra Connector 2.11-2.0.5; I am unable to insert a Dataset into a Cassandra DB directly.
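
With connector 2.0.x the Dataset writer goes through the org.apache.spark.sql.cassandra source; a Scala sketch (the Java Dataset API is line-for-line analogous), assuming the Dataset's column names already match a hypothetical ks.tbl table:

    // ds: an existing Dataset whose columns match the target table
    ds.write
      .format("org.apache.spark.sql.cassandra")
      .options(Map("keyspace" -> "ks", "table" -> "tbl"))
      .mode("append")
      .save()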
-2 votes, 2 answers

How to access SparkContext on executors to save DataFrame to Cassandra?

How can I use SparkContext (to create SparkSession or Cassandra Sessions) on executors? If I pass it as a parameter to the foreach or foreachPartition, then it will have a null value. Shall I create a new SparkContext in each executor? What I'm…
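
The SparkContext exists only on the driver, which is why it arrives as null inside foreach/foreachPartition, and creating a new one per executor is not supported. The connector's pattern for executor-side access is CassandraConnector, which is serializable and lends out pooled sessions; a sketch assuming an RDD[String] of ids and a hypothetical ks.tbl table:

    import com.datastax.spark.connector.cql.CassandraConnector

    val connector = CassandraConnector(sc.getConf) // created on the driver, serializable

    rdd.foreachPartition { ids =>
      // Runs on the executor; withSessionDo borrows a pooled session
      connector.withSessionDo { session =>
        ids.foreach { id =>
          session.execute("INSERT INTO ks.tbl (id) VALUES (?)", id)
        }
      }
    }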
-2 votes, 1 answer

Scala Spark Filter RDD using Cassandra

I am new to Spark-Cassandra and Scala. I have an existing RDD, say ((url_hash, url, created_timestamp)). I want to filter this RDD based on url_hash: if the url_hash exists in the Cassandra table, I want to filter it out of the RDD so I can…
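
leftJoinWithCassandraTable does this without pulling the table to the driver: hashes that exist in Cassandra come back as Some(row), the rest as None. A sketch, assuming url_hash is the partition key of a hypothetical keyspace.table:

    import com.datastax.spark.connector._

    // rdd: RDD[(String, String, Long)] of (url_hash, url, created_timestamp)
    val absentHashes = rdd
      .map { case (hash, _, _) => Tuple1(hash) }
      .leftJoinWithCassandraTable("keyspace", "table")
      .collect { case (Tuple1(hash), None) => hash } // no match in Cassandra

    // Keep only the original rows whose hash was not found
    val filtered = rdd
      .map { case t @ (hash, _, _) => (hash, t) }
      .join(absentHashes.map((_, ())))
      .map { case (_, (t, _)) => t }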
-3 votes, 2 answers

How to write to Cassandra using foreachBatch() in Java Spark?

I have the following code and I would like to write into Cassandra using Spark 2.4 Structured Streaming's foreachBatch: Dataset<Row> df = spark .readStream() .format("kafka") …
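
foreachBatch hands each micro-batch over as an ordinary Dataset, so the regular Cassandra batch writer applies inside it; a Scala sketch (the Java version is the same shape with a VoidFunction2), assuming connector 2.4 and a hypothetical ks.tbl target:

    import org.apache.spark.sql.DataFrame

    df.writeStream
      .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
        batchDF.write
          .format("org.apache.spark.sql.cassandra")
          .options(Map("keyspace" -> "ks", "table" -> "tbl"))
          .mode("append")
          .save()
      }
      .option("checkpointLocation", "/tmp/checkpoints/cass") // hypothetical path
      .start()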
-4 votes, 1 answer

Auto increment primary key when using saveToCassandra()

Is it possible to create an auto-increment primary key in a Cassandra table?
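
No: Cassandra deliberately has no auto-increment, since a global counter would require coordination across nodes. The idiomatic substitute is a uuid/timeuuid key, generated client-side or with CQL's uuid()/now(); a sketch assuming an RDD of values and a hypothetical ks.tbl(id uuid, value text):

    import com.datastax.spark.connector._
    import java.util.UUID

    // Mint the key on the Spark side before writing
    val withIds = values.map(v => (UUID.randomUUID(), v)) // values: RDD[String]
    withIds.saveToCassandra("ks", "tbl", SomeColumns("id", "value"))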
-5 votes, 1 answer

Cassandra's CqlInputFormat failed to built in Scala but worked for Java

My Spark Scala code is: val input = sc.newAPIHadoopRDD(jconf, classOf[CqlInputFormat], classOf[LongWritable], classOf[Row]) The CqlInputFormat class is implemented in Cassandra's source code. I tried converting the code to Java and it worked.…
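
The mismatch appears to come from CqlInputFormat's generic signature (it is keyed by java.lang.Long, while the Scala call passes classOf[LongWritable]), which Java's looser raw-type handling papers over. Unless the Hadoop InputFormat route is a hard requirement, the connector's native reader avoids the problem entirely; a sketch with hypothetical names:

    import com.datastax.spark.connector._

    // Same data through the connector instead of CqlInputFormat
    val input = sc.cassandraTable("ks", "tbl")
    input.take(10).foreach(println)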