
Some string columns in my Cassandra table contain '\n', for example:

(id,name,age) values(1,'abc\nxyz',28)

I am using Spark to write the rows to a CSV file, but Spark treats the '\n' character as a line break:

val cass= spark.read.format("org.apache.spark.sql.cassandra").option("keyspace","mykeyspace").option("table","mytable").load

cass.write.csv("abc.csv")

id|name|age
1|abc
xyz|28
2|gfgdd|32 

Is there any way to drop '\n' or replace it with a space while writing, so that the output becomes:

id|name|age
1|abcxyz or abc xyz|28
2|gfgdd|32
Alex Ott
Vicky

1 Answer


Use Spark's regexp_replace function (from org.apache.spark.sql.functions) to replace the newline characters with a space before writing, as shown below:

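import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, regexp_replace}

object ReplaceNextLine {

  def main(args: Array[String]): Unit = {

    // Local session for the demo; the original used a Constant.getSparkSess helper.
    val spark = SparkSession.builder().master("local[*]").appName("ReplaceNextLine").getOrCreate()

    import spark.implicits._
    // Sample row whose name contains an embedded newline.
    val df = List((1, "anc\nxyz", 28)).toDF("id", "name", "age")
      // Replace each newline in "name" with a single space.
      .withColumn("name", regexp_replace(col("name"), "\n", " "))

    df.show()

  }

}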
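
To tie this back to the question, here is a rough sketch of the same replacement applied to the DataFrame read from Cassandra, just before the CSV write. The keyspace, table, and output path come from the question. The `stripLineBreaks` helper, the `CassandraToCsv` object name, the `[\r\n]+` pattern (which also covers Windows-style line endings), and the `header` and `sep` options are my own assumptions, added to match the pipe-delimited output shown in the question. It also assumes the Spark Cassandra Connector is on the classpath and the connection host is already configured.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, regexp_replace}
import org.apache.spark.sql.types.StringType

object CassandraToCsv {

  // Assumed pattern: one or more consecutive CR/LF characters.
  val LineBreaks = "[\\r\\n]+"

  // Hypothetical helper: replace line breaks with a single space
  // in every string column, leaving other columns untouched.
  def stripLineBreaks(df: DataFrame): DataFrame =
    df.schema.fields
      .filter(_.dataType == StringType)
      .foldLeft(df) { (acc, field) =>
        acc.withColumn(field.name, regexp_replace(col(field.name), LineBreaks, " "))
      }

  def main(args: Array[String]): Unit = {
    // Assumes spark.cassandra.connection.host is set via spark-submit or config.
    val spark = SparkSession.builder().appName("CassandraToCsv").getOrCreate()

    val cass = spark.read
      .format("org.apache.spark.sql.cassandra")
      .option("keyspace", "mykeyspace")
      .option("table", "mytable")
      .load()

    // header and sep are assumptions chosen to mirror the question's output.
    stripLineBreaks(cass).write
      .option("header", "true")
      .option("sep", "|")
      .csv("abc.csv")

    spark.stop()
  }
}
```

Note that Spark writes "abc.csv" as a directory of part files rather than a single file. If only the name column can contain newlines, calling regexp_replace on that one column, as in the answer above, is enough.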

QuickSilver