I have a SparkContext sc with a highly customised SparkConf(). How do I use that SparkContext to create a SparkSession? I found this post: https://stackoverflow.com/a/53633430/201657 that shows how to do it using Scala:
val spark = SparkSession.builder.config(sc.getConf).getOrCreate()
but when I try to apply the same technique using PySpark:
from pyspark.sql import SparkSession
spark = SparkSession.builder.config(sc.getConf()).enableHiveSupport().getOrCreate()
it fails with the error:
AttributeError: 'SparkConf' object has no attribute '_get_object_id'
As I said, I want my SparkSession to use the same SparkConf as the SparkContext. How do I do that?
UPDATE
I've done a bit of fiddling about:
from pyspark.sql import SparkSession
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
sc.getConf().getAll() == spark.sparkContext.getConf().getAll()
returns
True
so the SparkContext and the SparkSession have the same SparkConf. My assumption from this is that SparkSession.builder.getOrCreate() will use an existing SparkContext if one exists. Am I correct?