I have a large DataFrame and am trying to save it as a Hive table using the following command:
df.write.options(Map("path" -> "/workspace/ny/df")).saveAsTable("db_name.table_name")
I'm allocating resources statically, but saveAsTable fails with an out-of-memory error. I read in one of the answers to this post that saveAsTable is like persisting into memory. Is that correct? If so, how can I create external tables then?
My other question: even when my table is not big, saveAsTable takes a long time, and I read here that this is because it updates some stats. Is that also correct? What sorts of stats are being calculated, and how can I turn them off?
P.S. My tables are in Parquet.