I have a big DataFrame and am trying to save it as a Hive table using the following command.

df.write.options(Map("path" -> "/workspace/ny/df")).saveAsTable("db_name.table_name") 

I'm allocating resources statically, but saveAsTable fails with an out-of-memory error. I read in one of the answers to this post that saveAsTable behaves like persisting the data in memory. Is that correct? If it is, how should I create external tables?

My other question: even when my table is not big, saveAsTable takes a long time, and I read here that this is because it updates some statistics. Is that also correct? Which statistics are being calculated, and how can I turn them off?

P.S. My tables are in Parquet.

HHH

0 Answers