I am looking for a way to handle data type conversion dynamically with Spark DataFrames. I load data into a DataFrame using a Hive SQL query and then write it out to a Parquet file. Hive is unable to read some of the resulting data types, so I want to convert the decimal columns to Double. Is there a way to handle this dynamically instead of specifying each column name separately? Say my DataFrame has 50 columns, 8 of which are decimals, and I need to convert all 8 of them to Double without naming the columns. Can I do that directly?

Srinivas Bandaru

2 Answers


There is no single built-in call to convert the data types, but here are some ways to do it:

Cast those columns in the Hive query itself.
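A minimal sketch of that first approach; the table, column names, and output path are hypothetical, and `spark` is assumed to be an existing SparkSession with Hive support enabled (as in spark-shell):

```scala
// Cast the decimal columns to double directly in the Hive query
// (table, columns, and path are hypothetical examples).
val df = spark.sql(
  """SELECT id,
    |       CAST(amount AS DOUBLE) AS amount,
    |       CAST(tax AS DOUBLE) AS tax
    |  FROM my_db.my_table""".stripMargin)

df.write.parquet("/tmp/output_parquet")
```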

or

Create a case class with the data types you require, populate it with the query result, and use it to generate the Parquet file.
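A sketch of the case class approach; `Sale` and its fields are hypothetical, and the decimals are cast inside the query so the types line up with the case class:

```scala
// Map the query result onto a case class whose fields are already
// Double (names here are made up for illustration). Assumes an
// existing SparkSession named `spark`, as in spark-shell.
case class Sale(id: Long, amount: Double, tax: Double)

import spark.implicits._

val ds = spark.sql(
  "SELECT id, CAST(amount AS DOUBLE) AS amount, CAST(tax AS DOUBLE) AS tax FROM my_db.my_table"
).as[Sale]

ds.write.parquet("/tmp/sales_parquet")  // hypothetical output path
```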

or

Read the data types from the DataFrame's schema (the Hive query metadata) and use dynamic code to apply the casts, achieving option one or two without hard-coding column names.
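A sketch of that dynamic variant, which is what the question asks for: inspect the schema and cast every DecimalType column to Double without naming columns by hand. The query and output path are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DecimalType, DoubleType}

val spark = SparkSession.builder()
  .appName("DecimalToDouble")
  .enableHiveSupport()
  .getOrCreate()

val df = spark.sql("SELECT * FROM my_db.my_table")  // hypothetical query

// Fold over the schema, replacing every decimal column with a
// double cast; all other columns pass through untouched.
val converted = df.schema.fields.foldLeft(df) { (acc, field) =>
  field.dataType match {
    case _: DecimalType => acc.withColumn(field.name, col(field.name).cast(DoubleType))
    case _              => acc
  }
}

converted.write.parquet("/tmp/output_parquet")  // hypothetical path
```

Because the cast list is derived from the schema rather than hard-coded, this scales to any number of decimal columns.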

sandeep rawat

There are two options:
1. Use the schema from the DataFrame to dynamically generate the query statement (see the sketch below)
2. Use the create table...select * option with Spark SQL
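A minimal sketch of the first option, assuming `df` already holds the Hive query result: build a select list from the schema, casting decimals to double, then run it with selectExpr. The output path is hypothetical:

```scala
import org.apache.spark.sql.types.DecimalType

// Build the select expressions from the schema: decimal columns get
// a CAST to DOUBLE, everything else is passed through as-is.
val selectExprs = df.schema.fields.map { f =>
  f.dataType match {
    case _: DecimalType => s"CAST(`${f.name}` AS DOUBLE) AS `${f.name}`"
    case _              => s"`${f.name}`"
  }
}

val converted = df.selectExpr(selectExprs: _*)
converted.write.parquet("/tmp/output_parquet")
```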

This has already been answered, and this post has the details, with code.

ganeiy