
I have pandas DataFrames which I convert to Spark DataFrames. The problem is that I do not know the schema of those pandas DataFrames in advance; it can be any DataFrame. So it is possible that some of those pandas DataFrames contain columns of types like numpy.float64, for example. An automatic conversion to a native Python type is not possible:

not supported type: <type 'numpy.float64'>

So before I convert my pandas DataFrames to Spark DataFrames, I have to make sure that all unsupported types are manually converted to the nearest equivalent. Other examples of NumPy dtypes that have no native Python equivalent are:

- numpy.float32
- numpy.float64
- numpy.uint32
- numpy.int16
- float128
- longfloat
- clongdouble
- clongfloat
- etc.

So I need a function that converts all those dtypes to their nearest equivalents, without my having to know the structure or the dtypes of my pandas DataFrames.

How can I do that?
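One possible approach, sketched below under the assumption that casting every numeric column to the widest Spark-supported dtype (float64 / int64) is acceptable: inspect each column's dtype kind and cast it, without needing to know the schema up front. The helper name `normalize_dtypes` is my own, not from any library.

```python
import numpy as np
import pandas as pd

def normalize_dtypes(df):
    """Cast every column to the nearest dtype that Spark's
    createDataFrame is likely to accept (hypothetical helper):

    - any float flavour (float32, longdouble, ...) -> float64
    - any signed or unsigned integer flavour       -> int64
    - everything else is left untouched
    """
    out = df.copy()
    for col in out.columns:
        kind = out[col].dtype.kind  # 'f' = float, 'i' = int, 'u' = unsigned int
        if kind == "f":
            out[col] = out[col].astype(np.float64)
        elif kind in ("i", "u"):
            out[col] = out[col].astype(np.int64)
    return out

df = pd.DataFrame({
    "a": np.array([1.5, 2.5], dtype=np.float32),
    "b": np.array([1, 2], dtype=np.uint32),
})
clean = normalize_dtypes(df)
print(clean.dtypes)  # columns are now float64 and int64
```

Note that widening uint32 to int64 is lossless, but narrowing float128/longdouble to float64 would lose precision; whether that is acceptable depends on your data.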

Mulgard
    Does this help? https://stackoverflow.com/questions/9452775/converting-numpy-dtypes-to-native-python-types – cs95 Oct 30 '17 at 11:41
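The linked question suggests the NumPy scalar method `.item()`, which converts an individual NumPy scalar to its nearest native Python type; a minimal illustration:

```python
import numpy as np

x = np.float64(3.14)  # a NumPy scalar of the kind Spark rejects
y = x.item()          # nearest native Python equivalent
print(type(y))        # <class 'float'>
```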

0 Answers