I have a sample Python class:

    import datetime

    class bean:
        def __init__(self, puid, pogid, bucketId, dt, at):
            self.puid = puid
            self.pogid = pogid
            self.bucketId = bucketId
            # Age of the record in whole days, relative to today.
            self.dt = (datetime.datetime.today() - datetime.datetime.strptime(dt, "%Y-%m-%d %H:%M:%S")).days
            self.absdt = dt
            self.at = at

Now, I know that in Java, to make a class serializable, we just have to implement Serializable and override a few methods, and life is simple. Although Python is so simple, I can't find a way to serialize objects of this class.
This class needs to be serializable over the network because its objects go to Apache Spark, which distributes them over the network.
What is the best way to do that?
I also found this, but I don't know if it is the best way to do it.
I also read:
Classes, functions, and methods cannot be pickled -- if you pickle an object, the object's class is not pickled, just a string that identifies what class it belongs to.
So does that mean instances of those classes can't be serialized?
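To make the quoted passage concrete, here is a minimal sketch of what it seems to imply: an instance can be pickled and restored, while the class itself is stored only as a reference by name. It assumes the class is defined at module top level, so the receiving side can import it. The `Bean` name and the sample values are illustrative and not taken from the real data.

```python
import datetime
import pickle


class Bean(object):
    """Stand-in for the class above; defined at module top level so pickle can find it by name."""

    def __init__(self, puid, pogid, bucketId, dt, at):
        self.puid = puid
        self.pogid = pogid
        self.bucketId = bucketId
        parsed = datetime.datetime.strptime(dt, "%Y-%m-%d %H:%M:%S")
        self.dt = (datetime.datetime.today() - parsed).days
        self.absdt = dt
        self.at = at


# Made-up sample values, purely for illustration.
original = Bean(1, 2, 3, "2015-01-01 00:00:00", "view")

# The payload carries the instance's attributes plus the class's name,
# not the class's code.
payload = pickle.dumps(original, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(payload)

print(restored.__dict__ == original.__dict__)
```

If this reading is right, the limitation in the quote concerns shipping the class definition itself, not its instances.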
PS: There will be millions of objects of this class, as the data is huge. So please provide two solutions: the easiest one and the most efficient one.
EDIT:
For clarification, I have to use it something like this:

    def myfun(record):  # map() passes each RDD element as the argument
        # ... some logic ...
        t1 = bean(<params>)
        t2 = bean(<params2>)
        temp = list()
        temp.append(t1)
        temp.append(t2)
        return temp

This is how it is finally called:

    PairRDD.map(myfun).collect()

which throws the exception:

    <function __init__ at 0x7f3549853c80> is not JSON serializable
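
For reference, a message of this shape is what the standard `json` module raises when it is asked to encode a function object, independent of Spark. Whether that is what happens inside the `collect()` call above is only an assumption. The sketch below uses a hypothetical `Record` class with made-up values to show the failure and one possible workaround: returning plain dicts of built-in types instead of objects.

```python
import json


class Record(object):
    """Hypothetical stand-in for a small data-holding class."""

    def __init__(self, puid, pogid):
        self.puid = puid
        self.pogid = pogid


def to_plain(obj):
    """Return a copy of the instance attributes as a plain dict."""
    return dict(vars(obj))


# Encoding a function object fails: json only handles built-in data types.
try:
    json.dumps(Record.__init__)
    failed = False
except TypeError as exc:
    failed = True
    print("TypeError:", exc)

# Encoding the attribute dict works.
encoded = json.dumps(to_plain(Record(1, 2)), sort_keys=True)
print(encoded)
```

Plain dicts or tuples of built-in types also tend to be cheaper to ship than full objects, which may matter with millions of records.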