Convert numpy type to python

Question

I have a list of dicts in the following form that I generate from pandas. I want to convert it to JSON.

list_val = [{1.0: 685}, {2.0: 8}]
output = json.dumps(list_val)

However, json.dumps throws an error: TypeError: 685 is not JSON serializable

I am guessing it's a type conversion issue from numpy to python(?).

However, when I convert the values v of each dict in the array using np.int32(v) it still throws the error.

EDIT: Here's the full code

new = df[df[label] == label_new]
ks_dict = json.loads(content)
ks_list = ks_dict['variables']
freq_counts = []

for ks_var in ks_list:
    freq_var = dict()
    freq_var["name"] = ks_var["name"]
    ks_series = new[ks_var["name"]]
    temp_df = ks_series.value_counts().to_dict()
    freq_var["new"] = [{u: np.int32(v)} for (u, v) in temp_df.iteritems()]
    freq_counts.append(freq_var)

out = json.dumps(freq_counts)

your code works fine for me... (Python 3.4.2) - `[{"1.0": 685}, {"2.0": 8}]` — MattDMo, Nov 20 '14 at 21:44
Yes, it's generated from a DataFrame. I'll update the full code in the post — ubh, Nov 20 '14 at 21:49
So… is there a reason you're putting `np.int32(v)` instead of `v` (or `int(v)`; not sure what `v` is) in `freq_var`? — abarnert, Nov 20 '14 at 22:04
Also, when you have problems like this in the future, try looking at first the `repr` and then the `type` of each object, not just printing out their `str`. (And include the results in your question.) It's a lot easier to just know you have an `np.float32` or whatever than to have to guess that maybe there's some kind of type conversion issue. — abarnert, Nov 20 '14 at 22:05
I used int32 assuming that would resolve the TypeError. However, when I changed np.int32(v) to np.int(v), it worked. — ubh, Nov 24 '14 at 00:06
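
A note on that last comment: np.int was only an alias for the built-in int, and NumPy 1.24 and later no longer provide it, so calling int(v) directly has the same effect. The following is a minimal sketch, assuming the counts are NumPy integers as in the question. The contents of temp_df are a hypothetical stand-in for the result of value_counts().to_dict().

```python
import json
import numpy as np

# Hypothetical stand-in for ks_series.value_counts().to_dict():
# float keys mapped to NumPy integer counts.
temp_df = {1.0: np.int64(685), 2.0: np.int64(8)}

# int(v) produces a built-in int, which json knows how to serialize.
# dict.items() is the Python 3 spelling of iteritems().
new = [{u: int(v)} for u, v in temp_df.items()]

out = json.dumps(new)
print(out)
```

Note that json.dumps turns the float keys into strings, which matches the output shown in the first comment.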

score 109 · Accepted Answer · edited Feb 21 '18 at 08:00

It looks like you're correct:

>>> import numpy
>>> import json
>>> json.dumps(numpy.int32(685))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/json/__init__.py", line 243, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
    return _iterencode(o, 0)
  File "/usr/lib/python2.7/json/encoder.py", line 184, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: 685 is not JSON serializable

The unfortunate thing here is that a numpy number's __repr__ doesn't give you any hint about what type it is. They're running around masquerading as ints when they aren't (gasp). Ultimately, it looks like json is telling you that an int isn't serializable, but really, it's telling you that this particular np.int32 (or whatever type you actually have) isn't serializable. (No real surprise there -- no np.int32 is serializable.) This is also why the dict that you inevitably printed before passing it to json.dumps looks like it just has integers in it as well.
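
A quick way to see the masquerade is to check the type rather than the printed value. This is a small sketch, assuming Python 3 and an installed numpy. Newer Python and NumPy versions are more helpful than the traceback above (the error message and the repr may name the type), but the check itself stays the same.

```python
import json
import numpy as np

v = np.int32(685)

print(str(v))    # looks exactly like a plain int when printed
print(type(v))   # the type reveals that it is a NumPy scalar

# np.int32 does not inherit from the built-in int on Python 3.
is_builtin_int = isinstance(v, int)

# json.dumps rejects the value even though it prints like an int.
try:
    json.dumps(v)
    raised = False
except TypeError:
    raised = True

print(is_builtin_int, raised)
```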

The easiest workaround here is probably to write your own serializer¹:

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, numpy.integer):
            return int(obj)
        elif isinstance(obj, numpy.floating):
            return float(obj)
        elif isinstance(obj, numpy.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

You use it like this:

json.dumps(numpy.float32(1.2), cls=MyEncoder)
json.dumps(numpy.arange(12), cls=MyEncoder)
json.dumps({'a': numpy.int32(42)}, cls=MyEncoder)

etc.

¹ Or you could just write the default function and pass that as the default keyword argument to json.dumps. In this scenario, you'd replace the last line with raise TypeError, but ... meh. The class is more extensible :-)
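
The function-based variant from the footnote could look roughly like this. It is only a sketch: the name numpy_default is made up for illustration, and the branches simply mirror MyEncoder above.

```python
import json
import numpy


def numpy_default(obj):
    """Fallback for json.dumps: convert NumPy values to built-in types."""
    if isinstance(obj, numpy.integer):
        return int(obj)
    if isinstance(obj, numpy.floating):
        return float(obj)
    if isinstance(obj, numpy.ndarray):
        return obj.tolist()
    # Anything else is still unsupported, so signal that to json.
    raise TypeError("%r is not JSON serializable" % (obj,))


result = json.dumps(
    {"a": numpy.int32(42), "b": numpy.arange(3)},
    default=numpy_default,
)
print(result)
```

json only calls the default function for objects it cannot serialize on its own, so built-in ints, floats, and lists never reach it.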

For real fun, try this with `np.float64` or `np.bool` and everything works fine, because they're actually subclasses of `float` and `bool`. Once you think about it, it makes sense why those two types are subclasses but none of the other numeric types are, but until you do, it can make for some real fun debugging… — abarnert, Nov 20 '14 at 22:01
@abarnert -- `np.float64` _is_ obvious (after all, it's just C's `double` which is what python uses for `float`), but `np.bool` isn't so much. It could have been a subclass of `np.int32` I would think... Looking at the `__mro__` of `np.int64`, I would expect that one to work too -- at least on python2.x :-). — mgilson, Nov 20 '14 at 22:03
Or you could convert the numpy type to a native type. See http://stackoverflow.com/a/11389998/2486302 for details on how to do this. — Mack, Feb 06 '16 at 15:29
Or you can use the `json_tricks` library which does this by default (disclaimer: I'm the main contributor). — Mark, Sep 19 '17 at 18:15

score 4 · Answer 2 · edited Sep 19 '17 at 18:16

You could also convert the array to a python list (use the tolist method) and then convert the list to json.
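
As a minimal sketch of that approach (assuming numpy is installed): tolist converts the elements to built-in Python types, and NumPy scalars have an analogous item method.

```python
import json
import numpy as np

arr = np.arange(5)

# tolist() returns a plain list whose elements are built-in ints.
as_list = arr.tolist()
encoded = json.dumps(as_list)
print(encoded)

# For a single NumPy scalar, item() returns the built-in equivalent.
scalar = np.int64(100).item()
print(type(scalar))
```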

answered Aug 02 '15 at 08:44 by Emanuele Paolini; edited Sep 19 '17 at 18:16 by Mark

score 2 · Answer 3 · answered Jul 25 '19 at 06:24

You can use our fork of ujson, caiyunapp/ultrajson (an ultra fast JSON decoder and encoder written in C with Python bindings and NumPy bindings), to deal with NumPy int64.

pip install nujson

Then

>>> import numpy as np
>>> import nujson as ujson
>>> a = {"a": np.int64(100)}
>>> ujson.dumps(a)
'{"a":100}'
>>> a["b"] = np.float64(10.9)
>>> ujson.dumps(a)
'{"a":100,"b":10.9}'
>>> a["c"] = np.str_("12")
>>> ujson.dumps(a)
'{"a":100,"b":10.9,"c":"12"}'
>>> a["d"] = np.array(list(range(10)))
>>> ujson.dumps(a)
'{"a":100,"b":10.9,"c":"12","d":[0,1,2,3,4,5,6,7,8,9]}'
>>> a["e"] = np.repeat(3.9, 4)
>>> ujson.dumps(a)
'{"a":100,"b":10.9,"c":"12","d":[0,1,2,3,4,5,6,7,8,9],"e":[3.9,3.9,3.9,3.9]}'

score 1 · Answer 4 · answered Nov 21 '14 at 01:40

If you leave the data in any of the pandas objects, the library supplies a to_json function on Series, DataFrame, and all of the other higher dimension cousins.

See Series.to_json()
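
A rough sketch of what that could look like for the counts in the question, assuming pandas is installed. The ks_series data here is hypothetical and stands in for a column of the DataFrame.

```python
import json
import pandas as pd

# Hypothetical column data standing in for new[ks_var["name"]].
ks_series = pd.Series([1.0, 1.0, 2.0])

# value_counts() returns a Series, so it can be serialized directly
# without converting to a dict of NumPy integers first.
counts = ks_series.value_counts()
out = counts.to_json()
print(out)

# The result is ordinary JSON text that json.loads can read back.
decoded = json.loads(out)
```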

answered Nov 21 '14 at 01:40 by mobiusklein

This wouldn't work as Series.to_json() still can't handle numpy.ndarrays – nivniv Jul 03 '17 at 13:44
After a lot of head-banging and resistance to create a custom function or class for a seemingly straightforward problem, this worked for me!! I like it because it keeps things simple! – vk1011 Aug 22 '17 at 23:37
