9

I have a configuration file in JSON that contains a few variables as strings (always ASCII). These strings are decoded to unicode by default, but since I have to pass these variables to my Python C extensions, I need them as normal Python strings. At the moment I'm using str(unicode) to convert the JSON strings, but a more elegant and less verbose solution would be much appreciated.

Is there a way to change the default translation from string to unicode with a custom JSONDecoder or object hook?

SilentGhost
Adrian
  • Yes, a custom JSONDecoder should be able to skip the decoding from str to Unicode and return the raw binary string. – Lennart Regebro Jan 18 '11 at 13:57
  • @Lennart Regebro I tried to do that and failed: I had to copy-n-paste-extend a lot of classes and module private constants. Is there a simple way to just tweak JSONDecoder that I can't see? – TryPyPy Jan 19 '11 at 20:48
  • @TryPyPy: There is no such thing as module private constants in Python... – Lennart Regebro Jan 19 '11 at 20:53
  • OMG. I don't know what kind of brain rot got to me, but if you actually look at my code below (before I make it saner), I acted as if _var + __all__ were impenetrable. Wow, thanks for opening my eyes, I can't explain what happened here... – TryPyPy Jan 19 '11 at 21:12
  • :-) I looked at the code, and I have to agree it's fairly hard to override though... – Lennart Regebro Jan 19 '11 at 21:59
  • import json foo = "{'bar': 'baz'}" json.loads(foo, 'ascii') – lehins Mar 15 '13 at 07:16

2 Answers

1

Not unless you're willing to lose some speed. And if being somewhat slower is OK, consider that calling plain json.loads and then recursively converting the result to str is probably cheaper and maybe even faster.
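As a minimal sketch of that simpler route: decode normally, then walk the result and convert every text string with str(). The helper name to_native_str and the text_type alias are made up for illustration and are not part of the json module. The sketch assumes the strings are ASCII-only, as in the question; on Python 2, str() raises UnicodeEncodeError otherwise.

```python
import json
import sys

# text_type is the type json.loads uses for decoded strings:
# unicode on Python 2, str on Python 3 (where the conversion is a no-op).
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def to_native_str(obj):
    """Recursively convert text strings in a decoded JSON value to str."""
    if isinstance(obj, text_type):
        return str(obj)
    if isinstance(obj, list):
        return [to_native_str(item) for item in obj]
    if isinstance(obj, dict):
        return {to_native_str(k): to_native_str(v) for k, v in obj.items()}
    return obj  # numbers, booleans and None pass through unchanged


config = to_native_str(json.loads('{"name": "demo", "paths": ["a", "b"], "n": 3}'))
```

This touches every value a second time after decoding, so it costs an extra pass, but it does not depend on any json internals.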

With all that said, if you do want a loads that returns strings badly enough to accept extending code that wasn't meant to be extended, here's one possible result. (An earlier version of this answer extended the module mostly through copy-n-paste; that was asinine. Thanks to Lennart for making me see the light: you just need to extend JSONDecoder and use a couple of tricks.)

import json
from json import decoder, scanner

# C implementation of the string scanner (private module, CPython only)
from _json import scanstring as c_scanstring

_CONSTANTS = decoder._CONSTANTS

# Use the pure-Python scanner so that the decoder's parse_string is used
py_make_scanner = scanner.py_make_scanner

# Convert from unicode to str
def str_scanstring(*args, **kwargs):
    result = c_scanstring(*args, **kwargs)
    return str(result[0]), result[1]

# Little dirty trick here
json.decoder.scanstring = str_scanstring

class StrJSONDecoder(decoder.JSONDecoder):
    def __init__(self, encoding=None, object_hook=None, parse_float=None,
            parse_int=None, parse_constant=None, strict=True,
            object_pairs_hook=None):
        self.encoding = encoding
        self.object_hook = object_hook
        self.object_pairs_hook = object_pairs_hook
        self.parse_float = parse_float or float
        self.parse_int = parse_int or int
        self.parse_constant = parse_constant or _CONSTANTS.__getitem__
        self.strict = strict
        self.parse_object = decoder.JSONObject
        self.parse_array = decoder.JSONArray
        self.parse_string = str_scanstring
        self.scan_once = py_make_scanner(self)

# And another little dirty trick there    
_default_decoder = StrJSONDecoder(encoding=None, object_hook=None,
                               object_pairs_hook=None)

json._default_decoder = _default_decoder

# Round-trip a sample structure through the patched decoder (Python 2)
j = {1:'2', 1.1:[1,2,3], u'test': {12:12, 13:'o'}}
print(json.loads(json.dumps(j)))
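
For comparison, here is a less invasive sketch that leaves the json module untouched and relies only on the documented object_hook parameter of json.loads. The names str_hook, loads_str and _native are hypothetical helpers, not part of the library, and the sketch again assumes ASCII-only strings. The hook runs once per decoded JSON object, innermost first, so nested dicts are already converted by the time their parent is processed; lists and a non-object top-level value still need the small helper.

```python
import json
import sys

# unicode on Python 2, str on Python 3 (where the conversion is a no-op).
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def _native(value):
    """Convert a text string, or the strings inside a list, to str."""
    if isinstance(value, text_type):
        return str(value)
    if isinstance(value, list):
        return [_native(item) for item in value]
    return value  # dicts were already converted by str_hook


def str_hook(decoded_dict):
    """object_hook: called by the decoder for every JSON object."""
    return {_native(k): _native(v) for k, v in decoded_dict.items()}


def loads_str(text):
    # The outer _native call covers a top-level list or string.
    return _native(json.loads(text, object_hook=str_hook))


result = loads_str('{"test": {"12": 12, "13": "o"}, "1.1": [1, "2", ["3"]]}')
```

Compared with replacing json._default_decoder, this only affects calls that opt in through loads_str, at the cost of one extra Python-level call per JSON object.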
TryPyPy
  • Thanks for the detailed answer. I realise now that what I wanted is not supported for a reason, so I will stick to the str(unicode) solution. – Adrian Jan 18 '11 at 14:05
  • Sorry to have scared you, Lennart made me realize it's much easier to get what you wanted. – TryPyPy Jan 19 '11 at 22:00
0

See if the responses to this question help you (in that question, the asker was using simplejson).

Community
dusan