
<script type="text/javascript" charset="UTF-8"> doesn't seem to do the work :(

I'll start from the top, because the solution I am looking for might not be the one I'm asking for.

So I'm working on a simple CRM. One of its 'simple' features is to register a request from a customer who is not yet in the Customers database: if the CRM can't find a customer with a name matching the one provided, a new customer is registered. This very popular feature leads to a lot of Customer entities in which every attribute except the company name is empty.

Later on, users do add contact attributes, but the workflow is quite slow and boring. To speed it up I wrote a Python function that looks up a customer by its name in the local yellow pages, parses the HTML and turns it into a dict(). It takes quite some time, so I put it on a separate query page that is read by JavaScript upon request.

The query page renders the dictionary as a string like this: [{'name': '"AG \xc4\x80bolti\xc5\x86a b\xc5\xabvuz\xc5\x86\xc4\x93mums", SIA', 'phone': '67244222'}] Notice the literal UTF-8 byte escapes.

$.get('/zl?name=AG SIA ĀBOLTIŅA BŪVUZŅĒMUMS', function(data){... retrieves the string exactly as it looks above. To turn the retrieved data string into an object I use result = eval(data)[0]. So now $('#zl').html(result.name) puts "AG ÄboltiÅa bÅ«vuzÅÄmums", SIA on the page, not "ĀBOLTIŅA BŪVUZŅĒMUMS" as it should.

I have <meta http-equiv="content-type" content="text/html; charset=UTF-8"/> in the page header, and the script begins with <script type="text/javascript" charset="UTF-8">, but nothing helps.

So, my questions:

  1. How do I set the proper encoding for the JS output? Is there some trick method like string.toProperUTF-8()?
  2. Is there a way to .decode(UTF-8) a Python dictionary without causing a server error?
  3. Is there a less problematic way to do it?
Bibhas Debnath
Edmund Sulzanok
  • I think the `Content-Type` HTTP header is more relevant than the `charset` attribute of the script tag. – Bergi Mar 26 '13 at 15:02
  • Why not use JSON for this? JSON is a clear, defined standard that includes what encoding is acceptable, making it easy for implementors to get that part *correct*. Python has a `json` module, making it easy for you to produce correct JSON with Unicode data. – Martijn Pieters Mar 26 '13 at 15:03
  • @MartijnPieters: It seems that the OP is using JSON already, only he can't get the encoding of the JSON string correctly. – Bergi Mar 26 '13 at 15:10
  • @Bergi: I meant for the OP to use standard JSON tools *on both sides* here. Clearly, with brice's answer accepted below, that was the correct course of action here. – Martijn Pieters Mar 26 '13 at 15:19
  • @Bergi what is OP? And no, I wasn't using JSON – Edmund Sulzanok Mar 26 '13 at 15:25
  • @Edmund: "OP" is "Original Poster", i.e. you – Bergi Mar 26 '13 at 15:30

1 Answer


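This will not work because JavaScript and Python have different character escape semantics. In the Python output each \xNN escape stands for one byte of a UTF-8 sequence, whereas JavaScript reads each \xNN as a complete character in the range U+0000 to U+00FF, so every multi-byte character turns into two wrong ones.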
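The mismatch can be reproduced in Python alone. This is only a sketch: it assumes Python 3 and uses a Latin-1 decode to imitate how JavaScript reads \xNN escapes, since both map each value to the code point with the same number.

```python
# The customer name as text, and as the UTF-8 bytes the query page prints.
name = "Āboltiņa būvuzņēmums"
utf8_bytes = name.encode("utf-8")

# JavaScript reads each \xNN escape as one character (U+0000 to U+00FF),
# which is equivalent to decoding the bytes as Latin-1.
as_javascript_sees_it = utf8_bytes.decode("latin-1")

# Reading the same bytes as UTF-8 restores the original text.
as_intended = utf8_bytes.decode("utf-8")

print(repr(as_javascript_sees_it))
print(as_intended)
```

The first print shows the same kind of garbage as in the question (it starts with "Ä"), while the second one shows the name as intended.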

If I get this right, you're calling eval() on the response to turn it into an object once it reaches your JavaScript.

Please don't do this.

Instead, use Python's built-in json module to reply to requests with properly formatted JSON, and then use JSON.parse() in your JavaScript to read the reply. This should ensure that the parsing occurs correctly.

A quick example, starting with the Python:

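import json

# lots of code...

# `route` stands in for your framework's routing decorator.
@route("/my_page")
def my_page(request):
    # json.dumps() escapes non-ASCII characters as \uXXXX by default.
    return json.dumps({
        "siblings": ["Eenie", "Meanie", "Meinie", "Mo"],
        "ages": [1, 2, 3, 4],
    })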

# lots more code...

Now, called from JavaScript:

$.getJSON('/my_page', function(data, textStatus, jqXHR) {
  /* Do stuff with the object */
});

Note how I use jQuery's built-in getJSON() to get the object. You could use a plain AJAX request and JSON.parse() too.
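Applied to the record from the question, the server side might look roughly like the sketch below. It assumes Python 3 and assumes the scraper hands back UTF-8 byte strings; raw_record and decode_record are hypothetical names made up for this illustration.

```python
import json

# Hypothetical lookup result: the scraper returned UTF-8 byte strings.
raw_record = {
    "name": b'"AG \xc4\x80bolti\xc5\x86a b\xc5\xabvuz\xc5\x86\xc4\x93mums", SIA',
    "phone": b"67244222",
}


def decode_record(record, encoding="utf-8"):
    """Return a copy of the record with byte-string values decoded to text."""
    return {
        key: value.decode(encoding) if isinstance(value, bytes) else value
        for key, value in record.items()
    }


record = decode_record(raw_record)
payload = json.dumps([record])

# With the default ensure_ascii=True, non-ASCII characters are written as
# \uXXXX escapes, so the payload itself contains only ASCII characters.
restored = json.loads(payload)[0]
print(restored["name"])
```

Decoding the values before serializing is what answers question 2: the dictionary itself has no encoding, only the byte strings inside it do. On the browser side, JSON.parse() (or getJSON()) understands the \uXXXX escapes and gives back the original characters.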

Why eval() is a bad idea

Using eval() instead of JSON.parse() opens your code to a large number of security vulnerabilities. If a malicious user (or a distracted developer) can place JavaScript code in the response, that code has free rein over your page, which compromises your users' security and can cause some very hard-to-find bugs. Production pages should not use eval() under any circumstances, in particular when the data being eval'ed is untrusted, as it is when it comes from the network.
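The difference between a data parser and eval() is the same in any language. As a rough Python analogue (not the JavaScript from the question), the sketch below feeds a string that is code rather than data to json.loads(); the payload is a made-up example.

```python
import json

# A payload that is code rather than data; eval() would execute it.
suspicious_payload = "__import__('os').getcwd()"

try:
    json.loads(suspicious_payload)
    accepted = True
except json.JSONDecodeError:
    # The parser only accepts JSON syntax, so the code is never run.
    accepted = False

print(accepted)
```

JSON.parse() in the browser behaves in the same spirit: it throws a SyntaxError on anything that is not valid JSON instead of running it.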

A random note on MIME types

While this won't cause problems in most cases, you should ensure that your Python code returns the correct MIME type for JSON, which is application/json. Getting this wrong could cause weird problems that are tricky to debug. How to do this will depend on the Python framework you are using.
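Since the framework isn't named in the question, here is a framework-neutral sketch as a bare WSGI application. The name zl_app and the hard-coded record are assumptions for illustration only; a real handler would call the lookup function instead.

```python
import json


def zl_app(environ, start_response):
    """Minimal WSGI app (hypothetical) that answers with UTF-8 encoded JSON."""
    body = json.dumps(
        [{"name": '"AG Āboltiņa būvuzņēmums", SIA', "phone": "67244222"}],
        ensure_ascii=False,  # keep the characters as-is, encoded as UTF-8
    ).encode("utf-8")
    headers = [
        # Declare both the MIME type and the charset of the body.
        ("Content-Type", "application/json; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

For a quick local test it can be served with the standard library, e.g. wsgiref.simple_server.make_server("", 8000, zl_app).serve_forever(). Most frameworks offer a shortcut that sets this header for you.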

brice