25

Say I have:

app.get('/json', function(req, res) {
    res.set({
        'content-type': 'application/json'
    }).send('{"status": "0"}');
});

I'm trying to send the response as UTF-8 with the following, without success:

app.get('/json', function(req, res) {
    // From Node.js Official Doc
    // http://nodejs.org/api/http.html#http_http_request_options_callback
    res.setEncoding('utf8');

    res.set({
        'content-type': 'application/json'
    }).send('{"status": "0"}');
});

What is the correct way to set character encoding in Express?

gsklee

5 Answers

32

You will probably want to explicitly add a charset to the end of your content-type string if you find it's not being set already by Express:

res.set({ 'content-type': 'application/json; charset=utf-8' });

The charset is not always set automatically, and it needs to be set for the response to work correctly everywhere (i.e. with all browsers and all Ajax libraries); otherwise you can run into encoding bugs.

In Express 4.x specifically, I've found that calling res.json(someObject) normally returns content-type: application/json; charset=utf-8 automatically, but depending on the object you are trying to return, not always.

When calling res.json() on some objects it can return content-type: application/json (i.e. without the charset encoding!). I'm not actually sure what triggers this, other than it's something about the specific object being returned.

I've only noticed it because of automated tests which explicitly checked the headers and found it was missing the charset declaration on some responses (even though the content-type was still application/json).
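If you want to guarantee the charset is present regardless of what Express decides, one option is a small helper that appends it only when missing. This is a minimal sketch: `withCharset` is a hypothetical name, not part of Express, and the commented route assumes `app` is an Express application.

```javascript
// Hypothetical helper (not an Express API): append a charset parameter
// to a Content-Type value unless one is already present.
function withCharset(contentType, charset = 'utf-8') {
    // Leave the value untouched if it already declares a charset.
    if (/;\s*charset\s*=/i.test(contentType)) {
        return contentType;
    }
    return contentType + '; charset=' + charset;
}

// Usage inside an Express route (assumes `app` is an Express application):
// app.get('/json', function(req, res) {
//     res.set('Content-Type', withCharset('application/json'));
//     res.send('{"status": "0"}');
// });
```

Checking for an existing charset first avoids producing a header with two charset parameters when the value has already been set elsewhere.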

blue112
Iain Collins
  • 3
    "charset" should only be used on text/* resources. "application/json" is UTF-8 by definition; there's no need to specify it. – Rich Remer Jul 19 '15 at 08:09
  • 1
    @RichRemer According to the RFC `application/json` _should_ be always UTF (specifically UTF-8 by default) and _shouldn't_ have a charset property, but in practice if you don't set it many consumers will mangle the result [including some browsers](http://stackoverflow.com/questions/25267649/why-do-some-browsers-seem-to-require-a-utf-8-charset-on-json-data-for-display), which is why it's a common pattern. – Iain Collins Jul 19 '15 at 23:13
  • 2
    Even if there are clients which choke on that, you're introducing a problem to conforming clients because they shouldn't expect to strip the charset from an application/* media type. Better to stop using broken clients than to break all the working ones. – Rich Remer Jul 20 '15 at 02:02
  • 2
    @RichRemer Adding this information doesn't cause a problem for conforming clients. This is explicitly stated in RFC 7159 (and quoted in the answer linked to above). – Iain Collins Jul 20 '15 at 11:43
  • You can try removing charset with `res.set({ 'content-type': 'application/json' });` but expressjs will add it again even if you're not sending json. – Marc Oct 01 '20 at 09:42
13

Use res.charset: http://expressjs.com/api.html#res.charset

res.charset = 'value';
res.send('some html');
// => Content-Type: text/html; charset=value

However, JSON is UTF-8 by default, so you don't need to set anything.

Dan Kohn
  • 3
    This has changed for Express 4. See here: https://github.com/visionmedia/express/wiki/Migrating%20from%203.x%20to%204.x#rescharset – Deiwin Apr 19 '14 at 16:08
  • 2
    The web-browser will not necessarily interpret the JSON as UTF-8 when you view it as text. It can be crazy confusing while you're debugging your app. Setting res.charSet is still a good idea. – cleong Mar 05 '15 at 11:47
  • 1
    Agreed with @Deiwin. It's necessary to specify the charset. It won't be interpreted as UTF-8 by default. Solved my problem in my case. – Saeger Mar 23 '15 at 18:17
3

This worked for me:

res.writeHead(200, {'Content-Type': 'text/html; charset=utf-8'});
Siva
2

I had a similar issue: I'm collecting Swedish characters from a database and outputting them as a JSON object. Node doesn't care whether JSON must be UTF-8 when the characters coming from the database aren't in UTF-8, so the claim that "you don't need to set anything" is false, depending on which charsets you are working with.
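To illustrate the kind of mismatch described here, this is a rough sketch that assumes the database hands back Latin-1 (ISO-8859-1) bytes. The answer doesn't name the actual encoding, so that part is an assumption, as are the variable names. It uses only Node's built-in Buffer.

```javascript
// Assumption: bytes arrive from a Latin-1 (ISO-8859-1) column. These spell "åäö".
const latin1Bytes = Buffer.from([0xe5, 0xe4, 0xf6]);

// Decode with the encoding the data was actually stored in...
const text = latin1Bytes.toString('latin1');

// ...then serialize. A string written to the response is encoded as UTF-8.
const body = JSON.stringify({ name: text });
const utf8Bytes = Buffer.from(body, 'utf8');

// Decoding the same bytes as UTF-8 instead produces replacement characters,
// which is how the mangled output ends up in the JSON.
const mangled = latin1Bytes.toString('utf8');
```

The point is that the charset in the header only describes the bytes you send; the data still has to be decoded correctly before it gets there.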

Peter Badida
1

Before you go to the trouble of manually setting header parameters, check what your server is already sending by default. In my case, I'm using a "serverless" cloud-provided Node.js instance. Apparently these are usually fronted by NGINX, which I assume is what sets some of these headers based on its default settings. I didn't need to res.set anything at all. Granted, I'm serving back HTML, but before you go fixing it, make sure it's broken. These are the response headers I get:

accept-ranges: bytes
accept-ranges: bytes
cache-control: private
content-encoding: gzip
content-type: text/html; charset=utf-8
date: Fri, 21 Dec 2018 21:40:37 GMT
etag: W/"83-xwilN/BBLLLAAAHHH/0NBLAH0U"
function-execution-id: 5thvkjd4wwru
server: nginx
status: 200
vary: accept-encoding, cookie, authorization
via: 1.1 varnish
x-cache: MISS
x-cache-hits: 0
x-cloud-trace-context: 18c611BBBBLLLLAAAHHH9594d9;o=1
x-powered-by: Express
x-served-by: cache-dfw18631-DFW
x-timer: S15BBLLLAAHHH.913934,VS0,VE3404 
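If you'd rather check this from a script than from the browser's network tab, something like the following could work. It's only a sketch: `charsetOf` is a hypothetical helper, and the commented usage assumes Node 18+ (for the global fetch) and a made-up local URL.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
// Returns the charset in lowercase, or null when none is declared.
function charsetOf(contentType) {
    const match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || '');
    return match ? match[1].toLowerCase() : null;
}

// Usage (assumes Node 18+ and a hypothetical URL for your own server):
// fetch('http://localhost:3000/json').then(function(response) {
//     console.log(charsetOf(response.headers.get('content-type')));
// });
```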
Ronnie Royston