293

Say I have a function:

def NewFunction():
    return '£'

I want to print some stuff with a pound sign in front of it, but when I try to run the program I get an error and this message is displayed:

SyntaxError: Non-ASCII character '\xa3' in file 'blah' but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details

Can anyone tell me how I can include a pound sign in the value my function returns? I'm basically using it in a class, and the pound sign is included in the '__str__' part.

hichris123
SNIFFER_dog
  • Did you even read the PEP you linked to? It describes what the problem is and how to fix it. – murgatroid99 May 14 '12 at 19:15
  • "Can anyone inform me how I can include a pound sign in my return function." Well, the error message says "see http://www.python.org/peps/pep-0263.html for details"; perhaps you should start there? – Karl Knechtel May 14 '12 at 20:23
  • @murgatroid99 Here's what you and at the time I type this 27 others are missing: Yes of course I'll read the PEP. Difficulty level: I got this trying to run /bin/sh against a docker container. I'm not overtly trying to run Python. So all the PEP is going to tell me is how to fix the python code I'm not trying to run and didn't write. I was hoping for more context from StackOverflow, got smugness instead. :( Further searching turned up the actual answer: https://stackoverflow.com/questions/38992850/trying-to-run-cloudera-image-in-docker/logout - notice how the PEP did exactly zero to help. – Mark Allen Oct 09 '17 at 18:53
  • @MarkAllen - in your linked answer, the error message indicates that python is trying to interpret "/bin/bash" - it's admittedly something easy to overlook, but nothing in _this_ question indicates it's to do with docker or a container, so the advice here as you've found doesn't apply to your problem - it's not smugness, it's just that there's context in your problem, that's not present here. – tanantish Jan 03 '19 at 19:39
  • @tanantish I stand by what I said. I got the error in the question. Rather than give useful information, this was met with "Did you even read the PEP you linked to?" and "Well the error message says see (blah), perhaps you should start there?" – Mark Allen Jan 07 '19 at 18:44
  • @murgatroid99 Stack Overflow is for finding answers, Q&A style. People want to ask Google questions and be directed to the answer. Finding answers this way is WAY faster than reading pages and pages of documentation, which is usually poorly designed for human readability. Asking questions here is GOOD, even if the answer can be found elsewhere. You could find "How to exit vim?" or "How to print in Python 2.7" somewhere else, but why would you? – Gulzar May 02 '19 at 13:32
  • The duplicate https://stackoverflow.com/questions/21639275/python-syntaxerror-non-ascii-character-xe2-in-file?noredirect=1 has an accepted answer for Python 2 which helps you find any characters which would trigger this error in your source file. – tripleee Feb 02 '21 at 06:14

6 Answers

378

I'd recommend reading the PEP the error points you to. The problem is that your code is trying to use the ASCII encoding, but the pound symbol is not an ASCII character. Try using UTF-8 encoding. You can start by putting # -*- coding: utf-8 -*- at the top of your .py file. To get more advanced, you can also define encodings on a string-by-string basis in your code. However, if you are trying to put the pound sign literal into your code, you'll need an encoding that supports it for the entire file.
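
For example, a minimal sketch of what the top of the file could look like, assuming the file itself is actually saved as UTF-8 (the u prefix is not strictly required once the encoding is declared, but it avoids byte/unicode mix-ups in Python 2):

# -*- coding: utf-8 -*-

def NewFunction():
    return u'£'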

Silas Ray
322

Adding the following two lines at the top of my .py script worked for me (the first line was necessary in my case):

#!/usr/bin/env python
# -*- coding: utf-8 -*- 
tripleee
Timothée HENRY
  • I got the same problem and my Python is 2.7.11. After adding the second line `# -*- coding: utf-8 -*-` to the top of the file, it resolved the problem. – hailong Jun 22 '16 at 14:29
  • First line is to make the py file executable on *nix. It is not really related to this question. – cmd Dec 15 '17 at 20:21
  • Of course, this doesn't help at all if the file's actual encoding is not UTF-8, as seems to be the case here. – tripleee Jun 28 '20 at 11:49
58

First add the # -*- coding: utf-8 -*- line to the beginning of the file and then use u'foo' for all your non-ASCII unicode data:

def NewFunction():
    return u'£'

or use the magic available since Python 2.6 to make it automatic:

from __future__ import unicode_literals
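
As a rough sketch, the whole file could then look something like this (the coding declaration is still needed, because the £ literal itself is a non-ASCII byte in the source file):

# -*- coding: utf-8 -*-
from __future__ import unicode_literals

def NewFunction():
    return '£'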
plaes
  • If you have ``# -*- coding: utf-8 -*-`` you don't need to prefix your unicode strings with ``u`` – Daniel Lee Jun 13 '13 at 19:28
  • @plaes What if it's in a variable, for example read from a file? I can't use uVariable, so how do I do it? – Skizo-ozᴉʞS Mar 03 '17 at 15:36
  • @DanielLee Except this is not true. `# -*- coding: utf-8 -*-` followed by `print 'błąd'` will output garbage, while `print u'błąd'` works. – Przemek D Nov 23 '17 at 09:32
  • @DanielLee What Przemek D said. Putting UTF-8 literals into your source code like that is generally not a good idea, and can lead to unwanted behaviour, especially in Python 2. If literals aren't pure 7 bit ASCII they should be actual Unicode, not UTF-8, so in Python 2 you should put the `u` prefix on such literals. In Python 3, plain strings are Unicode anyway, but the `u` prefix is permitted in recent versions of Python 3 to make it a little easier to write code which behaves correctly in both Python 2 & 3. – PM 2Ring Jun 13 '18 at 09:01
  • @Skizo-ozᴉʞS This particular error message (in the title of this question) would not happen in either of those scenarios. Generally speaking, you need to specify the encoding of any file you read, and if you want to print something to a device which uses a specific encoding, similarly specify the encoding or manually convert when you write. Python 3 simplifies this a lot, though there are still corner cases where you have to specify the encoding explicitly. Perhaps see also https://nedbatchelder.com/text/unipain.html – tripleee Feb 02 '21 at 06:18
12

The error message tells you exactly what's wrong. The Python interpreter needs to know the encoding of the non-ASCII character.

If you want to return U+00A3 then you can say

return u'\u00a3'

which represents this character in pure ASCII by way of a Unicode escape sequence. If you want to return a byte string containing the literal byte 0xA3, that's

return b'\xa3'

(where in Python 2 the b is implicit; but explicit is better than implicit).
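
As a small illustrative sketch (assuming a Python 2.6+ or 3.3+ interpreter), the Unicode code point and the single byte relate like this:

pound = u'\u00a3'                           # the character U+00A3, written in pure ASCII
assert pound.encode('latin-1') == b'\xa3'   # encoding it yields the single byte 0xA3
assert b'\xa3'.decode('latin-1') == pound   # and decoding that byte gives the pound sign back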

The linked PEP in the error message instructs you exactly how to tell Python "this file is not pure ASCII; here's the encoding I'm using". If the encoding is UTF-8, that would be

# coding=utf-8

or the Emacs-compatible

# -*- coding: utf-8 -*-

If you don't know which encoding your editor uses to save this file, examine the file with something like a hex editor and some googling. The Stack Overflow character-encoding tag has a tag info page with more information and some troubleshooting tips.
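
If you would rather stay inside Python than reach for a hex editor, a quick sketch like the following lists the offending bytes (the filename 'blah.py' is just a placeholder for your own file):

with open('blah.py', 'rb') as f:    # read the raw bytes, no decoding
    data = f.read()
# print every byte outside the 7-bit ASCII range, e.g. ['0xa3']
print([hex(b) for b in bytearray(data) if b > 0x7f])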

In short, outside of the 7-bit ASCII range (0x00-0x7F), Python can't and mustn't guess what string a sequence of bytes represents. https://tripleee.github.io/8bit#a3 shows 21 possible interpretations for the byte 0xA3, and that's only among the legacy 8-bit encodings; it could also very well be the first byte of a multi-byte encoding. In fact, I would guess you are actually using Latin-1, so you should have

# coding: latin-1

as the first or second line of your source file. Anyway, without knowledge of which character the byte is supposed to represent, a human would not be able to guess this, either.

A caveat: coding: latin-1 will definitely remove the error message (because every possible byte sequence is technically valid in this encoding), but it might produce completely the wrong result when the code is interpreted if the actual encoding is something else. You really have to know the encoding of the file with complete certainty when you declare it.
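
As a small illustration of that caveat, here is roughly what happens when text is actually saved as UTF-8 but decoded as Latin-1:

utf8_bytes = u'\u00a3'.encode('utf-8')   # the pound sign saved as UTF-8 is the two bytes 0xC2 0xA3
print(utf8_bytes.decode('latin-1'))      # decoding them as Latin-1 silently prints 'Â£' instead of '£'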

tripleee
  • This is an adaptation of an earlier answer of mine to a duplicate question: https://stackoverflow.com/a/50829958/874188 – tripleee Jun 13 '18 at 07:44
  • Python 3 defaults to UTF-8 for source files, and you should probably be using UTF-8 for everything these days anyway. http://utf8everywhere.org/ – tripleee Jun 29 '18 at 03:51
12

Adding the following two lines at the top of the script solved the issue for me.

# !/usr/bin/python
# coding=utf-8

Hope it helps!

Ebin Zacharias
  • This effectively duplicates an earlier answer from 2013. What exactly to put in the shebang on the first line is somewhat system-dependent, but outside the scope of the discussion here. – tripleee Jun 28 '20 at 11:56
  • Also, you can't have a space between `#` and `!` – tripleee Mar 10 '21 at 06:39
5

You're probably trying to run a Python 3 file with the Python 2 interpreter. Currently (as of 2019), the python command still defaults to Python 2 when both versions are installed, on Windows and on most Linux distributions.

But if you are indeed working on a Python 2 script, a solution not yet mentioned on this page is to re-save the file in the UTF-8+BOM encoding. That adds three special bytes to the start of the file, which explicitly inform the Python interpreter (and your text editor) about the file's encoding.
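
Most editors have a "save as UTF-8 with BOM" option; as a rough sketch, you could also add the BOM from Python itself (the filename 'blah.py' is just a placeholder, and the file must already be UTF-8-encoded for this to make sense):

import codecs

with open('blah.py', 'rb') as f:
    data = f.read()
if not data.startswith(codecs.BOM_UTF8):         # codecs.BOM_UTF8 is the three bytes EF BB BF
    with open('blah.py', 'wb') as f:
        f.write(codecs.BOM_UTF8 + data)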

user