Questions tagged [byte-order-mark]

A byte order mark (BOM) is a Unicode character used to signal the order of bytes in a text file or stream. As the BOM is U+FEFF, it makes it clear whether the high-order bytes are first (stream starts FE.FF) or second (stream starts FF.FE).

The byte order mark (BOM) is a Unicode character used to signal the endianness (byte order) of a text file or stream. Use of a BOM is optional and, if used, it should appear at the start of the text stream. Beyond its specific use as a byte-order indicator, the BOM character may also indicate which of the several Unicode representations the text is encoded in.

For example, the use of a UTF-16 BOM (U+FEFF) makes it clear from the first two bytes of a text whether the stream is "big endian" (BE) — like Western numbers, so the stream would start FE FF ... — or "little endian" (LE) — like numbers in Arabic, so the stream would start FF FE .... If misinterpreted as ISO-8859-1, it would show up as þÿ (BE) or ÿþ (LE).
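As a minimal sketch of the byte sequences described above, assuming Python purely for illustration: the standard `codecs` module exposes the UTF-16 BOMs as constants, and decoding them as ISO-8859-1 reproduces the stray characters. The variable names are illustrative only.

```python
import codecs

# BOM byte sequences as defined in the standard library
bom_be = codecs.BOM_UTF16_BE  # bytes FE FF
bom_le = codecs.BOM_UTF16_LE  # bytes FF FE

# Misreading the BOM bytes as ISO-8859-1 yields the stray characters
mojibake_be = bom_be.decode("iso-8859-1")
mojibake_le = bom_le.decode("iso-8859-1")

# The generic "utf-16" codec reads the BOM, picks the byte order,
# and drops the BOM from the decoded text
text_be = (bom_be + "hi".encode("utf-16-be")).decode("utf-16")
text_le = (bom_le + "hi".encode("utf-16-le")).decode("utf-16")
```

Both `text_be` and `text_le` decode to the same string, which is the point of the BOM: the reader does not need to know the byte order in advance.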

In UTF-8, a BOM is neither required nor recommended, but would be the three bytes 0xEF 0xBB 0xBF. When misinterpreted as ISO-8859-1, this renders as ï»¿. Seeing this triplet in unusual places in code output almost always indicates that a BOM is not being ignored when it should be, or was added where it was not expected.
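One way to ignore a UTF-8 BOM when reading, sketched here in Python under the assumption that the input may or may not start with one: the standard `utf-8-sig` codec strips a single leading BOM if present and otherwise behaves like plain `utf-8`. The sample data is made up for illustration.

```python
import codecs

# Hypothetical sample: a UTF-8 payload preceded by a BOM
data = codecs.BOM_UTF8 + "name,value".encode("utf-8")

# Plain "utf-8" keeps the BOM as a leading U+FEFF character
kept = data.decode("utf-8")

# "utf-8-sig" strips one leading BOM if present
stripped = data.decode("utf-8-sig")

# It is harmless on input that has no BOM
no_bom = "name,value".encode("utf-8").decode("utf-8-sig")

# Misreading the three BOM bytes as ISO-8859-1 gives the familiar triplet
mojibake = codecs.BOM_UTF8.decode("iso-8859-1")
```

The same codec name can be passed as the `encoding` argument to `open()` when reading files that might carry a BOM.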

In UTF-32, the same BOM character is used as for UTF-16, but because each character occupies 32 bits (so U+0000FEFF), its ISO-8859-1 misinterpretation would contain null characters: □□þÿ (BE) or ÿþ□□ (LE), where □ represents the ASCII NUL character.
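Because the UTF-32 LE BOM (FF FE 00 00) begins with the same two bytes as the UTF-16 LE BOM (FF FE), a detector has to test the longer sequences first. The following is a rough Python sketch of that idea; `detect_bom` and `_BOMS` are hypothetical names, not part of any library.

```python
import codecs

# Longer BOMs first: the UTF-32 LE BOM starts with the UTF-16 LE BOM bytes
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


# Usage: decode the payload after skipping the detected BOM
raw = codecs.BOM_UTF32_LE + "A".encode("utf-32-le")
encoding, skip = detect_bom(raw)
text = raw[skip:].decode(encoding) if encoding else None
```

This is only a heuristic: UTF-16 LE text that starts with a BOM followed by U+0000 has the same leading bytes as a UTF-32 LE BOM, so the result should be treated as a guess when the encoding is not known from other metadata.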


539 questions
0
votes
1 answer

Flex fileReference save in utf-8 with BOM encoding format

I want to save a file in utf-8 encoding format (NOT utf-8 without BOM). There is no explicit information on Adobe Charset support page (http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/charset-codes.html) I want to save the file…
yousafsajjad
  • 923
  • 2
  • 18
  • 33
-1
votes
1 answer

How can I remove the BOM from a UTF-16 LE file in C++?

I have a UTF-16 LE file and at the beginning it has a BOM. How do I remove this using C++? I've seen many Python examples. Ultimately I would like it to be a UTF-8.
JeffR
  • 527
  • 1
  • 6
  • 17
-1
votes
3 answers

CakePHP "Cannot modify header information" problem is NOT whitespace

Here's the error: Warning (2): Cannot modify header information - headers already sent by (output started at /usr/share/php/cake/basics.php:111) [CORE/cake/libs/controller/controller.php, line 640] $status = "Location:…
lollercoaster
  • 13,421
  • 28
  • 94
  • 162
-1
votes
1 answer

how to remove BOM while writing csv in golang

I am writing stats to a CSV file for incoming Diameter traffic in my Golang server, but the file contains a "" character at the start of each…
-1
votes
1 answer

How can I use C++ to eliminate the BOM in a notepad .txt file?

I want to read in a .txt file using ifstream fin from library fstream, but there is a BOM at the beginning of the file that is causing problems. Is there a way I can, from inside my C++ program, eliminate the BOM in the .txt file, so that fin can…
-1
votes
5 answers

Unexpected amount of lines when writing to a csv file

A part of my application writes data to a .csv file in the following way: public class ExampleWriter { public static final int COUNT = 10_000; public static final String FILE = "test.csv"; public static void main(String[] args) throws…
Sander_M
  • 1,019
  • 15
  • 30
-1
votes
1 answer

How remove BOM mark from download file?

I have this script to let the user download file: header('Content-Encoding: UTF-8'); header("Content-Type: application/vnd.ms-excel; charset=UTF-8"); header("Content-Disposition: attachment; filename=qa_report.xlsx"); header("Expires:…
One Man Crew
  • 8,885
  • 2
  • 37
  • 50
-1
votes
1 answer

python - "SyntaxError: encoding issue: with BOM"

I am trying to run some cronjobs in django. I have three of them, 2 of them are running flawlessly. but the third one is giving me the error: ../../monthly_abo_live.py", line 1 SyntaxError: encoding problem: with BOM the first 2 lines of this…
doniyor
  • 31,751
  • 50
  • 146
  • 233
-1
votes
1 answer

Strange letters showing up?

Well I had a problem with include() with blank area. Someone said open it with notepad++ and save as UTF-8 without BOM. Now I can see strange letters at blank area. How can I remove that? And is there a way to create php without BOM in dreamweaver…
-2
votes
2 answers

Why does GNU Diff not understand UTF-16 (only UTF-8)?

Why doesn't GNU Diff understand UTF-16 (only UTF-8)? This GNU Diff is used by default in Git. Why doesn't this bug get fixed? BOM is part of the Unicode standard. http://www.unicode.org/faq/utf_bom.html#bom4 Why is BOM ignored by most…
Keepun
  • 65
  • 4
-2
votes
1 answer

Why awk does not remove BOM from the middle of a line?

I try to use awk to remove all byte order marks from a file (I have many of them): awk '{sub(/\xEF\xBB\xBF/,"")}{print}' f1.txt > f2.txt It seems to remove all the BOMs that are in the beginning of the line but those in the middle are not removed.…
Roman
  • 97,757
  • 149
  • 317
  • 426
-2
votes
1 answer

How do you easily delete a Byte Order Mark (BOM) from a .java file or other types of file?

I kept having compile errors on my Continuous Integration & Deployment system. After some research I found that it was a BOM (Byte Order Mark) at the beginning of a .java file that was causing the error. What is the easiest way to remove the BOM…
ConfusedDeer
  • 2,895
  • 7
  • 35
  • 60
-2
votes
3 answers

How to add a BOM to an HTML document

From the W3C: If an HTML document does not start with a BOM, and its encoding is not explicitly given by Content-Type metadata, and the document is not an iframe srcdoc document, then the character encoding used must be an ASCII-compatible…
user2284570
  • 2,425
  • 3
  • 18
  • 59
-3
votes
3 answers

BOM caused by _GPL_e6a00_parent_div?

When I reload my php page, in the top left corner it's displaying:  I've been searching for it and I got the BOM issue. But I've another issue: In the exactly same position, inspecting the element, it has something that seems some kind of hack.…