5

As most of you probably know, files can be saved in the following encodings:

  • ANSI
  • UTF-8

UTF-8 is recognized by three chars added at the beginning of the file, but those chars cause some trouble in PHP, as you know, so we use

  • UTF-8 Without BOM (Instead of UTF-8)

Here is my question: how can we write a new file (using PHP) with the encoding "UTF-8 Without BOM", either using fwrite() or any other function (it doesn't matter which)?

(I'm not asking about editor settings; I'm asking about creating a file with PHP functions.)

tchrist
Karam89
  • For the record, "ANSI" (ASCII) uses only 7 bits of each byte, while UTF-8 uses all 8 bits, allowing for an additional 128 characters. Since they both use a single byte per character, a byte order marker is a bit useless for UTF-8. – Eagle-Eye Apr 03 '13 at 03:46

4 Answers

2

I'm afraid you have misrepresented both UTF-8 and ANSI in your question.

UTF-8 is not required to have a BOM at its start. There's no such encoding as "UTF-8 without BOM" encoding. There's just "UTF-8". I've processed millions (well, certainly hundreds of thousands) of UTF-8 files and never once come across a BOM at their start.

According to the Unicode standard, a BOM is neither required nor recommended in UTF-8:

2.6 Encoding Schemes

Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature. See the "Byte Order Mark" subsection in Section 16.8, Specials, for more information.

Also, there is no such encoding as "ANSI"!

The closest thing that IANA provides to "ANSI" for a character set name is "ANSI_X3.4-1968" and "ANSI_X3.4-1986", which are both just legacy aliases for "US-ASCII" (the preferred MIME name), a 7-bit encoding of 128 code points. No other official charset name contains "ANSI".

I'm not sure what environment you're operating under, but it seems to have led you into some very non-standard naming, expectations, and standards.

Could it perhaps be… Windows™? ☹

EDIT: Just found this answer about the source of this misonymy.

Community
tchrist
  • Thanks a lot for your answer, it's really helpful. First of all, I really am using Windows :P Anyway, you mentioned that it's not necessary (or not recommended) to put the BOM chars at the beginning of the file, so is there any other way to define a file as UTF-8 (without the first three chars)? Bear in mind that PHP creates UTF-8 encoded files with those chars. – Karam89 Nov 01 '10 at 18:55
0

Just create a file in PHP and write normal text to it. It works for me.

/* Create the page account.php */

$archivo = "account.php";

// "wb" opens the file for writing in binary mode; fwrite() stores the bytes as given.
$fh = fopen($archivo, "wb") or die("can't open file");
fwrite($fh, $archivo); // In this case I am writing the name of the file inside it.
fclose($fh);
masterhoo
-1

If it is "without BOM", then it is just normal bytes in the file.

They are all just numbers, from 0 to 255, or 0x00 to 0xFF, stored as a byte sequence in the file. How you interpret them is up to you or the program.
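As a minimal sketch of that point: PHP's file functions write exactly the bytes of the string they are given, so a UTF-8 string with no BOM in it ends up as a UTF-8 file with no BOM. The file name and sample text below are made up for illustration, and the `\u{...}` escapes assume PHP 7.0 or later.

```php
<?php
// Hypothetical file name, used only for this illustration.
$path = sys_get_temp_dir() . "/utf8-no-bom-example.txt";

// UTF-8 text written with explicit escapes, so the example does not depend
// on the encoding of this source file. The text is "café €".
$text = "caf\u{00E9} \u{20AC}\n";

// file_put_contents() stores the bytes of $text unchanged; nothing is prepended.
file_put_contents($path, $text);

// Read the first three bytes back and compare them with the UTF-8 BOM.
$head = file_get_contents($path, false, null, 0, 3);
$hasBom = ($head === "\xEF\xBB\xBF");

echo bin2hex($head), "\n";                       // prints 636166 ("caf")
echo $hasBom ? "BOM present\n" : "no BOM\n";     // prints "no BOM"
```

The same holds for fopen()/fwrite(): a BOM only appears in the output if those three bytes are already part of the string being written.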

nonopolarity
-3

I found a solution for this: you can open/create the file and use this to write a signature at the beginning of it in order to mark it as UTF-8:

$fp = fopen("filename.ext","wb");
fwrite($fp,pack("CCC",0xef,0xbb,0xbf));

I've tested this and it works great!

I found it here

Dude2012
  • 1
    Note that the above won't work if your PHP file is encoded in UTF8 - found out the hard way. – vamur Oct 23 '12 at 14:19
  • 2
    Writing the three bytes making up a BOM strikes me as the surest way of **not** encoding a file as "UTF8 *without* BOM". – LSerni Oct 28 '12 at 18:00
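
Following up on the comments above: if a file or string already starts with those three bytes and they need to go, a minimal sketch for removing them could look like this. The function name `strip_utf8_bom` is hypothetical, not a built-in, and the sketch assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: returns $bytes without a leading UTF-8 BOM (EF BB BF).
function strip_utf8_bom(string $bytes): string
{
    $bom = "\xEF\xBB\xBF";
    // strncmp() compares only the first three bytes of both strings.
    if (strncmp($bytes, $bom, 3) === 0) {
        return substr($bytes, 3);
    }
    return $bytes; // no BOM found, return the input unchanged
}

$withBom    = "\xEF\xBB\xBFhello";
$withoutBom = "hello";

var_dump(strip_utf8_bom($withBom) === "hello");    // bool(true)
var_dump(strip_utf8_bom($withoutBom) === "hello"); // bool(true)
```

The result can then be written back with fwrite() or file_put_contents(), which store the remaining bytes as they are.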