
Is there any way to omit the byte-order mark (BOM) when redirecting the output stream to a file? For example, if I want to take the contents of an XML file and replace a string with a new value, I need to create a new encoding and write the new output to a file as follows, which is rather ham-handed:

# -Raw reads the file as a single string, so line breaks survive the call to WriteAllText
$newContent = ( Get-Content -Raw .\settings.xml ) -replace 'expression', 'newvalue'
$UTF8NoBom = New-Object System.Text.UTF8Encoding( $false )
[System.IO.File]::WriteAllText( '.\settings.xml', $newContent, $UTF8NoBom )

I have also tried using Out-File, but even when I specify UTF8 as the encoding, the output still contains a BOM:

( Get-Content .\settings.xml ) -replace 'expression', 'newvalue' | Out-File -Encoding 'UTF8' .\settings.xml

What I want to be able to do is simply redirect to a file without a BOM:

( Get-Content .\settings.xml ) -replace 'expression', 'newvalue' > settings.xml

The problem is that the BOM added to the output file routinely causes issues when other applications read these files. Most notably, most applications that read XML blow up if I modify the file and it begins with a BOM, and Chef Client also doesn't like a BOM in a JSON attributes file. Short of writing a function like Write-FileWithoutBom that accepts pipeline input and an output path, is there any way I can simply "turn off" writing a BOM when redirecting output to a file?

The solution doesn't necessarily have to use the redirection operator. If there is a built-in cmdlet I can use to output to a file without a BOM, that would be acceptable as well.

Bender the Greatest

1 Answer

  • In Windows PowerShell as of v5.1, there is NO built-in way to avoid creating a BOM with UTF-8 encoding (short of calling the .NET framework directly, as you demonstrate).

    • In v5.1+ you can change the default encoding for > / >> as follows, but if you choose utf8, you still get a BOM:

      • $PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'
    • Consider using the third-party function Out-FileUtf8NoBom from this answer of mine.

    • Unfortunately, it is unlikely that Windows PowerShell will ever support creation of BOM-less UTF-8 files[1], but the hope is that PowerShell Core, which not only supports that but also defaults to interpreting BOM-less files as UTF-8 (see below), will eventually be a viable alternative on Windows.

  • PowerShell Core, by contrast, uses BOM-less UTF-8 by default (both for Out-File / > and Set-Content) and offers you a choice of BOM or no-BOM via -Encoding specifiers utf8 and utf8BOM.
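
For Windows PowerShell, the kind of helper the question describes could look roughly like the sketch below. It is not the Out-FileUtf8NoBom function mentioned above; the name Write-FileWithoutBom is taken from the question, and the parameter names are assumptions chosen for illustration. It collects the pipeline input and writes it with a UTF8Encoding instance created with $false, which is the same .NET approach the question already uses.

```powershell
# Hypothetical helper, not a built-in cmdlet: writes pipeline input as BOM-less UTF-8.
function Write-FileWithoutBom {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)]
        [string] $Path,

        [Parameter(ValueFromPipeline)]
        [string[]] $InputObject
    )
    begin {
        # Buffer all lines so that the target file is only opened once, at the end.
        $lines = [System.Collections.Generic.List[string]]::new()
    }
    process {
        foreach ($line in $InputObject) { $lines.Add($line) }
    }
    end {
        # .NET resolves relative paths against the process's working directory,
        # which can differ from PowerShell's current location, so resolve first.
        $fullPath = $PSCmdlet.GetUnresolvedProviderPathFromPSPath($Path)
        # $false means: do not emit a byte-order mark.
        $utf8NoBom = New-Object System.Text.UTF8Encoding $false
        [System.IO.File]::WriteAllLines($fullPath, $lines, $utf8NoBom)
    }
}

# Usage, mirroring the pipeline from the question:
# ( Get-Content .\settings.xml ) -replace 'expression', 'newvalue' | Write-FileWithoutBom -Path .\settings.xml
```

Note that WriteAllLines terminates every line, including the last one, with the platform's newline sequence, so the output always ends with a trailing newline.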


[1] From a Microsoft blog post, emphasis added: "Windows PowerShell 5.1, much like .NET Framework 4.x, will continue to be a built-in, supported component of Windows 10 and Windows Server 2016. However, it will likely not receive major feature updates or lower-priority bug fixes." and, in a comment, "The goal with PowerShell Core 6.0 and all the compatibility shims is to supplant the need for Windows PowerShell 6.0 while converging the ecosystem on PowerShell Core. So no, we currently don’t have any plans to do a Windows PowerShell 6.0."

mklement0
  • Not the answer I wanted to hear, but I appreciate you taking the time to explain *how* the default encoding can be overridden. It's also good to know `Powershell Core` does support this, so it hopefully means `Microsoft Powershell 6.0` will also support it. – Bender the Greatest Aug 16 '18 at 15:42
  • @BendertheGreatest: Unfortunately, it sounds like _Windows PowerShell_ (as opposed to PowerShell _Core_) will not gain any new features - please see my update. – mklement0 Aug 16 '18 at 16:07
  • Sorry, meant to say `Windows Powershell`, not `Microsoft`. Guess I'm late to the party, but at least Microsoft has committed to which flavor of Powershell they will be developing moving forward. – Bender the Greatest Aug 17 '18 at 15:10