
I am trying to read a .xml file, change some values (not yet), and write it back out. Without making any changes, I expect to get the same thing coming out as went in. It does not.

PS H:\src\tws> type .\test000.xml
<?xml version="1.0"?>
<eventRuleSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules"
              xsi:schemaLocation="http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules/EventRules.xsd">
    <eventRule name="PW-TEST001" ruleType="filter" isDraft="no">
        <description>Paul's test001</description>
    </eventRule>
</eventRuleSet>

Here is the simple code I am using to read it in and write it out.

PS H:\src\tws> Get-Content .\con000.ps1
$x = [xml](Get-Content -Path .\test000.xml)
$x | Export-Clixml -Path .\con000.xml -Encoding utf8

The output has <Objs> and <XD> sections. Why is that? I would like to get out what went in. I do not care about newlines or the use of HTML entities. I just want the content to be the same. Yes, the plan is to read a template, change some values, and output a new .xml file. This will be input to the IBM/HCS Workload Scheduler.

PS H:\src\tws> type .\con000.xml
<Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04">
  <XD>&lt;?xml version="1.0"?&gt;&lt;eventRuleSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules" x
si:schemaLocation="http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules/EventRules.xsd"&gt;&lt;eventRule
name="PW-TEST001" ruleType="filter" isDraft="no"&gt;&lt;description&gt;Paul's test001&lt;/description&gt;&lt;/eventRule&gt;&lt;/eventRuleSet&gt;</XD>
</Objs>
lit

2 Answers

  • The purpose of Export-CliXml is to serialize arbitrary objects for later deserialization via Import-CliXml, using a best-effort representation with respect to preserving the specific input types for later "rehydration" via Import-CliXml.

  • Its purpose is not to write the text representation of arbitrary [xml] documents to a file.
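To illustrate what Export-CliXml is meant for, here is a minimal sketch of a serialize/deserialize round trip. The sample object and the temp-file name clixml-demo.xml are made up for illustration and are not part of the question.

```powershell
# Hypothetical sample object; any PowerShell object would do.
$obj = [pscustomobject]@{ Name = 'PW-TEST001'; IsDraft = $false }

# Hypothetical output location in the temp directory.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'clixml-demo.xml'

# Serialize the object; the file contains PowerShell's own CLIXML
# wrapper (an <Objs> root element), not a custom XML schema.
$obj | Export-Clixml -Path $path

# Deserialize ("rehydrate") the object from the file.
$back = Import-Clixml -Path $path
$back.Name
```

The file is only intended to be read back with Import-CliXml, which is why the <Objs> wrapper shows up when an [xml] document is piped to Export-CliXml.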

In order to save a text representation of an [xml] instance to a file, you have two basic choices:

  • If the specific formatting of the textual representation of the XML document is not a concern, simply call .OuterXml on the (modified) [xml] instance and send that to a file - either via Set-Content or via Out-File / >, but note the different default character encodings applied by these cmdlets in Windows PowerShell.

  • Use the .NET framework, if you want a pretty-printed textual representation of the XML in the output file:

    • The [xml] type's .Save() method conveniently performs implicit pretty-printing when saving to a file, but there are pitfalls:

      • Since .NET usually has a different idea of what the current directory is, be sure to pass a full file path.

      • In the absence of an XML declaration with an encoding attribute, the method creates a UTF-8 file without BOM (which is preferable from a cross-platform perspective).

      • By contrast, curiously, if an XML declaration with encoding="UTF-8" is present, the resulting file will be UTF-8 with a BOM, as of .NET Core 2.1 / .NET v4.7; see this GitHub issue.

    • Use a [System.Xml.XmlWriter] instance with an explicitly created file-stream object, which is more cumbersome but gives you control over the specifics of the pretty-printed format.
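The XmlWriter option could look roughly like the following sketch. It assumes PowerShell 5 or later (for the ::new() syntax); the sample document, the two-space indent, and the temp-file name xmlwriter-demo.xml are illustrative choices, not requirements.

```powershell
# Hypothetical sample document standing in for the real input file.
$x = [xml] '<root><child>text</child></root>'

# Formatting and encoding choices; adjust as needed.
$settings = [System.Xml.XmlWriterSettings]::new()
$settings.Indent = $true
$settings.IndentChars = '  '
$settings.Encoding = [System.Text.UTF8Encoding]::new($false)  # $false = no BOM

# Hypothetical full output path; .NET needs a full path.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'xmlwriter-demo.xml'

$stream = [System.IO.File]::Create($outPath)
try {
    $writer = [System.Xml.XmlWriter]::Create($stream, $settings)
    try {
        # Write the in-memory document through the configured writer.
        $x.Save($writer)
    }
    finally {
        $writer.Dispose()
    }
}
finally {
    $stream.Dispose()
}
```

Compared with calling .Save() with a path, this is more typing, but the indentation and the presence or absence of a BOM are decided explicitly by the settings object.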


Here's a simple example with .OuterXml:

# Read the input file into an XML document (in-memory DOM).
$x = [xml] (Get-Content -Raw ./test000.xml)

# Make updates to the in-memory document
$x.eventRuleSet.eventRule.description = 'new description'

# Save the modified document as text to an output file,
# using the un-prettied textual representation provided by the .OuterXml
# property.
# If *BOM-less* UTF-8 encoding is what you want, simply use
#   $x.Save("$PWD/con000.xml")
# In PowerShell *Core*, you'd get BOM-less UTF-8 even with the command below.
$x.OuterXml | Set-Content -Encoding utf8 ./con000.xml
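
If you go with .Save() instead, the full-path pitfall can be handled along these lines. This is a sketch; the sample document and the file name save-demo.xml are made up for illustration.

```powershell
# Hypothetical sample document standing in for the modified input.
$x = [xml] '<root><child>text</child></root>'

# Resolve a PowerShell-relative path to a full file-system path,
# even though the file does not exist yet.
$fullPath = $ExecutionContext.SessionState.Path.GetUnresolvedProviderPathFromPSPath('./save-demo.xml')

# .Save() pretty-prints and writes to the full path.
$x.Save($fullPath)
```

Resolving the path on the PowerShell side avoids relying on .NET's notion of the current directory.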

A note re the use of a BOM (a.k.a. Unicode signature) with UTF-8 and other Unicode encodings:

  • In Windows PowerShell, -Encoding utf8 invariably creates a BOM (applies not just to Set-Content, but also to other cmdlets that produce file output, such as Out-File and Export-Csv).

    • Direct use of the .NET framework is required to create BOM-less UTF-8 files (for a PowerShell-friendly wrapper function, see this answer of mine). Note that the .NET framework's default encoding has always been BOM-less UTF-8.
  • PowerShell Core creates BOM-less UTF-8 files by default (and also when you explicitly use -Encoding utf8); you can opt to have a BOM created with -Encoding utf8BOM.
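
As a sketch of the direct .NET route for getting BOM-less UTF-8 in either edition, assuming PowerShell 5 or later for the ::new() syntax (the sample document and the temp-file name nobom-demo.xml are illustrative):

```powershell
# Hypothetical sample document standing in for the modified input.
$x = [xml] '<root><child>text</child></root>'

# Hypothetical full output path; .NET needs a full path.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'nobom-demo.xml'

# $false asks the encoding not to emit a BOM.
$utf8NoBom = [System.Text.UTF8Encoding]::new($false)
[System.IO.File]::WriteAllText($outPath, $x.OuterXml, $utf8NoBom)

# Inspect the first three bytes; a UTF-8 BOM would be EF BB BF.
$firstBytes = [System.IO.File]::ReadAllBytes($outPath)[0..2]
($firstBytes | ForEach-Object { $_.ToString('X2') }) -join ' '
```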

For best overall compatibility, BOMs in UTF-8 files should be avoided: Unix platforms and Unix-heritage utilities also used on Windows Platforms generally don't know how to handle them.

Similarly, -Encoding UTF7 should be avoided, because it is not a standard Unicode encoding (and is written without a BOM in both PowerShell editions).

In both PowerShell editions, all other Unicode encodings available with -Encoding do create an (encoding-appropriate) BOM: Unicode (UTF-16LE), bigendianunicode (UTF-16BE), and utf32 (UTF-32).

mklement0

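Step through this code in the debugger. It parses the XML, saves it with .Save(), reloads it, edits an attribute, and saves it again (it assumes the d:\test directory exists).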

$data1 = @"
<?xml version="1.0"?>
<eventRuleSet xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules"
              xsi:schemaLocation="http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules http://www.ibm.com/xmlns/prod/tws/1.0/event-management/rules/EventRules.xsd">
    <eventRule name="PW-TEST001" ruleType="filter" isDraft="no">
        <description>Paul's test001</description>
    </eventRule>
</eventRuleSet>
"@

$xml1 = [xml]$data1

"`n-------data1"
$data1

"`n--------xml1"
$xml1

"`n--------save to file xml2"
$xml1.Save('d:\test\xml2.xml')

$file2 = Get-Content 'd:\test\xml2.xml'
$xml2 = [xml]$file2

"`n--------file2"
$file2

"`n--------edit"
$xml2.eventRuleSet.eventRule.name = "Hello world!"

"`n--------save to file xml3"
$xml2.Save('d:\test\xml3.xml')

$file3 = Get-Content 'd:\test\xml3.xml'

"`n--------file3"
$file3
Kory Gill