
I am trying to manipulate JSON file data in PowerShell and write it back to the file. Even before any manipulation, when I just read the file, convert it to a JSON object in PowerShell, and write it back, some characters are replaced by escape codes. Here is my code:

$jsonFileData = Get-Content $jsonFileLocation

$jsonObject = $jsonFileData | ConvertFrom-Json

# ... (modify $jsonObject) - commented out for now, so the same object is written back

$jsonFileDataToWrite = $jsonObject | ConvertTo-Json

$jsonFileDataToWrite | Out-File $jsonFileLocation

Some characters are being replaced by their codes. E.g.:

< is replaced by \u003c
> is replaced by \u003e
' is replaced by \u0027

Sample input:

{
    "$schema": "https://source.com/template.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "accountName": {
            "type": "string",
            "defaultValue": "<sampleAccountName>"
        },
        "accountType": {
            "type": "string",
            "defaultValue": "<sampleAccountType>"
        },
    },
    "variables": {
        "location": "sampleLocation",
        "account": "[parameters('accountName')]",
        "type": "[parameters('accountType')]",
    }
}

Output:

{
    "$schema": "https://source.com/template.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "accountName": {
            "type": "string",
            "defaultValue": "\u003csampleAccountName\u003e"
        },
        "accountType": {
            "type": "string",
            "defaultValue": "\u003csampleAccountType\u003e"
        },
    },
    "variables": {
        "location": "sampleLocation",
        "account": "[parameters(\u0027accountName\u0027)]",
        "type": "[parameters(\u0027accountType\u0027)]",
    }
}

Why is this happening, and how can I stop the characters from being replaced so that they are written back unchanged?

Romonov

1 Answer


Since ConvertTo-Json uses the .NET JavaScriptSerializer under the hood, the question is more or less already answered here.

Here's some shameless copypaste:

The characters are being encoded "properly"! Use a working JSON library to correctly access the JSON data - it is a valid JSON encoding.

Escaping these characters prevents HTML injection via JSON and makes the JSON XML-friendly. That is, even if the JSON is emitted directly into JavaScript (as is done fairly often, since JSON is a valid subset of JavaScript), it cannot be used to terminate the element early, because the relevant characters (e.g. <, >) are encoded within the JSON itself.
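To illustrate the point that the escaped form is still valid JSON, here is a minimal sketch that parses an escaped and an unescaped version of a value taken from the question and compares the results. The variable names are made up for this example.

```powershell
# Two encodings of the same data: one with \uXXXX escapes, one with literal characters.
$escaped = '{"defaultValue":"\u003csampleAccountName\u003e","account":"[parameters(\u0027accountName\u0027)]"}'
$literal = '{"defaultValue":"<sampleAccountName>","account":"[parameters(''accountName'')]"}'

# Any conforming JSON parser decodes the escapes while parsing.
$fromEscaped = $escaped | ConvertFrom-Json
$fromLiteral = $literal | ConvertFrom-Json

# Case-sensitive comparison of the decoded strings.
$sameDefault = $fromEscaped.defaultValue -ceq $fromLiteral.defaultValue
$sameAccount = $fromEscaped.account -ceq $fromLiteral.account
"defaultValue identical: $sameDefault; account identical: $sameAccount"
```

So the escapes only matter if something reads the file as plain text rather than parsing it as JSON.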


If you really need to turn the escape codes back into literal characters, the easiest way is probably to do a regex replace for each code. Example:

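# Map of escape codes to literal characters. The keys are regex patterns:
# "\\" matches one literal backslash (PowerShell strings do not treat \ as an escape).
$dReplacements = @{
    "\\u003c" = "<"
    "\\u003e" = ">"
    "\\u0027" = "'"
}

$sInFile = "infile.json"
$sOutFile = "outfile.json"

# Read the whole file into a single string.
$sRawJson = Get-Content -Path $sInFile | Out-String
# Apply each replacement in turn.
foreach ($oEnumerator in $dReplacements.GetEnumerator()) {
    $sRawJson = $sRawJson -replace $oEnumerator.Key, $oEnumerator.Value
}

$sRawJson | Out-File -FilePath $sOutFile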
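If maintaining a replacement table becomes tedious, a single regex pass can handle every `\uXXXX` code at once. The following is only a sketch, not part of the original approach: the function name `ConvertFrom-JsonUnicodeEscape` is hypothetical, and it assumes you want to keep control characters, the double quote, and the backslash escaped so that the result remains valid JSON.

```powershell
# Hypothetical helper: decode \uXXXX escapes in a JSON string, except those
# that must stay escaped for the text to remain valid JSON.
function ConvertFrom-JsonUnicodeEscape {
    param([Parameter(Mandatory = $true)][string]$Json)

    # Match either an escaped backslash (\\) or a \uXXXX sequence. Consuming
    # "\\" first ensures that text such as \\u003c (a literal backslash
    # followed by "u003c") is left untouched.
    $pattern = '\\\\|\\u([0-9a-fA-F]{4})'

    $evaluator = {
        param($m)
        # An escaped backslash: return it unchanged.
        if (-not $m.Groups[1].Success) { return $m.Value }

        $code = [Convert]::ToInt32($m.Groups[1].Value, 16)
        # Keep control characters, the double quote and the backslash escaped.
        if ($code -lt 0x20 -or $code -eq 0x22 -or $code -eq 0x5C) { return $m.Value }

        return [string][char]$code
    }

    return [regex]::Replace($Json, $pattern, $evaluator)
}

# Usage with a value from the question:
$sFixed = ConvertFrom-JsonUnicodeEscape -Json '{"account":"[parameters(\u0027accountName\u0027)]"}'
$sFixed
```

The function only rewrites the text. It does not validate the JSON, so it is worth parsing the result once with ConvertFrom-Json before writing it out.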
Alexander Obersht
    _Except_ that if you're posting the content as `application/json`, one would expect `ConvertTo-Json` to follow the JSON spec, which specifies that only the control characters, the double quote (U+0022), and relatively few others actually need to be escaped. Any other character does not. There's an open issue on PowerShell's GitHub about this: when they switched to Newtonsoft.Json in PowerShell Core, the JSON output became different from that of PSv5. In short, PS Core follows the JSON spec by virtue of using the default Newtonsoft.Json string escaper. – fourpastmidnight Oct 21 '19 at 19:02
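The version difference described in this comment can be checked with a short sketch like the one below. It assumes the behavior described above: Windows PowerShell 5.x escapes `<` and `>`, while PowerShell Core (6 and later) leaves them as literal characters by default. The variable names are illustrative.

```powershell
# Serialize a value from the question and see whether this PowerShell version escapes it.
$sample = [pscustomobject]@{ defaultValue = '<sampleAccountName>' }
$json = $sample | ConvertTo-Json -Compress

# True if the serializer emitted the \u003c escape instead of a literal '<'.
$isEscaped = $json -match '\\u003c'

"PowerShell $($PSVersionTable.PSVersion): $json (escaped: $isEscaped)"
```

If the script has to run on both versions, post-processing the string as in the answer is one way to get the same output from each.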