I'm working on a Chrome extension that sends the source code of a page to a server, where it should be parsed.
Capturing the source code works fine. If I display it in the console, it looks like this:
Then in order to push it to my PHP server, I first isolate the content of the body (what you've seen in the previous picture is stored in "result"):
html_content = result.querySelectorAll('body')[0].outerHTML;
html_content = JSON.stringify(html_content);
If I then display html_content in my console, I get something like this:
So now that I have a JSON string, I try to send it like this:
var xhr = new XMLHttpRequest();
xhr.open("POST", "myAPI_URL");
xhr.setRequestHeader("Content-Type", "application/json");
xhr.send(html_content);
The call to the URL works, but I don't get anything in $_POST; it's empty.
If I try to assign a specific variable like this:
xhr.send('content='+html_content);
It doesn't work either. On the PHP side, I'm just doing this:
print_r($_POST);
And this returns an empty array.
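For comparison, PHP only fills $_POST when the body arrives as application/x-www-form-urlencoded or multipart/form-data, which would be consistent with the empty array here. Below is a minimal sketch of the form-encoded variant. The helper names `buildFormBody` and `sendAsForm` are hypothetical, the field name `content` is taken from the second attempt above, and "myAPI_URL" is the same placeholder as before.

```javascript
// Hypothetical helper: encode the HTML as a single form field named "content".
// URLSearchParams percent-encodes characters such as & and = that would
// otherwise break the key=value format of the body.
function buildFormBody(html) {
  const params = new URLSearchParams();
  params.set("content", html);
  return params.toString();
}

// Sketch of the request itself; "myAPI_URL" is a placeholder, not a real endpoint.
function sendAsForm(html) {
  const xhr = new XMLHttpRequest();
  xhr.open("POST", "myAPI_URL");
  // This content type is one of the two that PHP parses into $_POST.
  xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
  xhr.send(buildFormBody(html));
}
```

With a body built this way, $_POST['content'] should hold the markup on the PHP side. With a JSON or text/plain body, $_POST stays empty and the raw body has to be read from php://input instead.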
======= UPDATE =========
Based on the feedback below, I adapted a few things and it got better. As suggested, I'm using text/plain and keeping the DOM object intact (I no longer take only the body):
var xhr = new XMLHttpRequest();
xhr.open("POST", "myAPI_URL");
xhr.setRequestHeader("Content-Type", "text/plain");
xhr.send(content);
If I use this on the server side:
$html_content = file_get_contents('php://input');
This variable contains the text string as expected, so that's great. But now, if I try to parse the received HTML, it goes wrong:
$html_content = file_get_contents('php://input');
$dom = new DOMDocument;
$dom->loadHTML($html_content);
When doing this, I get warnings like:
<b>Warning</b>: DOMDocument::loadHTML(): ID ghostery-no-tracker already defined in Entity, line: 506 in <b> my url </b> on line <b>25</b><br />
It's like it doesn't understand the HTML correctly.
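The warning text says an ID is defined more than once, which points at the captured page containing two elements with the same id rather than at the string being damaged in transit. One rough way to check this on the extension side before sending is sketched below. The helper name `findDuplicateIds` is hypothetical, and the regex scan is only an approximation of real HTML parsing.

```javascript
// Hypothetical helper: list id values that occur more than once in an HTML string.
// This is a regex approximation, not a real HTML parser; it only matches
// quoted id attributes such as id="foo" or id='foo'.
function findDuplicateIds(html) {
  const counts = new Map();
  // Require whitespace before "id" so attributes like data-id are skipped.
  const pattern = /\sid\s*=\s*(["'])(.*?)\1/gi;
  for (const match of html.matchAll(pattern)) {
    const id = match[2];
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  // Keep only the ids that were seen more than once.
  return [...counts].filter(([, n]) => n > 1).map(([id]) => id);
}
```

If this returns a non-empty list for the captured markup, the warnings would be describing the page itself and not a transmission problem.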
Any idea?