In my server.js file, I make an HTTP GET request that is supposed to return XML. When I log the response to the console, it is gibberish containing lots of question marks and black diamonds.

When I open the same URL in a browser, it automatically downloads a gzip file which, once unzipped, contains a legible XML file with the data (viewed in my text editor).

How do I get the XML in its correct form inside my server.js file? I need to make use of it in my program, not inside a text editor (obviously).

Here is my GET request:

axios.get('http://www2.jobs2careers.com/feed.php?id=1237-2595&c=1&pass=HeahE0W1ecAkkF0l')
  .then(function(response) {
    console.log(response.data);
  });

I've tried to extract the gzip file using the targz library as shown below:

axios.get('http://www2.jobs2careers.com/feed.php?id=1237-2595&c=1&pass=HeahE0W1ecAkkF0l')
  .then(function(response) {
    targz().extract(response.data, '/data', function(err){
      if (err) {
        console.log('Something is wrong ', err.stack);
      }
      console.log('Job done!');
    });
  });

I get an error in the console saying: "Path must be a string without null bytes". Should I be using the extract method from targz, or am I just using it incorrectly? I'm trying to "extract" (unzip) response.data.

Mjuice
  • The answer is in your question: the response isn't an XML file, it's a gzip file. You need a node module that can extract it. Here's one: https://www.npmjs.com/package/tar.gz – Chris G Nov 17 '16 at 23:27
  • You have also zlib https://nodejs.org/api/zlib.html#zlib_class_zlib_gzip – Emilio Grisolía Nov 17 '16 at 23:31
  • possible duplicate of: http://stackoverflow.com/questions/12148948/how-do-i-ungzip-decompress-a-nodejs-requests-module-gzip-response-body – jgozal Nov 17 '16 at 23:34
  • I tried using the targz node module, but I don't think I'm using it correctly. I updated the question with my latest code. – Mjuice Nov 17 '16 at 23:46

2 Answers

2

Based on the answers to "Simplest way to download and unzip files in Node.js cross-platform?":

var feedURL = 'http://www2.jobs2careers.com/feed.php?id=1237-2595&c=1&pass=HeahE0W1ecAkkF0l';

var request = require('request'),
    zlib = require('zlib'),
    fs = require('fs'),
    out = fs.createWriteStream('./feed.xml'); // the decompressed XML is written here

// Stream the gzip response through gunzip and into feed.xml.
request(feedURL).pipe(zlib.createGunzip()).pipe(out);
Chris G

From the updated code, it appears that the first parameter of extract (response.data here) needs to be a filesystem path to a gzip file, hence the null byte error. I would consider writing the response to the file system and then extracting it, or using another module that lets you extract directly from a URL.
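
As an alternative to writing to disk, the response can also be decompressed in memory with Node's built-in zlib module. The following is a minimal sketch rather than code from the question or the answers: gunzipToString and fetchGzippedXml are hypothetical names, axios is assumed to be installed, and the server is assumed to send the gzip file as the raw response body. Requesting responseType: 'arraybuffer' keeps axios from decoding the compressed bytes as text, which is what produces the gibberish in the console.

```javascript
const zlib = require('zlib');

// Decompress a gzip-compressed Buffer and decode the result as UTF-8 text.
function gunzipToString(compressed) {
  return zlib.gunzipSync(compressed).toString('utf8');
}

// Hypothetical helper: download a gzip-compressed feed and resolve to the XML text.
// Assumes axios is installed; it is required lazily so gunzipToString has no dependency on it.
async function fetchGzippedXml(url) {
  const axios = require('axios');
  // 'arraybuffer' asks axios for the raw bytes instead of a decoded string.
  const response = await axios.get(url, { responseType: 'arraybuffer' });
  return gunzipToString(Buffer.from(response.data));
}
```

Calling fetchGzippedXml(feedURL).then(function (xml) { console.log(xml); }) should then log the XML as text, provided the assumptions above hold for this feed.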

Once you get the XML out of the gzip file (you are on the right path, no pun intended), you can use a Node module such as xml2js, which parses the XML into a JavaScript object and makes it quite easy to work with.

  • So I figured out how to unzip the file and now it is saved in my root directory as feed.xml. I want to use xml2js to turn that XML into JSON. In the docs for xml2js, they give this example: var parseString = require('xml2js').parseString; var xml = "<root>Hello xml2js!</root>"; parseString(xml, function (err, result) { console.dir(result); }); The variable xml is set to a string. How do I set xml to the contents of my file called feed.xml? – Mjuice Nov 18 '16 at 00:27
  • I'm assuming you have the feed.xml file saved locally at this point. From here you can use [fs.readFileSync()](https://nodejs.org/api/fs.html#fs_fs_readfilesync_file_option) to read the file into your xml variable, then parse it as a string. – Brandon R Him Nov 18 '16 at 18:22
  • If you do decide to go with readFileSync() or readFile(), don't forget to specify the encoding (most likely utf8) as the second parameter; otherwise you'll get a Buffer. – Brandon R Him Nov 18 '16 at 18:26
  • When I tried readFile, I did specify encoding UTF8 but I still got a buffer anyway. I opened another question about that and no one could answer it either. It even got downvoted... haha – Mjuice Nov 18 '16 at 18:28
  • Shouldn't be the case, I've done this before with no issues. Do you have implementation code? – Brandon R Him Nov 18 '16 at 18:34
  • Actually I have it here: https://gist.github.com/MarcusHurney/bcbacbcc978985317c938c33afabfa83 – Mjuice Nov 18 '16 at 18:37
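
The readFileSync suggestion from the comments can be sketched as follows. This is an illustration under stated assumptions, not the code from the linked gist: readXml and parseFeed are hypothetical names, and xml2js is assumed to be installed. The encoding goes in the second argument of fs.readFileSync; when it is omitted, the call returns a Buffer instead of a string.

```javascript
const fs = require('fs');

// Read a file as text. Passing 'utf8' as the second argument makes
// readFileSync return a string; without an encoding it returns a Buffer.
function readXml(path) {
  return fs.readFileSync(path, 'utf8');
}

// Hypothetical helper: parse a saved XML file with xml2js (assumed installed).
// The callback receives (err, result), as in the xml2js parseString example.
function parseFeed(path, callback) {
  const parseString = require('xml2js').parseString; // required lazily
  parseString(readXml(path), callback);
}
```

For example, parseFeed('./feed.xml', function (err, result) { console.dir(result); }) would print the parsed object. If a Buffer has already been read, calling buffer.toString('utf8') on it gives the same string.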