23

Basically what I need is a way to automate the following operations:

  1. open a new tab;

  2. open the Network tab in the developer tools;

  3. load a URL;

  4. select "Save All as HAR".

Proposed solutions often involve PhantomJS, browsermob-proxy, or pcap2har; those won't fit my case since I need to work with SPDY traffic.

I tried to dive into the Google Chrome Extensions API and did manage to automate some tasks, but I still had no luck generating HAR files. Now this method (getHAR) looks particularly promising, but I still can't figure out how I would use it.

In other words, I need something like this experiment from the Google guys. Note the following:

We used Chrome's remote debugging interface with a custom client that starts up the browser on the phone, clears its cache and other state, initiates a web page load, and receives the Chrome developer tools messages to determine the page load times and other performance metrics.

Any ideas?


Solution

For the curious, I ended up with a Node.js module that automates this kind of test: chrome-har-capturer. This also gave me the opportunity to dig deeper into the Remote Debugging Protocol and to write a lower-level Node.js interface for general-purpose Chrome automation: chrome-remote-interface.

cYrus

2 Answers

11

The short answer is that there is no way to get at the data you are after directly. The getHAR method is only applicable to extensions meant to extend DevTools itself. The good news is that you can construct the HAR file yourself without too much trouble - this is exactly what PhantomJS does.

  1. Start Chrome with remote debugging
  2. Connect to Chrome on the debugging port with a websocket connection
  3. Enable "Network" debugging; you can also clear the cache, etc. - see the Network API.
  4. Tell the browser to navigate to the page you want to capture, and Chrome will stream all the request meta-data back to you.
  5. Massage the network data into HAR format, à la PhantomJS
  6. ...
  7. Profit.
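
As a rough illustration of steps 2-4, here is a minimal Python sketch of the JSON frames a client would send over the WebSocket. It assumes Chrome was started with `--remote-debugging-port=9222` and that the tab's WebSocket URL was read from the `webSocketDebuggerUrl` field at `http://localhost:9222/json`. The helper names `capture_commands` and `is_event` are made up for this example, and the WebSocket transport itself is left out (any client library would do).

```python
import json


def capture_commands(url, clear_cache=False):
    """Build the command frames for steps 3-4 as JSON strings.

    Each command needs a unique integer "id"; Chrome echoes it back
    in the matching reply.
    """
    methods = [("Network.enable", {})]
    if clear_cache:
        # Optional: start from a cold cache, as mentioned in step 3.
        methods.append(("Network.clearBrowserCache", {}))
    methods.append(("Page.navigate", {"url": url}))
    return [
        json.dumps({"id": i, "method": method, "params": params})
        for i, (method, params) in enumerate(methods, start=1)
    ]


def is_event(frame):
    """Tell events apart from command replies.

    Events (e.g. Network.requestWillBeSent) carry a "method" and no "id";
    replies to commands carry the "id" of the command they answer.
    """
    message = json.loads(frame)
    return "method" in message and "id" not in message
```

Sending these frames in order and collecting every incoming frame for which `is_event` is true gives the raw material for step 5.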

For a head start, I have a post with sample Ruby code that should get you started with steps 1-4: http://www.igvita.com/2012/04/09/driving-google-chrome-via-websocket-api/

igrigorik
  • Quite funny that a _real Google guy_ answered my question! :) In these days I got a glimpse of the topics you mentioned, and finally you provided me the glue. Let me try it out, I'll be back soon with some feedback... – cYrus Nov 24 '12 at 12:51
  • Ok, this is definitely the way to go. Now I have a Node.js script that launches Chrome, starts tcpdump, opens a WebSocket, does some setup, loads a URL and finally dumps the HAR file... I followed the PhantomJS example you provided, the HAR spec and the Network API, but there's still something about the timings that I don't get, in particular: how can I fill the `timings.receive` field? From what I've understood: a request starts at `requestTime`; waiting for the server starts at `requestTime + sendEnd` and ends at `Network.responseReceived` (`timestamp`). Is this correct? – cYrus Nov 25 '12 at 13:50
  • Or maybe the whole response is received at `Network.responseReceived` (`timestamp`)? – cYrus Nov 25 '12 at 14:10
  • responseReceived should contain all the timing data within (as part of NavTiming). You definitely don't want to be in the business of calculating these times yourself. – igrigorik Nov 26 '12 at 22:25
  • I hope so, but all I have is this [ResourceTiming](https://developers.google.com/chrome-developer-tools/docs/protocol/1.0/network#type-ResourceTiming) object and I can't figure out how to determine when a response is fully received; there's no field like `responseEnd`. – cYrus Nov 27 '12 at 00:19
  • It looks like you might want to also listen for the Network.loadingFinished event. If you filter for Network.responseReceived, then you will get the notification when the header is available, and you can save the request ID + other meta-data about it, and then correlate it with loadingFinished to get final timestamp. – igrigorik Nov 27 '12 at 08:31
  • Great, everything should be fine now. Thanks for your help! – cYrus Nov 28 '12 at 22:56
  • Glad to hear. If you have your code up anywhere, would love to take a look! – igrigorik Nov 29 '12 at 19:32
  • Thanks a lot for this solution. I'm also using "chrome-har-capturer" for this. But I have one question: is it possible to do the same on a server (not a desktop computer) where I only have console access? I mean, does Chrome depend on actually displaying a browser window on a desktop, or can it work from the console too? Hope you understand me :) – Levsha Apr 29 '16 at 03:50
  • For those that just read comments, @cYrus put the code he came up with [on github as chrome-har-capturer](https://github.com/cyrus-and/chrome-har-capturer), which he mentions at the bottom of his question now – Brad Parks Sep 21 '17 at 23:35
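
The correlation suggested in the comments above (remember each request at `Network.responseReceived`, then match it with `Network.loadingFinished` by request ID) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: `correlate` and `receive_ms` are hypothetical helper names, events are assumed to be already decoded into dicts, event `timestamp` values are assumed to be in seconds, and treating the gap between the two events as `timings.receive` is an approximation rather than something the protocol guarantees.

```python
def correlate(events):
    """Group Network.* events by requestId.

    `events` is an iterable of decoded protocol messages, each a dict
    with "method" and "params". Returns one record per request that
    reached both responseReceived and loadingFinished.
    """
    pending = {}
    finished = []
    for event in events:
        method, params = event["method"], event["params"]
        request_id = params.get("requestId")
        if method == "Network.requestWillBeSent":
            pending[request_id] = {
                "url": params["request"]["url"],
                "started": params["timestamp"],
            }
        elif request_id not in pending:
            continue  # unrelated event, or a request we never saw start
        elif method == "Network.responseReceived":
            # Headers are available at this point; the body may still be loading.
            pending[request_id]["headers_at"] = params["timestamp"]
            pending[request_id]["status"] = params["response"]["status"]
        elif method == "Network.loadingFinished":
            record = pending.pop(request_id)
            record["finished"] = params["timestamp"]
            if "headers_at" in record:
                finished.append(record)
    return finished


def receive_ms(record):
    """Approximate HAR timings.receive in milliseconds (assumes seconds in)."""
    return (record["finished"] - record["headers_at"]) * 1000.0
```

Each returned record can then be turned into one HAR entry, with the remaining timing fields taken from the timing object attached to the response.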
2

By now there's a browser plugin to do that: https://github.com/devtools-html/har-export-trigger

It uses the WebExtensions DevTools API and I got it to work with both Firefox and Chrome.

See my code for Chrome here: https://github.com/theri/web-measurement-tools/blob/master/load/load_url_using_chrome.py#L175

Automatically installing the plugin in Chrome is a bit more complicated than in Firefox, but feasible: I extracted the plugin archive locally and then linked to it in chrome_prefs.json (see the same repository).