
I am trying to use Headless Chrome to generate a PDF file from a complex HTML file (it contains images, SVGs, etc.). I am able to use wkhtmltopdf.exe on a Cloud Service (Windows) to generate simple PDF files, but I really need Chrome to produce PDFs that match the HTML + SVG + images as closely as possible.

I was hoping to be able to run Headless Chrome in Azure Cloud Service or Azure Functions, but I cannot get it to work. I suppose this is due to restrictions on GDI. I was able to run my code and Headless Chrome in the Azure Emulator on my own machine, but once it is deployed nothing works.

Below is the code I am currently running in Azure Functions (for Windows). I am using Puppeteer to take a screenshot of example.com. If I can get this to work, I suppose that generating a PDF will become easy.

const fs = require('fs');
const path = require('path');
const puppeteer = require('puppeteer');
const os = require('os');

module.exports = function (context, req) {
    function failureCallback(error) {
        context.log("--> Failure = '" + error + "'");
    }

    const chromeDir = path.normalize(__dirname + "/../node_modules/puppeteer/.local-chromium/win64-508693/chrome-win32/chrome.exe");
    context.log("--> Chrome Path = " + chromeDir);

    const dir = path.join(os.tmpdir(), '/screenshots');

    if (!fs.existsSync(dir)){
        fs.mkdirSync(dir);
    }

    const screenshotPath = path.join(dir, "example.png");
    context.log("--> Path = " + screenshotPath);

    let browser, page;
    puppeteer.launch({ executablePath: chromeDir, headless: true, args: [ '--no-sandbox', '--single-process', '--disable-gpu' ] })
        .then(b => {
            context.log("----> 1");
            browser = b;
            return browser.newPage();
        }, failureCallback)
        .then(p => {
            context.log("----> 2");
            page = p;
            return p.goto('https://www.example.com');
        }, failureCallback)
        .then(response => {
            context.log("----> 3");
            return page.screenshot({path: screenshotPath, fullPage: true});  
        }, failureCallback)
        .then(r => {
            browser.close();

            context.res = {
                body: "Done!"
            };

            context.done();            
        }, failureCallback);
};

Below is the log when trying to execute the script.

2017-12-18T04:32:05  Welcome, you are now connected to log-streaming service.
2017-12-18T04:33:05  No new trace in the past 1 min(s).
2017-12-18T04:33:11.400 Function started (Id=89b31468-8a5d-43cd-832f-b641216dffc0)
2017-12-18T04:33:20.578 JavaScript HTTP trigger function processed a request.
2017-12-18T04:33:20.578 --> Chrome Path D:\home\site\wwwroot\node_modules\puppeteer\.local-chromium\win64-508693\chrome-win32\chrome.exe
2017-12-18T04:33:20.578 --> Path = D:\local\Temp\screenshots\example.png
2017-12-18T04:33:20.965 --> Failure = 'Error: spawn UNKNOWN'
2017-12-18T04:33:20.965 ----> 2

The error "Failure = 'Error: spawn UNKNOWN'" is not clear. I verified with Kudu and PowerShell that the path I am using is correct.
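
A side note on the log above: "----> 2" appears after the failure because each `.then(onSuccess, failureCallback)` handler returns `undefined`, which puts the chain back on the success path, so the next step runs with an undefined argument and `context.done()` may never be reached. A minimal sketch of the same chain with a single `.catch` is shown below. `failingLaunch` and `runSteps` are hypothetical names used only for illustration, and the stand-in launch simply rejects the way the log reports. This does not fix the spawn error itself; it only stops the later steps from running after a failure.

```javascript
// Stand-in for puppeteer.launch() failing as in the log (assumption for illustration).
function failingLaunch() {
    return Promise.reject(new Error('spawn UNKNOWN'));
}

// Runs the steps in order; any rejection skips the remaining steps
// and lands in the single catch handler.
function runSteps(context, launch) {
    return launch()
        .then(b => {
            context.log("----> 1");
            return b.newPage();
        })
        .then(p => {
            context.log("----> 2");
            return p;
        })
        .catch(error => {
            context.log("--> Failure = '" + error + "'");
            context.res = { status: 500, body: String(error) };
        })
        // Always signal completion, on success and on failure.
        .then(() => context.done());
}
```

With this shape, a failed launch logs the failure once and the function still completes.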

I am looking for a way to run Chrome on Azure Cloud Service and/or Azure Functions (for Windows, in order to use my existing App Service plan). Has anybody else attempted to run Headless Chrome in Azure? I am open to any ideas that would help me get this script to work.

Martin
  • You may want to check these suggestions, in case you haven't already, and see if they help: https://social.msdn.microsoft.com/Forums/azure/en-US/883e5980-35ad-400d-b1f7-3fbf428ac39f/access-to-headless-chrome-in-the-azure-functions-environment?forum=AzureFunctions and https://stackoverflow.com/questions/47265315/running-headless-chrome-in-an-microsoft-azure-web-app – AshokPeddakotla-MSFT Dec 14 '17 at 05:07
  • @Ashok - I looked at those two links and neither of them provides a solution or an idea of how to make this happen in Azure Cloud Service or Azure Functions. :( – Martin Dec 14 '17 at 23:15
  • Cloud Services (with Roles) have no GDI restrictions like App Service has. It should work just fine. Double-check your paths, and enable RDP and remote into the worker if that helps with debugging. – evilSnobu Dec 22 '17 at 00:21
  • Or just try to run Azure Functions inside a container. – kamil-mrzyglod Feb 17 '19 at 08:08

3 Answers


I would recommend using https://www.browserless.io/ so you don't have to run chrome.exe in the App Service.

Replace puppeteer.launch with puppeteer.connect:

const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://chrome.browserless.io/'
});
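
For context, here is a minimal sketch of how the question's screenshot steps might look when they go through a remote browser instead of a local chrome.exe. The helper name `screenshotViaRemoteChrome` is hypothetical, and `puppeteerLib` is assumed to be whatever `require('puppeteer')` returns; it is passed in as a parameter only so the function can be tried with a stub. Whether the endpoint needs a token or other query parameters depends on the service and is not covered here.

```javascript
// Sketch: take the screenshot through a remote Chrome over WebSocket.
// Assumes puppeteerLib exposes connect(), as the puppeteer package does.
async function screenshotViaRemoteChrome(puppeteerLib, wsEndpoint, url) {
    const browser = await puppeteerLib.connect({ browserWSEndpoint: wsEndpoint });
    try {
        const page = await browser.newPage();
        await page.goto(url);
        // Without a `path` option, screenshot() resolves with the image bytes.
        return await page.screenshot({ fullPage: true });
    } finally {
        // Runs on success and on failure, so the remote session is released.
        await browser.close();
    }
}
```

Inside the function handler this could be called as `screenshotViaRemoteChrome(require('puppeteer'), 'wss://chrome.browserless.io/', 'https://www.example.com')`, with the result written to the response or to the temp directory as in the question.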

MVafa
  • Another alternative to BrowserLess is https://headlesstesting.com - provides Chrome and other browsers, compatible with Puppeteer and Playwright – Jochen Mar 09 '20 at 15:26

I'm not sure about the usage of Headless Chrome, but the sandbox that Azure Functions runs in has problems generating PDFs from HTML due to some GDI restrictions.

Consider trying your task in Azure Functions on Linux. While this is still in preview, it does not utilize a sandbox, so if you can get Headless Chrome working on it, you may have more luck with the PDF generation.
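
If Headless Chrome does start in that environment, the PDF step itself is short. The sketch below is an assumption-laden illustration, not a tested Azure setup: `htmlToPdf` is a hypothetical helper, `puppeteerLib` stands for the result of `require('puppeteer')` and is a parameter only so the function can be tried with a stub, and the `waitUntil` option on `setContent` assumes a reasonably recent Puppeteer version. The `--no-sandbox` flags refer to Chrome's own process sandbox, which often has to be disabled in containers.

```javascript
// Sketch: render an HTML string (with inline SVG, images, etc.) to a PDF buffer.
async function htmlToPdf(puppeteerLib, html) {
    const browser = await puppeteerLib.launch({
        headless: true,
        args: ['--no-sandbox', '--disable-setuid-sandbox']
    });
    try {
        const page = await browser.newPage();
        // Wait until the network is idle so referenced images have loaded.
        await page.setContent(html, { waitUntil: 'networkidle0' });
        // printBackground keeps CSS backgrounds that print mode drops by default.
        return await page.pdf({ format: 'A4', printBackground: true });
    } finally {
        await browser.close();
    }
}
```

The returned buffer can then be written to a file or sent back as the HTTP response body.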

Connor McMahon
  • Hello Connor - I read about Azure Functions on Linux. My current web app is running on Cloud Service on Windows. I find it unfortunate to have to create a new service plan simply to generate a PDF file. That would be an expensive file generator at $50+ a month. – Martin Dec 18 '17 at 02:59
  • Martin, that is very fair. I believe that there is work being done to allow Functions on Linux to work on Consumption plans, which would hopefully eliminate the cost concern. Unfortunately, I don't have a date for that. – Connor McMahon Dec 18 '17 at 17:07

Azure allows Node.js:

You can do it in Node.js using Phantom (instead of Chrome, since you won't have access to any browsers, nor will you be able to run them on Azure Web Apps). See the example below; it is hosted on Google Firebase, but you can easily apply it to your Node.js project:

https://stackoverflow.com/a/51828577/6306638

An IIS server on an Azure VM is your only alternative if you NEED Chrome.

Let me know if you need any help with this!

Sagar Patel