96

I'm trying to create a proxy server that passes HTTP GET requests from a client to a third-party website (say, Google). My proxy just needs to mirror incoming requests to the corresponding path on the target site, so if my client's requested URL is:

127.0.0.1/images/srpr/logo11w.png

The following resource should be served:

http://www.google.com/images/srpr/logo11w.png

Here is what I came up with:

http.createServer(onRequest).listen(80);

function onRequest (client_req, client_res) {
    client_req.addListener("end", function() {
        var options = {
            hostname: 'www.google.com',
            port: 80,
            path: client_req.url,
            method: client_req.method,
            headers: client_req.headers
        };
        var req=http.request(options, function(res) {
            var body;
            res.on('data', function (chunk) {
                body += chunk;
            });
            res.on('end', function () {
                 client_res.writeHead(res.statusCode, res.headers);
                 client_res.end(body);
            });
        });
        req.end();
    });
}

It works well with HTML pages, but for other types of files it just returns a blank page or an error message from the target site (which varies from site to site).

Tarick Welling
Nasser Torabzade
  • 1
    Even though the answer uses `http`, an order of related modules from low to high abstraction are: `node`, `http`, `connect`, `express` taken from http://stackoverflow.com/questions/6040012/just-picking-up-node-should-i-use-express-or-really-learn-node-first – neaumusic Aug 08 '16 at 00:08

8 Answers

110

I don't think it's a good idea to buffer and process the response received from the third-party server. This only increases your proxy server's memory footprint. It is also the reason your code is not working.

Instead, try passing the response straight through to the client. Consider the following snippet:

var http = require('http');

http.createServer(onRequest).listen(3000);

function onRequest(client_req, client_res) {
  console.log('serve: ' + client_req.url);

  var options = {
    hostname: 'www.google.com',
    port: 80,
    path: client_req.url,
    method: client_req.method,
    headers: client_req.headers
  };

  var proxy = http.request(options, function (res) {
    client_res.writeHead(res.statusCode, res.headers)
    res.pipe(client_res, {
      end: true
    });
  });

  client_req.pipe(proxy, {
    end: true
  });
}
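
The comments on this answer mention `socket hang up` errors and target sites other than Google not responding. Here is a minimal sketch of two likely mitigations, not a tested fix for every site: replace the forwarded `Host` header with the target host, and listen for the upstream `error` event. The names `buildUpstreamOptions` and `createProxy` are hypothetical, and plain HTTP on port 80 is assumed, as in the snippet above.

```javascript
var http = require('http');

// Hypothetical helper: builds the options for the upstream request.
// The client's Host header (e.g. "127.0.0.1:3000") is replaced with the
// target host, because many servers route requests by Host.
function buildUpstreamOptions(clientReq, targetHost) {
  var headers = Object.assign({}, clientReq.headers, { host: targetHost });
  return {
    hostname: targetHost,
    port: 80, // assumption: plain HTTP, as in the snippet above
    path: clientReq.url,
    method: clientReq.method,
    headers: headers
  };
}

// Hypothetical wrapper: returns a proxy server without starting it.
function createProxy(targetHost) {
  return http.createServer(function (clientReq, clientRes) {
    var options = buildUpstreamOptions(clientReq, targetHost);
    var upstream = http.request(options, function (res) {
      clientRes.writeHead(res.statusCode, res.headers);
      res.pipe(clientRes);
    });

    // Without this listener, a failed upstream connection emits an
    // unhandled 'error' event and crashes the process.
    upstream.on('error', function (err) {
      if (!clientRes.headersSent) {
        clientRes.writeHead(502, { 'Content-Type': 'text/plain' });
      }
      clientRes.end('Bad gateway: ' + err.message);
    });

    clientReq.pipe(upstream);
  });
}
```

To try it, call `createProxy('www.google.com').listen(3000);`. Redirects and compression are still not handled here.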
vmx
  • 1
    thanks, but the thing is that I need to process and/or manipulate the response of the 3rd party server, and then pass it to my client. any idea how to implement that? – Nasser Torabzade Dec 04 '13 at 07:28
  • 4
    You will need to maintain the content-type headers in that case. HTML data works as you mentioned because content-type defaults to `text/html`, for images/pdfs or any other content, ensure you pass on correct headers. I will be able to offer more help if you share what modifications you apply to the responses. – vmx Dec 04 '13 at 07:57
  • I updated my question, I set status code and headers to whatever I get from remote server (line 17), it still is not working. and modifications include resizing images, adding some HTML to pages, etc. – Nasser Torabzade Dec 04 '13 at 08:20
  • Adding arbitrary HTML to the response should not be a problem, but how are you resizing image data? You could temporarily save the data received from the third-party server to a file, process it, and then send it to your client. I'm still curious to know what you are trying to process. – vmx Dec 04 '13 at 17:57
  • 6
    shouldn't you be using proxy module: https://github.com/nodejitsu/node-http-proxy ? – Maciej Jankowski May 09 '14 at 17:12
  • What would be causing `Error: socket hang up`? – dman Mar 04 '15 at 22:03
  • This happens when server never sends a response, please share your test code to help debug the issue. – vmx Mar 11 '15 at 05:44
  • I tested your code, but it only works for google website. If I change it to the other website like twitter, it doesn't work. – Ng2-Fun Jul 14 '16 at 18:25
  • No reason why it would not work with other websites. Can you share what you're trying to do, with a sample snippet? – vmx Jul 19 '16 at 12:39
  • 1
    Does anyone know how to keep the request headers? – Phil Aug 18 '17 at 16:05
  • This code also seems to strip the response headers too. Or am I missing something? – Phil Aug 18 '17 at 16:12
  • You could try adding response headers before `pipe()`, `client_res.writeHead(res.statusCode, res.headers);`. I have not tested it myself though! – vmx Aug 21 '17 at 09:19
  • `pipe()` is not loading the css and scripts from the remote page – mrid Jul 14 '18 at 12:43
  • 1
    nice but not quite right... if the remote server has a redirection, this code will not work – Zibri Nov 21 '18 at 09:12
  • One problem with this is that if one side is using compression data may arrive corrupted – tkiwi Feb 12 '19 at 21:16
  • there is a comma missing in line 22, after `client_req.method` – Manticore Feb 13 '19 at 08:58
  • 1
    Thanks @manticore for pointing out, I've updated. The answer obviously does not fit all, you will need to update it to handle redirection, compression, etc. YMMV – vmx Feb 13 '19 at 09:02
  • Need to handle error also : `proxy.on('error', function(err) { console.log("error",err), client_res.end() });` – alexnode Nov 15 '20 at 10:27
31

Here's an implementation using node-http-proxy from nodejitsu.

var http = require('http');
var httpProxy = require('http-proxy');
var proxy = httpProxy.createProxyServer({});

http.createServer(function(req, res) {
    proxy.web(req, res, { target: 'http://www.google.com' });
}).listen(3000);
Blubberguy22
bosgood
  • 4
    I think that node-http-proxy is primarily for reverse proxying..., From outside clients to internal servers running on local IPs and non-standard ports via the reverse node proxy which accepts connections on standard ports on a public IP address. – Sunny Sep 26 '15 at 17:17
  • @Samir Sure, that's one of the things you can do with it. It's pretty flexible. – bosgood Sep 27 '15 at 18:06
13

Here's a proxy server using the `request` module, which follows redirects. Use it by hitting your proxy URL: http://domain.com:3000/?url=[your_url]

var http = require('http');
var url = require('url');
var request = require('request');

http.createServer(onRequest).listen(3000);

function onRequest(req, res) {

    var queryData = url.parse(req.url, true).query;
    if (queryData.url) {
        request({
            url: queryData.url
        }).on('error', function(e) {
            // res.end() accepts a string or Buffer, not an Error object
            res.end(String(e));
        }).pipe(res);
    }
    else {
        res.end("no url found");
    }
}
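
Since this proxy fetches whatever `?url=` contains, it may be worth validating that parameter before requesting it. Here is a minimal sketch using the WHATWG `URL` class (Node's docs mark `url.parse` as legacy). The helper name `extractTarget` and the rule of allowing only `http:` and `https:` are assumptions for illustration, not part of the answer above.

```javascript
// Hypothetical helper: reads ?url= from the request path and validates it.
// Returns a URL object, or null if the parameter is missing or unusable.
function extractTarget(requestUrl) {
  // req.url is path-only, so a base is required to parse it;
  // the base value itself is arbitrary and never used as a target.
  var raw = new URL(requestUrl, 'http://localhost').searchParams.get('url');
  if (!raw) {
    return null;
  }
  try {
    var target = new URL(raw);
    var allowed = target.protocol === 'http:' || target.protocol === 'https:';
    return allowed ? target : null;
  } catch (e) {
    return null; // not an absolute URL
  }
}
```

Inside `onRequest`, `var target = extractTarget(req.url);` could replace the `url.parse` call, with `target.href` passed on as the URL to fetch when `target` is not null.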
Henry
  • 3
    Hi henry, how to add headers for the request? – KCN Nov 11 '17 at 06:26
  • The line, `res.end(e);` will cause a `TypeError [ERR_INVALID_ARG_TYPE]: The "chunk" argument must be of type string or an instance of Buffer. Received an instance of Error` – Niel de Wet Jun 19 '20 at 12:10
7

Super simple and readable: here's how to create a local proxy server in front of a local HTTP server with just Node.js (tested on v8.1.0). I've found it particularly useful for integration testing, so here's my share:

/**
 * Once this is running open your browser and hit http://localhost
 * You'll see that the request hits the proxy and you get the HTML back
 */

'use strict';

const net = require('net');
const http = require('http');

const PROXY_PORT = 80;
const HTTP_SERVER_PORT = 8080;

let proxy = net.createServer(socket => {
    socket.on('data', message => {
        console.log('---PROXY- got message', message.toString());

        let serviceSocket = new net.Socket();

        serviceSocket.connect(HTTP_SERVER_PORT, 'localhost', () => {
            console.log('---PROXY- Sending message to server');
            serviceSocket.write(message);
        });

        serviceSocket.on('data', data => {
            console.log('---PROXY- Receiving message from server', data.toString());
            socket.write(data);
        });
    });
});

let httpServer = http.createServer((req, res) => {
    switch (req.url) {
        case '/':
            res.writeHead(200, {'Content-Type': 'text/html'});
            res.end('<html><body><p>Ciao!</p></body></html>');
            break;
        default:
            res.writeHead(404, {'Content-Type': 'text/plain'});
            res.end('404 Not Found');
    }
});

proxy.listen(PROXY_PORT);
httpServer.listen(HTTP_SERVER_PORT);

https://gist.github.com/fracasula/d15ae925835c636a5672311ef584b999

Francesco Casula
6

Here's a more optimized version of Mike's answer that picks up the website's Content-Type properly, supports POST and GET requests, and uses your browser's User-Agent so websites can identify your proxy as a browser. You can set the target simply by changing the `url` variable; the code then selects HTTP or HTTPS and the matching port automatically.

var express = require('express')
var app = express()
var https = require('https');
var http = require('http');


app.use('/', function(clientRequest, clientResponse) {
    var url;
    url = 'https://www.google.com'
    var parsedHost = url.split('/').splice(2).splice(0, 1).join('/')
    var parsedPort;
    var parsedSSL;
    if (url.startsWith('https://')) {
        parsedPort = 443
        parsedSSL = https
    } else if (url.startsWith('http://')) {
        parsedPort = 80
        parsedSSL = http
    }
    var options = { 
      hostname: parsedHost,
      port: parsedPort,
      path: clientRequest.url,
      method: clientRequest.method,
      headers: {
        'User-Agent': clientRequest.headers['user-agent']
      }
    };  
  
    var serverRequest = parsedSSL.request(options, function(serverResponse) { 
      var body = '';   
      if (String(serverResponse.headers['content-type']).indexOf('text/html') !== -1) {
        serverResponse.on('data', function(chunk) {
          body += chunk;
        }); 
  
        serverResponse.on('end', function() {
          // Make changes to HTML files when they're done being read.
          body = body.replace(`example`, `Cat!` );
  
          clientResponse.writeHead(serverResponse.statusCode, serverResponse.headers);
          clientResponse.end(body);
        }); 
      }   
      else {
        serverResponse.pipe(clientResponse, {
          end: true
        }); 
        clientResponse.contentType(serverResponse.headers['content-type'])
      }   
    }); 
  
    serverRequest.end();
  });    


  app.listen(3000)
  console.log('Running on 0.0.0.0:3000')
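
The host, port, and protocol selection above can also be derived with the built-in `URL` class instead of string splitting. This is a minimal sketch under that assumption; `resolveTarget` is a hypothetical helper name, and it also honors an explicit port in the URL, which the code above does not.

```javascript
var http = require('http');
var https = require('https');

// Hypothetical helper: derives hostname, port and client module from a URL.
function resolveTarget(targetUrl) {
  var parsed = new URL(targetUrl);
  var isHttps = parsed.protocol === 'https:';
  if (!isHttps && parsed.protocol !== 'http:') {
    throw new Error('Unsupported protocol: ' + parsed.protocol);
  }
  return {
    hostname: parsed.hostname,
    // parsed.port is an empty string when the URL uses the default port
    port: parsed.port ? Number(parsed.port) : (isHttps ? 443 : 80),
    client: isHttps ? https : http
  };
}
```

With this, `var target = resolveTarget(url);` gives `target.hostname` and `target.port` for the options object, and `target.client.request(options, ...)` takes the place of `parsedSSL.request(options, ...)`.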


jasoncornwall
  • Struggling with all kind of errors using the proxy libraries. This above solution works, also for handling a proxy scenario where you need to pass a host name different than the address. No need to use the SNICallback. var options = { hostname: address, port: parsedPort, path: clientRequest.url, method: clientRequest.method, headers: { 'User-Agent': clientRequest.headers['user-agent'], host : parsedHost } }; – Gil Roitto Feb 14 '21 at 06:33
  • Thats amazing, I made a Node.js web proxy for my web filter bypassing website. https://incog.dev/web (Alloy option). :) – jasoncornwall Feb 15 '21 at 23:20
5

Your code doesn't work for binary files because they can't be cast to strings in the data event handler. If you need to manipulate binary files you'll need to use a buffer. Sorry, I do not have an example of using a buffer because in my case I needed to manipulate HTML files. I just check the content type and then for text/html files update them as needed:

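// Assumes `app` is an Express app, `http` is require('http'), and `host` and
// `port` (used below) hold this proxy's own address; none are defined in this snippet.
app.get('/*', function(clientRequest, clientResponse) {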
  var options = { 
    hostname: 'google.com',
    port: 80, 
    path: clientRequest.url,
    method: 'GET'
  };  

  var googleRequest = http.request(options, function(googleResponse) { 
    var body = ''; 

    if (String(googleResponse.headers['content-type']).indexOf('text/html') !== -1) {
      googleResponse.on('data', function(chunk) {
        body += chunk;
      }); 

      googleResponse.on('end', function() {
        // Make changes to HTML files when they're done being read.
        body = body.replace(/google\.com/gi, host + ':' + port);
        body = body.replace(
          /<\/body>/, 
          '<script src="http://localhost:3000/new-script.js" type="text/javascript"></script></body>'
        );

        clientResponse.writeHead(googleResponse.statusCode, googleResponse.headers);
        clientResponse.end(body);
      }); 
    }   
    else {
      googleResponse.pipe(clientResponse, {
        end: true
      }); 
    }   
  }); 

  googleRequest.end();
});    
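
The buffer approach mentioned above could look roughly like this. It is a sketch, not tested against real sites: chunks are kept as raw bytes and joined with `Buffer.concat`, so binary data is not mangled the way `body += chunk` mangles it by converting each chunk to a UTF-8 string. The helper names `collectBody` and `headersForBody` are hypothetical. The second helper covers a related detail: once the body is modified, the upstream `Content-Length` no longer matches. It assumes the upstream response is not compressed.

```javascript
// Hypothetical helper: gathers a response stream into a single Buffer.
function collectBody(res, callback) {
  var chunks = [];
  res.on('data', function (chunk) {
    // Keep raw bytes; never concatenate chunks onto a string.
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  });
  res.on('end', function () {
    callback(Buffer.concat(chunks));
  });
}

// Hypothetical helper: copies the upstream headers and sets Content-Length
// to the byte length of the (possibly modified) body.
// Assumption: the body is not compressed (no Content-Encoding to undo).
function headersForBody(upstreamHeaders, body) {
  var headers = Object.assign({}, upstreamHeaders);
  delete headers['transfer-encoding'];
  headers['content-length'] = Buffer.byteLength(body);
  return headers;
}
```

In the handler above, `collectBody(googleResponse, function (buffer) { ... })` would stand in for the two event listeners, with `clientResponse.writeHead(googleResponse.statusCode, headersForBody(googleResponse.headers, body))` sent before `clientResponse.end(body)`.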
Mike Dilorenzo
1

I just wrote a proxy in Node.js that handles HTTPS, with optional decoding of the messages. The proxy can also add a proxy-authentication header in order to go through a corporate proxy. You pass the URL of the proxy.pac file as an argument to configure the use of the corporate proxy.

https://github.com/luckyrantanplan/proxy-to-proxy-https

0

Here is one that I made:

var http = require("http")
var Unblocker = require("unblocker")
var unblocker = Unblocker({})
http.createServer(function(req,res){
  unblocker(req,res,function(err){
    var headers = {"content-type": "text/html"}
    if(err){
      res.writeHead(500, headers)
      return res.end(err.stack || err)
    }
    if(req.url == "/"){
      res.writeHead(200, headers)
      return res.end(
        `
        <title>Seventh Grade by Gary Soto</title>
        <embed src="https://www.cforks.org/Downloads/7.pdf" width="1500" height="1500"/>
        `
      )
    }else{
      res.writeHead(404, headers)
      return res.end("ERROR 404: File Not Found.");
    }
  })
})
.listen(8080)


Bob