The level of automation required depends on how complex your sign-in flow is and how the underlying system is built.
Do it via an API
First, don't rely on screen scraping for anything. It's brittle and prone to failure: when the underlying application is updated, no one thinks about the screen scrapers, and things break. If there's a REST API or some other type of RPC (Remote Procedure Call) available, use that instead. If there isn't, ask for an API. Only after that should you resort to screen scraping.
Low level HTTP requests
You might be able to emulate the HTTP requests without emulating a browser completely. Complete the requests in a browser first with the Network Monitor in your Developer Tools open, and find the minimal number of requests you need. Sometimes this is just a POST to /login with username and password fields. Sometimes you will need to store a cookie and then request the required page with your user session.
Use needle, or the more common but more heavyweight request.
Headless Browsers
Headless browsers are the first step into UI automation and free you from worrying about what the backend HTTP requests do. You tell the API to fill in the login and password fields and submit the form. A headless browser does the background work for you, like cookies and redirects, and returns a rendered web page.
Use Zombie.js, PhantomJS, or CasperJS.
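A sketch of what this looks like with Zombie.js. The URL, field names, and button label are placeholders for your own application's, and the script assumes the zombie package is installed (`npm install zombie`):

```javascript
const Browser = require('zombie');
const browser = new Browser();

// Load the login page, fill in the form, and submit it. Zombie.js
// handles cookies, redirects, and page rendering behind the scenes.
browser.visit('https://example.com/login', () => {
  browser
    .fill('username', 'alice')     // placeholder field name
    .fill('password', 's3cret')    // placeholder field name
    .pressButton('Sign in', () => {
      browser.assert.success();    // throws unless the response was 2xx
      // At this point the browser holds the session cookie, so any
      // further browser.visit() calls are made as the logged-in user.
      console.log(browser.location.href);
    });
});
```

The same flow looks much the same in PhantomJS or CasperJS; the win over raw HTTP requests is that you never have to discover what the form actually POSTs.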
Full Browser Automation
More complex web site automation sometimes requires a full browser to work correctly. This is usually the case when you are relying on heavily JavaScript-rendered web pages or more advanced user interaction.
WebDriver is a standard API for controlling a browser. A WebDriver client is a language-specific implementation of that API that can talk to a WebDriver server. The WebDriver server launches a full browser instance and converts the API calls into actual browser actions.
Webdriver.io and the Selenium Standalone Server will cover most of what you need.
Internet Explorer has a native server available, and Chrome releases its own native WebDriver server (ChromeDriver) too.
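A rough sketch of the same login flow driven through Webdriver.io. This assumes a Selenium Standalone Server already running locally on its default port (4444) with the webdriverio package installed; the URL and CSS selectors are placeholders for your own application's:

```javascript
const webdriverio = require('webdriverio');

// Connect to the local Selenium Standalone Server and ask it for a
// real Chrome instance. Every chained call below becomes an actual
// action in that browser, including any JavaScript the page runs.
const client = webdriverio.remote({
  desiredCapabilities: { browserName: 'chrome' },
});

client
  .init()                              // launch the browser session
  .url('https://example.com/login')    // placeholder URL
  .setValue('#username', 'alice')      // placeholder selectors
  .setValue('#password', 's3cret')
  .click('#sign-in')
  .getTitle()
  .then((title) => console.log('Logged-in page title:', title))
  .end();                              // close the browser session
```

Because a real browser is doing the work, this handles JavaScript-heavy pages that would defeat the headless options above; the trade-off is the extra moving part of running and maintaining a WebDriver server.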