it doesnt work either in PHP/Python
That is (as others have already pointed out) because you were using your browser's existing cookie session,
which had already solved the captcha. Clear your browser cookies, get a fresh cookie session, and DO NOT SOLVE THE CAPTCHA, and Postman won't be able to log in either.
Any idea what is missing ?
Several things. Among them, several POST parameters of the login request are missing (browserFlag, loginType, __checkbox_dscBasedLoginFlag, and many more).
Also, your encoding loop here is bugged:
$str = $str . "$key=$value" . "&";
It pretty much only works as long as both keys and values contain only [a-zA-Z0-9] characters, and since your userNamedenc contains a grave accent character, your encoding loop is insufficient.
A fixed loop would be:
foreach($params as $key=>$value){
$str = $str . urlencode($key)."=".urlencode($value) . "&";
}
$str=substr($str,0,-1);
But this is exactly why we have the http_build_query function: that entire loop and the following trim can be replaced with this single line:
$str=http_build_query($params);
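To see the difference concretely, here is a small sketch comparing the naive concatenation with http_build_query. The userNamedenc value is the one used in this answer's script; the "note" parameter and its value are made up purely to show what happens when a value contains & and =.

```php
<?php
declare(strict_types=1);

$params = [
    'userNamedenc' => 'hGJfsdnk`1t',
    'note'         => 'a&b=c d', // made-up value containing reserved characters
];

// Naive concatenation, as in the question: nothing is escaped.
$naive = '';
foreach ($params as $key => $value) {
    $naive .= "$key=$value&";
}
$naive = substr($naive, 0, -1);

// Proper encoding.
$encoded = http_build_query($params);

// Decode both strings the way a server would, and compare with the input.
parse_str($naive, $fromNaive);
parse_str($encoded, $fromEncoded);

echo $naive, PHP_EOL;
echo $encoded, PHP_EOL; // userNamedenc=hGJfsdnk%601t&note=a%26b%3Dc+d
var_dump($fromNaive === $params);   // bool(false): "note" was split at the raw & and =
var_dump($fromEncoded === $params); // bool(true)
```

The naive string decodes into three parameters instead of two, which is why it only survives values made of plain alphanumerics.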
Also, it seems you're trying to log in without a pre-existing cookie session. That's not going to work. When you do a GET request to the login page, you get a cookie and a unique captcha.
The captcha answer is tied to your cookie session and needs to be solved before you attempt to log in, and you provide no code to deal with the captcha.
Also, when parsing the login form, you must replicate in PHP what the page's javascript does before submitting:
the "userName" input element defaults to "Enter Username", which is emptied with javascript and replaced with userNamedenc;
the input element named "dscBasedLoginFlag" is removed with javascript;
the input element named "Cert" has a default value, but this value is cleared with javascript;
and the input element named "newUserRegistration" is removed with javascript.
Here's what you should do: make a GET request to the login page, save the cookie session, and make sure to provide it for all further requests. Parse out all the login form's elements and add them to your login request (but be careful: there are 2 forms with inputs on the page, and 1 belongs to the search bar, so only parse the children of the login form and don't mix the 2), and remember to clear/remove the special input tags to emulate the javascript, as described above.
Then make a GET request to the captcha url, again providing the session cookie, and solve the captcha.
Then make the final login request, with the captcha answer, userNamedenc, passwordenc, and all the other elements parsed out from the login page. That should work.
Now, solving the captcha programmatically: the captcha doesn't look too hard, and cracking it can probably be automated, but until someone actually does that, you can use Deathbycaptcha to do it for you. Note, however, that it isn't a free service.
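The form-parsing step can be tried offline before touching the real site. The following is a minimal sketch under loud assumptions: the HTML is a made-up, heavily simplified stand-in for the login page (only the input names mentioned in this answer are real, the values and the "search"/"q" names are invented), and it uses plain DOMXPath rather than a helper function.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the login page; the real markup differs.
$html = <<<'HTML'
<html><body>
<form id="search"><input name="q" value="search me"></form>
<form id="login">
  <input name="userName" value="Enter Username">
  <input name="userNamedenc" value="">
  <input name="passwordenc" value="">
  <input name="dscBasedLoginFlag" value="N">
  <input name="Cert" value="some default">
  <input name="newUserRegistration" value="x">
  <input name="browserFlag" value="false">
</form>
</body></html>
HTML;

$domd = new DOMDocument();
@$domd->loadHTML($html); // @ hides warnings about sloppy real-world HTML
$xp = new DOMXPath($domd);

// Only inputs that are descendants of the login form, never the search bar's.
$params = [];
foreach ($xp->query('//form[@id="login"]//input[@name]') as $input) {
    $params[$input->getAttribute('name')] = $input->getAttribute('value');
}

// Emulate what the page's javascript does before submitting.
$params['userName'] = '';                  // "Enter Username" is emptied
unset($params['dscBasedLoginFlag']);       // removed
$params['Cert'] = '';                      // cleared
unset($params['newUserRegistration']);     // removed

// Credentials, as in the question.
$params['userNamedenc'] = 'hGJfsdnk`1t';
$params['passwordenc'] = '675894242fa9c66939d9fcf4d5c39d1830f4ddb9';

echo http_build_query($params), PHP_EOL;
```

Restricting the XPath to the login form is what keeps the search bar's input out of the request; everything left in $params after the cleanup is what gets posted.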
Here's a fully tested, working example implementation, using my hhb_curl library (from https://github.com/divinity76/hhb_.inc.php/blob/master/hhb_.inc.php ) and the Deathbycaptcha api:
<?php
declare(strict_types = 1);
header ( "content-type: text/plain;charset=utf-8" );
require_once ('hhb_.inc.php');
const DEATHBYCATPCHA_USERNAME = '?';
const DEATHBYCAPTCHA_PASSWORD = '?';
$hc = new hhb_curl ( '', true );
$hc->setopt(CURLOPT_TIMEOUT,20);// im on a really slow net atm :(
$html = $hc->exec ( 'http://www.mca.gov.in/mcafoportal/login.do' )->getResponseBody (); // cookie session etc
$domd = new DOMDocument (); @$domd->loadHTML ( $html ); // loadHTML() can no longer be called statically on PHP 8
$inputs = getDOMDocumentFormInputs ( $domd, true ) ['login'];
$params = [ ];
foreach ( $inputs as $tmp ) {
$params [$tmp->getAttribute ( "name" )] = $tmp->getAttribute ( "value" );
}
assert ( isset ( $params ['userNamedenc'] ), 'username input not found??' );
assert ( isset ( $params ['passwordenc'] ), 'passwordenc input not found??' );
$params ['userName'] = ''; // defaults to "Enter Username", cleared with javascript
unset ( $params ['dscBasedLoginFlag'] ); // removed with javascript
$params ['Cert'] = ''; // cleared to emptystring with javascript
unset ( $params ['newUserRegistration'] ); // removed with javascript
unset ( $params ['SelectCert'] ); // removed with javascript
$params ['userNamedenc'] = 'hGJfsdnk`1t';
$params ['passwordenc'] = '675894242fa9c66939d9fcf4d5c39d1830f4ddb9';
echo 'parsed login parameters: ';
var_dump ( $params );
$captchaRaw = $hc->exec ( 'http://www.mca.gov.in/mcafoportal/getCapchaImage.do' )->getResponseBody ();
$params ['userEnteredCaptcha'] = solve_captcha2 ( $captchaRaw );
// now actually logging in.
$html = $hc->setopt_array ( array (
CURLOPT_POST => true,
CURLOPT_POSTFIELDS => http_build_query ( $params )
) )->exec ( 'http://www.mca.gov.in/mcafoportal/loginValidateUser.do' )->getResponseBody ();
var_dump ( $hc->getStdErr (), $hc->getStdOut () ); // printing debug data
$domd = new DOMDocument (); // loadHTML() can no longer be called statically on PHP 8
@$domd->loadHTML ( $html );
$xp = new DOMXPath ( $domd );
$loginErrors = $xp->query ( '//ul[@class="errorMessage"]' );
if ($loginErrors->length > 0) {
echo 'encountered following error(s) logging in: ';
foreach ( $loginErrors as $err ) {
echo $err->textContent, PHP_EOL;
}
die ();
}
echo "logged in successfully!";
/**
* solves the captcha manually, by doing: echo ANSWER>captcha.txt
*
* @param string $raw_image
* raw image bytes
* @return string answer
*/
function solve_captcha2(string $raw_image): string {
$imagepath = getcwd () . DIRECTORY_SEPARATOR . 'captcha.png';
$answerpath = getcwd () . DIRECTORY_SEPARATOR . 'captcha.txt';
@unlink ( $imagepath );
@unlink ( $answerpath );
file_put_contents ( $imagepath, $raw_image );
echo 'the captcha is saved in ' . $imagepath . PHP_EOL;
echo ' waiting for you to solve it by doing: echo ANSWER>' . $answerpath, PHP_EOL;
while ( true ) {
sleep ( 1 );
if (file_exists ( $answerpath )) {
$answer = trim ( file_get_contents ( $answerpath ) );
echo 'solved: ' . $answer, PHP_EOL;
return $answer;
}
}
}
function solve_captcha(string $raw_image): string {
echo 'solving captcha, hang on, with DEATHBYCAPTCHA this usually takes between 10 and 20 seconds.';
{
// unfortunately, CURLFile requires a filename, it wont accept a string, so make a file of it
$tmpfileh = tmpfile ();
fwrite ( $tmpfileh, $raw_image ); // TODO: error checking (incomplete write or whatever)
$tmpfile = stream_get_meta_data ( $tmpfileh ) ['uri'];
}
$hc = new hhb_curl ( '', true );
$hc->setopt_array ( array (
CURLOPT_URL => 'http://api.dbcapi.me/api/captcha',
CURLOPT_POSTFIELDS => array (
'username' => DEATHBYCATPCHA_USERNAME,
'password' => DEATHBYCAPTCHA_PASSWORD,
'captchafile' => new CURLFile ( $tmpfile, 'image/png', 'captcha.png' )
)
) )->exec ();
fclose ( $tmpfileh ); // when tmpfile() is fclosed(), its also implicitly deleted.
$statusurl = $hc->getinfo ( CURLINFO_EFFECTIVE_URL ); // status url is given in an HTTP 3xx redirect, which hhb_curl auto-follows
while ( true ) {
// wait for captcha to be solved.
sleep ( 10 );
echo '.';
$json = $hc->setopt_array ( array (
CURLOPT_HTTPHEADER => array (
'Accept: application/json'
),
CURLOPT_HTTPGET => true
) )->exec ()->getResponseBody ();
$parsed = json_decode ( $json, false );
if (! empty ( $parsed->captcha )) {
echo 'captcha solved!: ' . $parsed->captcha, PHP_EOL;
return $parsed->captcha;
}
}
}
function getDOMDocumentFormInputs(\DOMDocument $domd, bool $getOnlyFirstMatches = false): array {
// :DOMNodeList?
$forms = $domd->getElementsByTagName ( 'form' );
$parsedForms = array ();
$isDescendantOf = function (\DOMNode $descendant, \DOMNode $ele): bool {
$parent = $descendant;
while ( NULL !== ($parent = $parent->parentNode) ) {
if ($parent === $ele) {
return true;
}
}
return false;
};
// i can't use array_merge on DOMNodeLists :(
$merged = function () use (&$domd): array {
$ret = array ();
foreach ( $domd->getElementsByTagName ( "input" ) as $input ) {
$ret [] = $input;
}
foreach ( $domd->getElementsByTagName ( "textarea" ) as $textarea ) {
$ret [] = $textarea;
}
foreach ( $domd->getElementsByTagName ( "button" ) as $button ) {
$ret [] = $button;
}
return $ret;
};
$merged = $merged ();
foreach ( $forms as $form ) {
$inputs = function () use (&$domd, &$form, &$isDescendantOf, &$merged): array {
$ret = array ();
foreach ( $merged as $input ) {
// hhb_var_dump ( $input->getAttribute ( "name" ), $input->getAttribute ( "id" ) );
if ($input->hasAttribute ( "disabled" )) {
// ignore disabled elements?
continue;
}
$name = $input->getAttribute ( "name" );
if ($name === '') {
// echo "inputs with no name are ignored when submitted by mainstream browsers (presumably because of specs)... follow suit?", PHP_EOL;
continue;
}
if (! $isDescendantOf ( $input, $form ) && ($form->getAttribute ( "id" ) === '' || $input->getAttribute ( "form" ) !== $form->getAttribute ( "id" ))) {
// echo "this input does not belong to this form.", PHP_EOL;
continue;
}
if (! array_key_exists ( $name, $ret )) {
$ret [$name] = array (
$input
);
} else {
$ret [$name] [] = $input;
}
}
return $ret;
};
$inputs = $inputs (); // sorry about that, Eclipse gets unstable on IIFE syntax.
$hasName = true;
$name = $form->getAttribute ( "id" );
if ($name === '') {
$name = $form->getAttribute ( "name" );
if ($name === '') {
$hasName = false;
}
}
if (! $hasName) {
$parsedForms [] = array (
$inputs
);
} else {
if (! array_key_exists ( $name, $parsedForms )) {
$parsedForms [$name] = array (
$inputs
);
} else {
$parsedForms [$name] [] = $inputs;
}
}
}
unset ( $form, $tmp, $hasName, $name, $i, $input );
if ($getOnlyFirstMatches) {
foreach ( $parsedForms as $key => $val ) {
$parsedForms [$key] = $val [0];
}
unset ( $key, $val );
foreach ( $parsedForms as $key1 => $val1 ) {
foreach ( $val1 as $key2 => $val2 ) {
$parsedForms [$key1] [$key2] = $val2 [0];
}
}
}
return $parsedForms;
}
Example usage: in a terminal, run php foo.php | tee test.html
After a few seconds it will say something like:
the captcha is saved in /home/captcha.png
waiting for you to solve it by doing: echo ANSWER>/home/captcha.txt
Then look at the captcha in /home/captcha.png, solve it, and write in another terminal: echo ANSWER>/home/captcha.txt
Now the script will log in and dump the logged-in html to test.html, which you can open in your browser to confirm that it actually logged in. Screenshot from when I ran it: https://image.prntscr.com/image/_AsB_0J6TLOFSZuvQdjyNg.png
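The check the script does on the login response can also be run on its own. A minimal sketch, assuming the portal still reports failures in a ul element with class errorMessage (that XPath is taken from the script above; the function name extractLoginErrors and both HTML samples, including the error text, are made up for illustration):

```php
<?php
declare(strict_types=1);

/**
 * Returns the login error messages found in a response page.
 * An empty array means no error list was found.
 */
function extractLoginErrors(string $html): array {
    $domd = new DOMDocument();
    @$domd->loadHTML($html); // @ hides warnings about sloppy HTML
    $xp = new DOMXPath($domd);
    $errors = [];
    foreach ($xp->query('//ul[@class="errorMessage"]') as $ul) {
        $errors[] = trim($ul->textContent);
    }
    return $errors;
}

// Made-up response fragments, only to exercise the function.
$failed = '<div><ul class="errorMessage"><li>made-up captcha error</li></ul></div>';
$ok     = '<div><p>Welcome</p></div>';

var_dump(extractLoginErrors($failed));
var_dump(extractLoginErrors($ok));
```

An empty result only means that no error list was present, so it is still worth opening the dumped page in a browser as described above.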
Also note that I made 2 captcha solver functions. One, solve_captcha, uses the Deathbycaptcha api and won't work until you provide a valid and credited Deathbycaptcha account (which is not free) on lines 5 and 6. The other one, solve_captcha2, asks you to solve the captcha yourself, and tells you where the captcha image is saved (so you can go have a look at it) and what command to run to provide it with the answer. The script as posted calls solve_captcha2 on line 28, to solve it manually; replace it with solve_captcha to use Deathbycaptcha, and vice versa. The script is fully tested with solve_captcha2, but the Deathbycaptcha solver is untested, as my Deathbycaptcha account is empty (if you would like to make a donation so I can actually test it, send 7 dollars to paypal account divinity76@gmail.com with a link to this thread, and I will buy the cheapest Deathbycaptcha credit pack and actually test it).
- disclaimer: i am not associated with deathbycaptcha in any way (except that i was a customer of theirs a couple of years back), and this post was not sponsored.