
How can I search for data on another website using cURL and PHP? I want to look up an IMEI number on this website: https://www.example.com/xxx

This is what I have tried so far:

$imei = '013887009861498';

// Create (or truncate) the cookie jar so cURL can write to it.
$cookie_file_path = "cookies/cookiejar.txt";
$fp = fopen($cookie_file_path, "w") or die("<BR><B>Unable to open cookie file $cookie_file_path for write!<BR>");
fclose($fp);

$url = "https://example.com/xxx";
$agent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, 1);
// Note: this sends the bare IMEI as the request body, with no form field name.
curl_setopt($ch, CURLOPT_POSTFIELDS, $imei);
curl_setopt($ch, CURLOPT_USERAGENT, $agent);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
// Note: this disables TLS certificate verification.
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
// Read cookies from, and write them back to, the same file.
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file_path);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file_path);
$result = curl_exec($ch);

echo $result;
abid
  • 1
    You should describe what happens when you run this script. Do you get an error? Maybe you will want to trace the data exchanged with a tool like Fiddler (Wireshark/tcpdump will not work because of HTTPS). Then you should be able to see what happens. – Adrian W Jun 08 '18 at 07:53

1 Answer


(This is not a full answer, but it is too long to be a comment. I can't be arsed to figure out all the small details for you.)

There are several different problems here. The first is how to do a POST request with PHP/curl, of which you can find an example here.
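As a rough sketch of that first step, a form-style POST with PHP's curl extension could look something like the following. The helper name `post_form` and the field name `imei` are assumptions made for illustration; the real field names have to be read from the site's HTML form.

```php
<?php
// Hypothetical helper (not from the original post): sends a form-encoded
// POST request and returns the response body as a string.
function post_form(string $url, array $fields, string $cookieFile): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        // http_build_query() turns ['name' => 'value'] into "name=value&...",
        // i.e. an application/x-www-form-urlencoded body.
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from this file
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies back to the same file
    ]);

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }

    // The handle is released (and the cookie jar flushed) when $ch goes out of scope.
    return $body;
}
```

It would be called along the lines of `post_form('https://example.com/xxx', ['imei' => '013887009861498'], 'cookies/cookiejar.txt')`. Unlike passing the bare IMEI string to CURLOPT_POSTFIELDS, this gives the value a field name the server can look up.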

Another problem is how to parse HTML in PHP, for which there are several options listed here. (I highly recommend the DOMDocument & DOMXPath combo.)
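To give an idea of what the DOMDocument & DOMXPath combo looks like, here is a minimal sketch. The markup is made up for illustration (the real results page will be structured differently), so the XPath expression is an assumption too.

```php
<?php
// Hypothetical markup standing in for a fetched results page.
$html = <<<HTML
<html><body>
  <table id="result">
    <tr><th>Status</th><td>Clean</td></tr>
  </table>
</body></html>
HTML;

// Real-world HTML is often malformed; collect libxml warnings instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Take the first <td> inside the table with id="result".
$cell = $xpath->query('//table[@id="result"]//td')->item(0);
$status = $cell !== null ? trim($cell->textContent) : null;

echo $status, PHP_EOL;
```

With the sample markup above, `$status` ends up as the string `Clean`; against a real page you would replace the XPath query with one that matches the element you are after.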

Another problem is how to get past CAPTCHA challenges in PHP. One solution is to use the deathbycaptcha API (which is a paid service, by the way); you can find an example of that here.

Another problem is that they're using three different CSRF-like tokens, called __VIEWSTATE, __EVENTVALIDATION, and hdnCaptchaInstance, all of which must be parsed out of the page and submitted along with the CAPTCHA answer. You also need to handle cookies, as the CSRF tokens and the CAPTCHA are tied to your cookie session (luckily, you can let curl handle cookies automatically with CURLOPT_COOKIEFILE).
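Pulling those tokens out could be sketched roughly like this, assuming they are ordinary hidden `<input>` elements in the form. The helper name `extract_hidden_fields`, the sample HTML, and the `imei` and `captcha` field names are all made up for illustration.

```php
<?php
// Hypothetical helper: returns name => value for the requested <input> fields.
function extract_hidden_fields(string $html, array $names): array
{
    libxml_use_internal_errors(true); // tolerate malformed HTML
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    $fields = [];
    foreach ($names as $name) {
        $input = $xpath->query('//input[@name="' . $name . '"]')->item(0);
        if (!$input instanceof DOMElement) {
            throw new RuntimeException("Field $name not found in the page");
        }
        $fields[$name] = $input->getAttribute('value');
    }
    return $fields;
}

// Made-up form markup; the real page would come from a GET request made
// with the same cookie file as the later POST.
$html = '<form>'
      . '<input type="hidden" name="__VIEWSTATE" value="abc123">'
      . '<input type="hidden" name="__EVENTVALIDATION" value="def456">'
      . '<input type="hidden" name="hdnCaptchaInstance" value="ghi789">'
      . '</form>';

$tokens = extract_hidden_fields(
    $html,
    ['__VIEWSTATE', '__EVENTVALIDATION', 'hdnCaptchaInstance']
);

// Combine the tokens with the search value and the solved CAPTCHA
// ('imei' and 'captcha' are assumed field names).
$postFields = $tokens + ['imei' => '013887009861498', 'captcha' => 'ANSWER'];
echo http_build_query($postFields), PHP_EOL;
```

The important part is that the page with the tokens and the POST that submits them share one cookie session, otherwise the server will most likely reject the tokens.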

hanshenrik
  • 1
    And *another* problem is that if you get around all of these, expect them to eventually block your requests for violating the TOS. – ceejayoz Jun 08 '18 at 19:40
  • 1
    @ceejayoz Yeah, that will probably happen too, eventually. Sure, you may get by with using Tor for a while, but eventually they may subscribe to the (public) Tor exit node list and block all Tor IPs. Then you may subscribe to something like https://microleaves.com/ , which claims to have 26 million IP addresses, but I wonder if it wouldn't be easier to just make a deal with T-Mobile to get an actual API for this stuff (it won't be free, but hey, neither are IPs, CAPTCHA-breaking services, or scraping script maintenance). – hanshenrik Jun 08 '18 at 19:49