$xml = $_GET['url'];

$xmlDoc = new DOMDocument();
$xmlDoc->load($xml);

..
..

If the user enters the URL without http:// or https://, my script breaks. Is prepending the scheme by concatenation a good way to validate the input in this case?

jamie eason

2 Answers


The simplest way of doing this is checking for the presence of http:// or https:// at the beginning of the string.

if (preg_match('/^http(s)?:\/\//', $xml, $matches) === 1) {
    // $matches[1] only exists when the optional "s" group actually matched
    if (isset($matches[1]) && $matches[1] === 's') {
        // it's https
    } else {
        // it's http
    }
} else {
    // there is neither http nor https at the beginning
}
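As a rough sketch of how that check could be combined with the concatenation idea from the question: the helper below prepends a default scheme when none is present and then validates the result with PHP's built-in `filter_var()`. The function name `normalizeUrl` and the `$defaultScheme` parameter are hypothetical, introduced only for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): prepend a default
// scheme when none is present, then validate the result with filter_var().
function normalizeUrl(string $input, string $defaultScheme = 'http'): ?string
{
    $url = trim($input);
    if ($url === '') {
        return null;
    }

    // Same prefix check as above, made case-insensitive.
    if (preg_match('~^https?://~i', $url) !== 1) {
        $url = $defaultScheme . '://' . $url;
    }

    // FILTER_VALIDATE_URL returns the URL on success and false on failure.
    $valid = filter_var($url, FILTER_VALIDATE_URL);

    return $valid === false ? null : $valid;
}
```

Calling `normalizeUrl($_GET['url'] ?? '')` would then give either a URL that can be passed to `DOMDocument::load()` or `null`, which the script can report as invalid input. Note that this only checks the syntax; it does not tell you whether the site actually serves http or https.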
T0xicCode

You are using the GET method, so either this is done via AJAX or the user appends a URL to the query string. You are not posting a form?

Concatenation isn't going to cut it when the URL is faulty. You need to check for this.

You can put an input with a placeholder on the page to "force" the user to include http://. This should be the way to go in HTML5.

 <input type="text" pattern="^(https?:\/\/)([\da-z\.-]+)\.([a-z\.]{2,6})([\/\w \.-]*)*\/?$" placeholder="http://" title="URLs need to be preceded by http:// or https://" >

This should check the input and forgive some errors. If a URL isn't up to spec, this will return an error, as it should. The user should then revise the URL.

$xml = $_GET['url'] ?? '';

$xmlDoc = new DOMDocument();
$errorMessage = 'Your url couldn\'t be retrieved. Are you sure you\'ve entered it correctly?';

if (!preg_match('~^(https?://)?([\da-z.-]+)\.([a-z.]{2,6})([/\w .-]*)*/?$~', $xml))
{
    echo 'This url is not valid.';
    exit;
}
else if (!preg_match('~^https?://~', $xml))
{
    //no http present
    $orgUrl = $xml;
    $xml = "http://".$orgUrl;
    //extended to cope with https://
    if (!loadXML($xmlDoc, $xml))
    {
        //this attempt failed.
        $xml = "https://".$orgUrl;
        if (!loadXML($xmlDoc, $xml))
        {
            echo $errorMessage;
            exit;
        }
    }
}
else if (!loadXML($xmlDoc, $xml))
{
    echo $errorMessage;
    exit;
}

// DOMDocument::load() returns false and raises a warning on failure instead of
// throwing an exception, so suppress the warning and return the boolean result.
// The document and the URL are passed in because a function cannot see globals.
function loadXML(DOMDocument $xmlDoc, $url)
{
    return @$xmlDoc->load($url);
}

You can also use cURL to check the URL before loading the XML:

$ch = curl_init($xml);

// Return the response instead of printing it straight to the output
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Send request
curl_exec($ch);

// Check for errors and display the error message
if ($errno = curl_errno($ch)) {
    $error_message = curl_strerror($errno);
    echo "$error_message :: while loading url";
}

// Close the handle
curl_close($ch);

Important side note: using these methods to check whether the URL is available and then taking the appropriate action can take a very long time, since the server's response can take a while to return.
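One way to bound that delay, sketched here under some assumptions, is to send only a HEAD request and set explicit cURL timeouts. The function name `probeUrl` and the five-second default are hypothetical choices for illustration; the `CURLOPT_*` and `CURLINFO_HTTP_CODE` constants are standard cURL options.

```php
<?php
// Hypothetical helper: returns true when the URL answers with a 2xx or 3xx
// status within the given time limit, false on any error or timeout.
function probeUrl(string $url, int $timeoutSeconds = 5): bool
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }

    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true,            // HEAD request, skip the body
        CURLOPT_RETURNTRANSFER => true,            // do not print the response
        CURLOPT_FOLLOWLOCATION => true,            // follow redirects
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds, // limit for connecting
        CURLOPT_TIMEOUT        => $timeoutSeconds, // limit for the whole request
    ]);

    curl_exec($ch);
    $ok     = curl_errno($ch) === 0;
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released automatically when $ch goes out of scope.

    return $ok && $status >= 200 && $status < 400;
}
```

With this, trying `http://` first and falling back to `https://` costs at most roughly twice the timeout. Some servers reject HEAD requests, so a `false` result here is a hint rather than proof that the feed is unavailable.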

Mouser
  • @JamieEason see my updated answer. This is about all you can do when you depend on users posting URLs. – Mouser Jan 02 '15 at 02:19
  • OK, what if the user's site is https://example.com but they type example.com? Will your code give them http://example.com? – jamie eason Jan 02 '15 at 02:45
  • @jamieeason then my code will accept example.com as a valid URL, put `http://` in front of it, and try to load the XML – Mouser Jan 02 '15 at 10:05
  • @jamieeason I can't think of anything more to handle validation on the server. This code should prevent the most common input errors. However, it's still better to check the validity of the URL on the client side and force the user to supply the URL in the format you need; then you can do a simple URL check on the server. – Mouser Jan 02 '15 at 10:15