I have a question about the Jsoup library.

I have this small program, which downloads an HTML page (google.com), parses it, and prints its title.

import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class HTMLParser {

    public static void main(String[] args) {
        // Jsoup example: read an HTML page from a URL and print its title
        String title = null; // stays null if the request fails
        try {
            Document doc = Jsoup.connect("http://google.com/").get();
            title = doc.title();
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Jsoup Can read HTML page from URL, title : " + title);
    }
}

The program works well, but there is a problem:

when I try to fetch a page from the IP address 192.168.1.1 (I replaced google.com with 192.168.1.1, which is the address of my router):

        doc = Jsoup.connect("http://192.168.1.1/").get();

it does not work and shows the error below:

org.jsoup.HttpStatusException: HTTP error fetching URL. Status=401, URL=http://192.168.1.1/
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:537)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:194)
at HTMLParser.main(HTMLParser.java:43)

At first I thought the problem was related to the username and password, so I changed the address 192.168.1.1 to username:password@192.168.1.1:

        doc = Jsoup.connect("http://username:password@192.168.1.1/").get();

but it does not work; the program reads the entire string as an address.

If anyone has an idea, please help me! Thanks, everybody.

M. os.i
    See [Connecting to remote URL which requires authentication using Java](http://stackoverflow.com/questions/496651/connecting-to-remote-url-which-requires-authentication-using-java). – saka1029 Jun 21 '15 at 00:07

2 Answers

As saka1029 suggested, you can request the URL with authentication yourself and then use Jsoup.parse(String) to get the Document object.

Or you can simply use Jsoup's own methods to send the request and get the response:

Getting HTML Source using Jsoup of a password protected website

Jsoup connection with basic access authentication

(I usually use javax.xml.bind.DatatypeConverter.printBase64Binary for the Base64 conversion.)
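
To make the Base64 step concrete, here is a minimal sketch of building the value of the Authorization header for HTTP Basic authentication. It uses java.util.Base64 (available since Java 8) rather than DatatypeConverter. The class name BasicAuthHeader and the credentials are placeholders for illustration, and the sketch assumes the router really uses Basic authentication.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; it is not part of Jsoup.
class BasicAuthHeader {

    // Returns "Basic " followed by the Base64 encoding of "username:password".
    static String build(String username, String password) {
        String credentials = username + ":" + password;
        byte[] raw = credentials.getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Placeholder credentials; replace them with the router's real ones.
        System.out.println(BasicAuthHeader.build("user", "pass"));
    }
}
```

The resulting string can then be set on the Jsoup request before calling get(), for example Jsoup.connect("http://192.168.1.1/").header("Authorization", BasicAuthHeader.build(username, password)).get().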

cshu
  • Hi @Griddoor, I read some of the solutions, but none of them works for me. In my case I need to enter the username and password in the authentication dialog that appears, or pass them in the URL as I mentioned, so if you can, please give a full solution. Thanks for reading and replying to my question. – M. os.i Jun 21 '15 at 12:24
  • @M.os.i There are different kinds of authentication; you might have to try logging in manually and capture the HTTP packets. Then you can write a program to do the same thing. – cshu Jun 21 '15 at 12:51
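
One way to find out which kind of authentication the router expects is to look at the WWW-Authenticate header that accompanies the 401 response; its first token names the scheme (for example Basic or Digest). The sketch below only shows how to pull the scheme out of such a header value. The class name AuthScheme and the sample header are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper for illustration.
class AuthScheme {

    // Returns the lower-cased scheme (first whitespace-delimited token)
    // of a WWW-Authenticate header value, or null if there is none.
    static String of(String wwwAuthenticate) {
        if (wwwAuthenticate == null) {
            return null;
        }
        String trimmed = wwwAuthenticate.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        int end = 0;
        while (end < trimmed.length() && !Character.isWhitespace(trimmed.charAt(end))) {
            end++;
        }
        return trimmed.substring(0, end).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // Sample header value, assumed for the example.
        System.out.println(AuthScheme.of("Basic realm=\"router\""));
    }
}
```

With HttpURLConnection, the header value can be read with getHeaderField("WWW-Authenticate") after the request has come back with status 401.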

Thank you very much, saka1029 and Griddoor. I read what you suggested, and it helped a lot.

I ended up using this solution:

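import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import org.apache.commons.codec.binary.Base64; // assumption: Base64 here is the Apache Commons Codec class

URL url = new URL("http://user:pass@domain.com/url");
URLConnection urlConnection = url.openConnection();

// Send the credentials embedded in the URL as an HTTP Basic Authorization header
if (url.getUserInfo() != null) {
    String basicAuth = "Basic " + new String(new Base64().encode(url.getUserInfo().getBytes()));
    urlConnection.setRequestProperty("Authorization", basicAuth);
}

InputStream inputStream = urlConnection.getInputStream();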

From: Connecting to remote URL which requires authentication using Java

I used this method to read the InputStream:

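StringWriter writer = new StringWriter();
// IOUtils is from Apache Commons IO; "UTF-8" is an assumption, use the page's actual encoding
IOUtils.copy(inputStream, writer, "UTF-8");
String theString = writer.toString();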

From: Read/convert an InputStream to a String

Then I parse theString with Jsoup.
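
If Commons IO is not available, the InputStream can also be read with the standard library alone. This is a minimal sketch that assumes Java 9 or later (for InputStream.readAllBytes) and a UTF-8 encoded page; StreamToString is a hypothetical helper name, and the sample HTML stands in for the router's response.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration.
class StreamToString {

    // Reads the whole stream, closes it, and decodes the bytes as UTF-8 (assumed encoding).
    static String read(InputStream in) {
        try (InputStream stream = in) {
            return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Sample bytes standing in for urlConnection.getInputStream().
        String sample = "<html><head><title>Router</title></head></html>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        System.out.println(read(in));
    }
}
```

The returned string can be handed to Jsoup.parse(String), after which doc.title() can be called as in the original program.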

M. os.i