25

In my application, I am showing EPUB HTML files in a WebView using EPUBLIB. I want to add bookmark functionality to my EPUB reader. To do that, I need to fetch the text from the WebView, which is showing a page from my EPUB's HTML file, and then use that text in my bookmark activity to show users what they have bookmarked. How can I achieve this?

Rohit
  • 2,410
  • 6
  • 26
  • 40

5 Answers

46

Getting the plain text content from a webview is rather hard. Basically, the android classes don't offer it, but javascript does, and Android offers a way for javascript to pass the information back to your code.

Before I go into the details, do note that if your html structure is simple, you might be better off just parsing the data manually.
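As a rough illustration of the manual route, here is a minimal plain-Java sketch that strips tags from simple, well-formed markup. The class and method names (`SimpleHtmlText`, `toPlainText`) are made up for this example and are not part of EPUBLIB or the Android SDK. It assumes the markup has no comments or CDATA sections and only decodes a handful of entities, so treat it as a starting point rather than a general HTML parser.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of EPUBLIB or the Android SDK.
class SimpleHtmlText {
    // Whole <script>...</script> and <style>...</style> elements, contents included.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1>");
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    /** Strips tags from simple, well-formed markup and collapses whitespace. */
    static String toPlainText(String html) {
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Decode only a few common entities; "&amp;" goes last to avoid double decoding.
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Chapter 1</h1><p>It was a <b>dark</b> night.</p></body></html>";
        System.out.println(toPlainText(html));
    }
}
```

Because every tag is replaced by a space, inline markup directly followed by punctuation can leave a stray space before that punctuation; for bookmark previews that is usually acceptable.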

That said, here is what you do:

  1. Enable javascript
  2. Add your own javascript interface class, to allow the javascript to communicate with your Android code
  3. Register your own webviewClient, overriding the onPageFinished to insert a bit of javascript
  4. In the javascript, acquire the element.innerText of the <body> tag, and pass it to your javascript interface.

To clarify, I'll post a working (but very rough) code example below. It displays a webview on the top, and a textview with the text-based contents on the bottom.

package test.android.webview;

import android.app.Activity;
import android.os.Bundle;
import android.webkit.WebView;
import android.webkit.WebViewClient;
import android.widget.TextView;

public class WebviewTest2Activity extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        WebView webView = (WebView) findViewById(R.id.webView);
        TextView contentView = (TextView) findViewById(R.id.contentView);

        /* An instance of this class will be registered as a JavaScript interface */ 
        class MyJavaScriptInterface 
        { 
            private TextView contentView;

            public MyJavaScriptInterface(TextView aContentView)
            {
                contentView = aContentView;
            }

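            // Required when targeting API 17+: without this annotation the
            // method is not exposed to JavaScript.
            @android.webkit.JavascriptInterface
            @SuppressWarnings("unused")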
            public void processContent(String aContent) 
            { 
                final String content = aContent;
                contentView.post(new Runnable() 
                {    
                    public void run() 
                    {          
                        contentView.setText(content);        
                    }     
                });
            } 
        } 

        webView.getSettings().setJavaScriptEnabled(true); 
        webView.addJavascriptInterface(new MyJavaScriptInterface(contentView), "INTERFACE"); 
        webView.setWebViewClient(new WebViewClient() { 
            @Override 
            public void onPageFinished(WebView view, String url) 
            { 
                view.loadUrl("javascript:window.INTERFACE.processContent(document.getElementsByTagName('body')[0].innerText);"); 
            } 
        }); 

        webView.loadUrl("http://shinyhammer.blogspot.com");
    }
}

Using the following main.xml:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical" >

    <WebView
        android:id="@+id/webView"
        android:layout_width="match_parent"
        android:layout_height="0dp"
        android:layout_weight="0.5" />

    <TextView
        android:id="@+id/contentView"
        android:layout_width="match_parent"
        android:layout_height="0dp"
        android:layout_weight="0.5" />


</LinearLayout>
Paul-Jan
  • 16,057
  • 58
  • 87
  • Can you explain this line in details ? view.loadUrl("javascript:window.INTERFACE.processContent(document.getElementsByTagName('body')[0].innerText);"); – Rohit Mar 06 '12 at 09:45
  • It's step 4 from the explanation. From left to right, it (a) loads a url that (b) simply injects some javascript that (c) calls the `processContent()` method of the custom javascript interface class `INTERFACE`, registered from the android code, passing (d) the `innerText` property of the body text of the page currently showing. ***If you have specific questions, ask away!*** – Paul-Jan Mar 06 '12 at 09:51
  • As a sidenote, I deliberately included an example you can copy paste into a new android project to test it out. If you are new to this stuff, simply stepping through source might be enlightening. It _is_ fairly complex stuff, as it is two different techniques (android webview customization, javascript fiddling) coming together. – Paul-Jan Mar 06 '12 at 09:54
  • Thank you. :) it rly helped me, and example worked as u said :) – Rohit Mar 06 '12 at 09:59
  • @Paul-Jan thanks, though a question here, will there be two instance variable html/string? ie one originally containd in webview and other passed by js to interface? – duckduckgo Sep 24 '13 at 12:12
  • @rohit - how did you make the bookmark then? From here you can extract the text from the web view. I think when you save the bookmark you save some part of the text that is bookmarked. Later you search for it in the given spine and then use some JavaScript method to take you to that particular offset, i.e. from text to offset. Can you share how you achieved "search" and "text to offset"? The innerText will give all the text of the current spine. Is there any JavaScript method that gives only the text displayed in the current view (e.g. the content of only page no. 2)? – Arunavh Krishnan Aug 19 '14 at 12:48
  • Could anyone please give me a solution for newer APIs (above API 17)? This is not working on KitKat. – Kunwar Avanish Apr 13 '16 at 08:38
  • For the benefit of others: the method processContent(...) specified in the answer of Paul-Jan works only if @JavascriptInterface annotation is specified for the method if your target sdk version is >=17 as per https://developer.android.com/guide/webapps/webview.html#BindingJavaScript – mvsagar Jun 03 '16 at 09:47
  • I/chromium: [INFO:CONSOLE(1)] "Uncaught TypeError: window.INTERFACE.processContent is not a function", source: (1) ....I get this error in Android 6.0. What am I missing? – Bala Vishnu Aug 11 '16 at 10:32
  • Figured out the issue, just add @JavascriptInterface above processContent(...) for Kitkat and above as mvsagar said – Bala Vishnu Aug 11 '16 at 10:46
9
wvbrowser.evaluateJavascript(
    "(function() { return ('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>'); })();",
     new ValueCallback<String>() {
        @Override
        public void onReceiveValue(String html) {
            Log.d("HTML", html); 
            // code here
        }
});
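
One thing to watch for: the value handed to onReceiveValue is JSON-encoded, so a string result typically arrives wrapped in double quotes with escapes such as \u003C and \n. On Android you could decode it with something like `new JSONTokener(html).nextValue()` from org.json. The plain-Java sketch below shows the same idea without Android dependencies; `JsResultDecoder` and `decode` are hypothetical names for this example, and malformed \u escapes are not handled gracefully.

```java
// Hypothetical helper for illustration; decodes the JSON string literal
// that a JavaScript evaluation callback receives.
class JsResultDecoder {
    /** Decodes a JSON string literal; returns null for the JSON value null. */
    static String decode(String json) {
        if (json == null || json.equals("null")) {
            return null;
        }
        int end = json.length() - 1;
        if (json.length() < 2 || json.charAt(0) != '"' || json.charAt(end) != '"') {
            return json; // not a string literal (number, boolean, ...): leave untouched
        }
        StringBuilder out = new StringBuilder(json.length());
        for (int i = 1; i < end; i++) {
            char c = json.charAt(i);
            if (c != '\\' || i + 1 >= end) {
                out.append(c);
                continue;
            }
            char escaped = json.charAt(++i);
            switch (escaped) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'u':
                    if (i + 4 < end) {
                        // Four hex digits follow the 'u'.
                        out.append((char) Integer.parseInt(json.substring(i + 1, i + 5), 16));
                        i += 4;
                    } else {
                        out.append('u');
                    }
                    break;
                default:
                    out.append(escaped); // covers \" \\ and \/
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("\"\\u003Chtml>\\u003Cbody>Hi\\u003C/body>\\u003C/html>\""));
    }
}
```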
Balaji M
  • 91
  • 1
  • 3
  • NB that both methods still work on kitkat+, just `evaluateJavascript` is preferred because it's got a callback so is more easily asynchronous (if you need a return value especially)... – rogerdpack Oct 10 '17 at 04:19
  • braces after the return statement `return ()` worked for me. Anyways works like a charm! – Sachin K Pissay Jan 31 '20 at 10:38
4

The solution above uses the innerText property, which returns all the text in the webView. The solution I propose below extracts only the text from the part of the webView that is visible on screen.

Step 1: This approach relies on JavaScript, so first enable JavaScript and register a JavaScript interface.

webView.getSettings().setJavaScriptEnabled(true);

webView.addJavascriptInterface(new IJavascriptHandler(getActivity().getApplicationContext()), "Android"); //if your class extends a Fragment class

or

webView.addJavascriptInterface(new IJavascriptHandler(this), "Android"); //if your class extends Activity.

Step 2: Create a JavaScript interface inner class.

final class IJavascriptHandler {

    Context mContext;

    IJavascriptHandler(Context c) {
        mContext = c;
    }

    // API 17 and higher requires @JavascriptInterface on every method exposed to JavaScript.
    @JavascriptInterface
    public void processContent(String aContent) {
        // This method is called from the JavaScript that you will write.
        final String content = aContent;
        Log.e("The content of the current page is ", content);
    }
}

Step 3: Now you have to add the javascript method. You'll write the method as a string and then load it. The method returns the text based on the parameter provided to it. So, you would need 2 strings. One will load the javascript method and the other will call it.

Method to load the javascript method.

String javaScriptToExtractText = "function getAllTextInColumn(left,top,width,height){"
                +   "if(document.caretRangeFromPoint){"
                +   "var caretRangeStart = document.caretRangeFromPoint(left, top);"
                +   "var caretRangeEnd = document.caretRangeFromPoint(left+width-1, top+height-1);"
                +   "} else {"
                +   "return null;"
                +   "}"
                +   "if(caretRangeStart == null || caretRangeEnd == null) return null;"
                +   "var range = document.createRange();"
                +   "range.setStart(caretRangeStart.startContainer, caretRangeStart.startOffset);"
                +   "range.setEnd(caretRangeEnd.endContainer, caretRangeEnd.endOffset);"
                +   "return range.toString();};";

Method to call the above function.

String javaScriptFunctionCall = "getAllTextInColumn(0,0,100,100)";

//I've provided the parameter here as 0,0 i.e the left and top offset and then 100, 100 as width and height. So, it'll extract the text present in that area.

Step 4: Now load the two JavaScript strings once the page has finished loading (for example from onPageFinished).

webView.loadUrl("javascript:" + javaScriptToExtractText);
//this defines the function in the page.

webView.loadUrl("javascript:window.Android.processContent(" + javaScriptFunctionCall + ");");
//this calls the function and passes its result to processContent.
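
If you want a rectangle other than the hard-coded 0,0,100,100, the call string can be built from parameters. The sketch below is only an illustration: `VisibleTextCall`, `build` and `toCssPixels` are hypothetical names, and the pixel conversion assumes the page is laid out at the device's density, which depends on your viewport settings and may not hold in your app.

```java
import java.util.Locale;

// Hypothetical helper for illustration; builds the javascript: URL used in step 4.
class VisibleTextCall {
    /** Builds the URL that passes the text inside the given rectangle to processContent. */
    static String build(int left, int top, int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        return String.format(Locale.ROOT,
                "javascript:window.Android.processContent(getAllTextInColumn(%d,%d,%d,%d));",
                left, top, width, height);
    }

    /** Converts device pixels to CSS pixels (assumption: layout at device density). */
    static int toCssPixels(int devicePixels, float density) {
        return Math.round(devicePixels / density);
    }

    public static void main(String[] args) {
        System.out.println(build(0, 0, toCssPixels(1080, 3f), toCssPixels(1920, 3f)));
    }
}
```

The resulting string can be passed to webView.loadUrl in place of the concatenation shown in step 4.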

Enjoy.

Evgeniy Berezovsky
  • 16,237
  • 8
  • 74
  • 131
4

The only approach that comes to mind in this case is to use JavaScript. A quick search turned up android.webkit.WebView.addJavascriptInterface.

Study addJavascriptInterface; it should help you solve the problem.

Chris Morgan
  • 73,264
  • 19
  • 188
  • 199
Th0rndike
  • 3,360
  • 3
  • 19
  • 39
  • I dont know much about js,html,etc.. Can you tell me any good tutorial that I can follow :) – Rohit Mar 06 '12 at 09:35
  • Watching the answer given by Paul-Jan I see that i was on the right track. If you follow his instructions you might be able to make it work. I suggest that you do some research: internet is full of tutorials for javascript and html, and today these skills are a MUST for a developer. – Th0rndike Mar 06 '12 at 09:42
  • :D yeah , I started searching it already, thank you very much for guiding in right direction. – Rohit Mar 06 '12 at 09:48
0

Why don't you fetch the text with EPUBLIB from the book directly?

You got that HTML with the help of EPUBLIB, didn't you? How did you put it in the webview? I see no example.

Helper
  • 81
  • 2
  • Yeah, you are right, I got the HTML file as a string, but with all the HTML tags that I must pass to the webview. I only want some part, let's say only the 3rd paragraph from that string. I couldn't do that with your method, right? – Rohit Mar 06 '12 at 10:05
  • You can just parse that out. First determine the position of the first <p>. Then make a substring() of the text from that tag. Repeat until the n'th tag is found. Now determine the end of the paragraph and get a final substring(). – Helper Mar 06 '12 at 10:36
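
A minimal sketch of the indexOf/substring approach described in this comment, assuming lowercase <p> tags that are not nested (as in well-formed EPUB XHTML). `ParagraphExtractor` and `nthParagraph` are hypothetical names for this example, not EPUBLIB API.

```java
// Hypothetical helper for illustration; not part of EPUBLIB.
class ParagraphExtractor {
    /** Returns the inner markup of the n-th (1-based) <p> element, or null if there are fewer. */
    static String nthParagraph(String html, int n) {
        int from = 0;
        for (int count = 1; ; count++) {
            int open = findOpenTag(html, from);
            if (open < 0) {
                return null;
            }
            int contentStart = html.indexOf('>', open);
            if (contentStart < 0) {
                return null;
            }
            contentStart++; // skip past the '>' that ends the opening tag
            int close = html.indexOf("</p>", contentStart);
            if (close < 0) {
                return null;
            }
            if (count == n) {
                return html.substring(contentStart, close).trim();
            }
            from = close + "</p>".length();
        }
    }

    /** Finds the next "<p" that starts a paragraph tag rather than e.g. <pre> or <param>. */
    private static int findOpenTag(String html, int from) {
        int i = html.indexOf("<p", from);
        while (i >= 0) {
            int next = i + 2;
            if (next < html.length()) {
                char c = html.charAt(next);
                if (c == '>' || Character.isWhitespace(c)) {
                    return i;
                }
            }
            i = html.indexOf("<p", i + 2);
        }
        return -1;
    }

    public static void main(String[] args) {
        String html = "<p>One</p><pre>x</pre><p class=\"a\">Two</p><p>Three</p>";
        System.out.println(nthParagraph(html, 2));
    }
}
```

The returned string still contains any inline tags inside the paragraph, so it may need a tag-stripping pass before being shown as a bookmark preview.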
  • That's what Paul answered in a different and easier way. Your method would be helpful for developers like me who don't know much about JS, but given that HTML, JS and CSS are among the most important technologies today, and Android offers such good functionality for adding JS to your Java code, we must make use of that. It's my personal opinion :) – Rohit Mar 06 '12 at 11:11
  • Even if you use the javascript interface you -only- get the innerText() and you still have to parse the paragraph out. So why not do it right away? – Helper Mar 06 '12 at 11:36
  • I don't know, but maybe there are some methods in JS that will give me the text from a <p> tag directly. – Rohit Mar 06 '12 at 12:00