How can I crawl web data that is not in tags (class names are the same)?

Question

Sorry, I have asked a similar question before (How can i crawl web data that not in tags), but I still have a problem with data that is not inside a tag. This case is slightly different from the one in that question:

<div class="bbs" id="main-content">
    <div class="metaline">
        <span class="article-meta-tag">
             author
        </span>
        <span class="article-meta-value">
             Jorden 
        </span>
    </div>
    <div class="metaline">
        <span class="article-meta-tag">
            board
        </span>
        <span class="article-meta-value">
            NBA
        </span>
    </div>

I am here

</div>

I only need this text:

I am here

`I am here` is still in a `div` tag (main-content), it's just not in CERTAIN div tags (class=metaline). Knowing that, this question might help you: https://stackoverflow.com/questions/5041008/how-to-find-elements-by-class?rq=1 — Bing, Jun 04 '17 at 22:06

score 1 · Accepted Answer · answered Jun 04 '17 at 22:16

The string is a direct child of the main div and has type NavigableString, so you can loop through the div's children and keep only the nodes of that type:

from bs4 import BeautifulSoup, NavigableString
[x.strip() for x in soup.find("div", {'id': 'main-content'}).children if isinstance(x, NavigableString) and x.strip()]
# [u'I am here']

Data:

soup = BeautifulSoup("""<div class="bbs" id="main-content">
    <div class="metaline">
        <span class="article-meta-tag">
             author
        </span>
        <span class="article-meta-value">
             Jorden 
        </span>
    </div>
    <div class="metaline">
        <span class="article-meta-tag">
            board
        </span>
        <span class="article-meta-value">
            NBA
        </span>
    </div>
I am here
</div>""", "html.parser")
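For convenience, the snippet and the data above can be combined into one self-contained script. This is only a sketch of the same children-filtering idea; the helper name `direct_text` and the constant `HTML` are made up for illustration and are not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup, NavigableString

# Sample markup from the question (whitespace inside the spans trimmed).
HTML = """<div class="bbs" id="main-content">
    <div class="metaline">
        <span class="article-meta-tag">author</span>
        <span class="article-meta-value">Jorden</span>
    </div>
    <div class="metaline">
        <span class="article-meta-tag">board</span>
        <span class="article-meta-value">NBA</span>
    </div>
I am here
</div>"""


def direct_text(tag):
    """Hypothetical helper: non-empty strings that are direct children of `tag`."""
    return [
        child.strip()
        for child in tag.children
        # Keep text nodes only; nested tags such as div.metaline are skipped.
        if isinstance(child, NavigableString) and child.strip()
    ]


soup = BeautifulSoup(HTML, "html.parser")
main = soup.find("div", id="main-content")
result = direct_text(main)
print(result)  # ['I am here']
```

Because only direct children are inspected, text nested inside the metaline divs (author, board) is never collected, no matter how many of them share the same class name.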

score 0 · Answer 2 · answered Jun 04 '17 at 22:10

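from bs4 import BeautifulSoup
soup = BeautifulSoup(that_html, "html.parser")
div_tag = soup.div
# div_tag.string is None here because the div has more than one child,
# so join only the div's direct text nodes instead.
required_string = "".join(div_tag.find_all(string=True, recursive=False)).strip()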

Go through this documentation.

Rajesh
