How to print the inner contents of a div with BeautifulSoup?

Question

The HTML of the website looks like this:

<div class="breed-image">
    <img src = "link to image">
</div>

When I do this:

soup = BeautifulSoup(response.text, 'lxml')
for link in soup.find_all(class_='breed-image'):
    print(link)

All it does is print out:

<div class="breed-image">
</div>

I have also tried `print(link.text)`.

All that does is print out:

None

Any kind of help is appreciated, thanks!

score 0 · Answer 1 · answered Jan 15 '18 at 22:11

A couple of options:

>>> soup.img['src']
'link to image'
>>> for link in soup.find_all('img'):
...     print(link['src'])
...
link to image

Jonathon McMurray

For the first option it gives me the error `TypeError: 'NoneType' object is not subscriptable` and for the second it just does not print out anything – Mark W Jan 15 '18 at 22:18
@Jonathon McMurray here is the link if you want to see all of the HTML: https://dog.ceo/dog-api/breeds-image-random.php – Mark W Jan 15 '18 at 22:19
It looks like this page has no `img` in the HTML; it gets added by some embedded JavaScript. So if you're downloading this page, e.g. with the `requests` module, the image will not be added because the JS isn't executed. This question may help with that: https://stackoverflow.com/questions/8049520/web-scraping-javascript-page-with-python – Jonathon McMurray Jan 15 '18 at 22:28
@MarkW yes, that's AFTER the JavaScript has executed. The JS inserts the `img` tag; without executing the JS, there is no `img` tag. You can see this if you use 'View Source' in your browser instead of 'Inspect'. – Jonathon McMurray Jan 15 '18 at 22:36
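
To make the point from these comments concrete, here is a minimal sketch comparing the HTML as downloaded with the HTML after the script has run. Both HTML strings and the helper name `image_sources` are made up for illustration (the real page source is not reproduced here), and `html.parser` is used instead of `lxml` only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Assumed stand-in for the page as the server sends it ('View Source'):
# the div is empty because the JavaScript has not run.
RAW_HTML = '<div class="breed-image"></div>'

# Assumed stand-in for the page after the browser ran the script ('Inspect').
RENDERED_HTML = '<div class="breed-image"><img src="link to image"></div>'


def image_sources(html):
    """Return the src of every img inside a div with class 'breed-image'."""
    soup = BeautifulSoup(html, "html.parser")
    sources = []
    for div in soup.find_all("div", class_="breed-image"):
        for img in div.find_all("img"):
            src = img.get("src")  # .get returns None instead of raising if src is missing
            if src is not None:
                sources.append(src)
    return sources


print(image_sources(RAW_HTML))       # prints []
print(image_sources(RENDERED_HTML))  # prints ['link to image']
```

With the raw HTML, `soup.img` is `None`, which is why `soup.img['src']` raises `TypeError: 'NoneType' object is not subscriptable` and the `find_all('img')` loop prints nothing.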

score 0 · Answer 2 · answered Jan 16 '18 at 00:19

Looks like you might be better off hitting the API that this page calls to get its image:

In [13]: r = requests.get('https://dog.ceo/api/breeds/image/random')

In [14]: r.json()
Out[14]:
{'message': 'https://dog.ceo/api/img/terrier-dandie/n02096437_1790.jpg',
 'status': 'success'}
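
The same idea as a small script, as a rough sketch. It assumes the response keeps the `message` and `status` keys shown above. The helper names `image_url_from_payload` and `fetch_random_image_url` are hypothetical, and it uses the standard library's `urllib` rather than `requests` only so it runs without installing anything. With `requests`, `r.json()` gives the same dict.

```python
import json
from urllib.request import urlopen

# Endpoint taken from the session above.
API_URL = "https://dog.ceo/api/breeds/image/random"


def image_url_from_payload(payload):
    """Return the image URL from a decoded response, or None if it did not succeed."""
    if payload.get("status") != "success":
        return None
    return payload.get("message")


def fetch_random_image_url(url=API_URL, timeout=10):
    """Call the API and return the image URL (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)  # decode the JSON body into a dict
    return image_url_from_payload(payload)
```

Calling `print(fetch_random_image_url())` should print a different image URL on each run, with no HTML parsing needed.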
