I'm looking for a way to use findAll to get two tags, in the order they appear on the page.

Currently I have:

import requests
from BeautifulSoup import BeautifulSoup

def get_soup(url):
    request = requests.get(url)
    page = request.text
    soup = BeautifulSoup(page)
    get_tags = soup.findAll('hr' and 'strong')
    for each in get_tags:
        print each

If I use that on a page with only 'em' or only 'strong' tags in it, it gets me all of those tags; if I use it on a page with both, it gets only the 'strong' tags.

Is there a way to do this? My main concern is preserving the order in which the tags are found.

DasSnipez

2 Answers

You could pass a list to find any of the given tags:

tags = soup.find_all(['hr', 'strong'])
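
As a minimal sketch of why the list form works where the original call did not: in plain Python, 'hr' and 'strong' evaluates to 'strong', so the original code only ever searched for one tag. The sample markup below is invented for illustration, and the snippet assumes Python 3 with the bs4 package and the built-in "html.parser" backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, made up for this illustration.
html = "<p><strong>first</strong></p><hr/><p><strong>second</strong></p><hr/>"
soup = BeautifulSoup(html, "html.parser")

# `and` returns its second operand when the first is truthy,
# so this expression is just the string 'strong'.
evaluated = 'hr' and 'strong'
print(evaluated)

# Passing a list matches any of the named tags.
# Results are returned in the order the tags appear in the document.
tags = soup.find_all(['hr', 'strong'])
names = [tag.name for tag in tags]
print(names)
```

The names list interleaves hr and strong in page order, which is the behaviour the question asks for.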
jfs
  • I think soup.findAll(['hr', 'strong']) could do the job; find_all does not run. – oscarmlage Oct 31 '14 at 12:09
  • @r0sk: `find_all()` is the correct name in beautifulsoup4. Click the link in the answer. `findAll()` is for BeautifulSoup 3, which has been replaced by Beautiful Soup 4. – jfs Dec 01 '14 at 17:48

Use regular expressions:

import re
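get_tags = soup.findAll(re.compile(r'^(hr|strong)$'))

The expression r'^(hr|strong)$' matches tags named exactly hr or strong. The ^ and $ anchors keep it from also matching longer tag names that merely contain one of those strings.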
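
Beautiful Soup tests a compiled pattern against each tag name with the pattern's search() method, so a pattern without ^ and $ also matches names that only contain hr or strong. A minimal sketch of that difference, using only the re module; the list of tag names is hypothetical and 'thread' is not a real HTML tag, it simply contains the letters "hr":

```python
import re

# Hypothetical tag names chosen for illustration.
names = ['hr', 'strong', 'thread', 'p']

loose = re.compile(r'(hr|strong)')      # unanchored: matches substrings
strict = re.compile(r'^(hr|strong)$')   # anchored: matches whole names only

# search() succeeds if the pattern matches anywhere in the string.
loose_hits = [n for n in names if loose.search(n)]
strict_hits = [n for n in names if strict.search(n)]

print(loose_hits)
print(strict_hits)
```

On ordinary HTML the two patterns are likely to behave the same, but the anchored form is the safer habit for custom or XML tag names.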

TerryA