Finding attribute value using lxml without using a for loop

Question

This is the code I have at the moment:

>>> p = []
>>> r = root.findall('.//*div/[@class="countdown closed"]/')
>>> r
'<abbr data-utime="1383624000" class="timestamp"/>'
>>> for i in r:
...     s = i.attrib
...     p.append(s['data-utime'])
...
>>> p
['1383624000']

s yields:

{'class': 'timestamp', 'data-utime': '1383624000'}

I think the code above is verbose (it creates a list and uses a for loop for only one string).

I know lxml can do this more succinctly, but I have not been able to work out how. I would appreciate any assistance.

Without a pointer to a document containing the content you're querying against, it's hard to test an answer for correctness / compatibility. — Charles Duffy, Aug 18 '14 at 16:13

Charles Duffy · Accepted Answer · 2014-08-18T16:15:39.300

Use XPath, not the ElementTree findall() (which supports only a limited path language, kept for compatibility with the ElementTree library that lxml extends), and address your path all the way down to the attribute:

root.xpath('//html:div[@class="countdown closed"]/@data-utime',
  namespaces={'html': 'http://www.w3.org/1999/xhtml'})

(It is possible to use namespace wildcards in XPath, but it is not great practice: not only does it leave one open to namespace collisions, it can also be a performance impediment if your engine indexes against fully qualified attribute names.)
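As a minimal sketch of the namespaced query, assuming lxml is installed: the SAMPLE markup below is hypothetical, guessed from the output shown in the question, and it assumes the attribute sits on an abbr child of the div, so the path gains one extra step compared with the expression above.

```python
from lxml import etree

# Hypothetical XHTML document; the real page is not shown in the question.
SAMPLE = b"""
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="countdown closed">
      <abbr data-utime="1383624000" class="timestamp"/>
    </div>
  </body>
</html>
"""

root = etree.fromstring(SAMPLE.strip())

# The prefix 'html' is arbitrary; only the namespace URI has to match.
NS = {'html': 'http://www.w3.org/1999/xhtml'}

# Every element step needs the prefix; attributes here are un-namespaced.
values = root.xpath(
    '//html:div[@class="countdown closed"]/html:abbr/@data-utime',
    namespaces=NS)

# Without the prefix, 'div' only matches elements in no namespace.
unprefixed = root.xpath('//div[@class="countdown closed"]/abbr/@data-utime')
```

If the document is parsed with lxml.html instead, the elements usually carry no namespace and the unprefixed form is the one that matches.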

score 1 · Answer 2 · edited May 23 '17 at 12:05

If you are expecting to find just one element, use .find(), not .findall():

r = root.find('.//div[@class="countdown closed"]')
if r is not None:
    # Elements are not subscriptable by attribute name; use .get() or .attrib
    p.append(r.get('data-utime'))

element.find() returns the first matching element, or None if no match is found. If you are certain that the element is always present, you can omit the if r is not None test.
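A small sketch of the find() approach follows. It uses the standard library's xml.etree.ElementTree so that it runs without lxml; lxml.etree offers the same find() and get() calls. The SAMPLE markup is an assumption reconstructed from the question's output, with the attribute on an abbr child of the div.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; the real document is not shown in the question.
SAMPLE = """
<html>
  <body>
    <div class="countdown closed">
      <abbr data-utime="1383624000" class="timestamp"/>
    </div>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE.strip())

# find() returns the first matching element, or None when nothing matches.
abbr = root.find('.//div[@class="countdown closed"]/abbr')

# get() reads one attribute and returns None if the attribute is absent.
utime = abbr.get('data-utime') if abbr is not None else None

# A path that matches nothing yields None rather than raising.
missing = root.find('.//div[@class="no-such-class"]')
```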

Because you are using lxml, you can use the element.xpath() method to use a more powerful XPath expression than what mere ElementTree methods can support. You can add a /@attribute-name step to the end of the path to select the attribute value directly:

attr = root.xpath('.//div[@class="countdown closed"]/@data-utime')
p.extend(attr)

.xpath() returns a list as well, so you can use p.extend() to add all the values it contains to p in one step.
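Putting the XPath approach together, here is a hedged sketch that assumes lxml is installed. The SAMPLE markup is hypothetical and, following the question's output, places the attribute on an abbr child of the div, hence the extra /abbr step. The last query wraps the path in XPath's string() function, which is one way to get a single string back instead of a list.

```python
from lxml import etree

# Hypothetical markup; the real document is not shown in the question.
SAMPLE = b"""
<html>
  <body>
    <div class="countdown closed">
      <abbr data-utime="1383624000" class="timestamp"/>
    </div>
  </body>
</html>
"""

root = etree.fromstring(SAMPLE.strip())

PATH = './/div[@class="countdown closed"]/abbr/@data-utime'

# Selecting an attribute returns a list of string values, no loop needed.
values = root.xpath(PATH)

p = []
p.extend(values)

# Take the first value, guarding against an empty result.
first = values[0] if values else None

# string() converts the first matching node to a plain string
# (an empty string if nothing matches).
single = root.xpath('string(%s)' % PATH)
```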
