Instead of using a regular expression, you can use an XPath expression to retrieve all value attributes for the HTML you have in the question:
//select[@name="quantity"]/option[@selected="selected"]/@value
In words:
- Find all <select> elements within the XML with attribute name equal to "quantity", with a subelement <option> with an attribute selected equal to "selected"
- Retrieve the value attributes.
I would really consider trying XQuery/XPath; that is what it is made for. Read this answer to the question "How to read XML using XPath in Java" on how to retrieve the values. An introduction to XPath expressions is here.
Consider the situation where, in the future, you need to find only options where attribute selected="selected" and e.g. status="accepted". The XPath expression would simply become:
//select[@name="quantity"]/option[@selected="selected" and @status="accepted"]/@value
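As a rough illustration of that extended expression, here is a minimal sketch. The sample markup, the status attribute values, the class name ExtendedXPathExample and the helper selectValues are all made up for this example; only the javax.xml.xpath calls are standard. It assumes the markup is well-formed XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ExtendedXPathExample {

    // Hypothetical, well-formed sample markup; the status attribute is invented for illustration.
    static final String HTML =
        "<html>"
        + "<select name=\"quantity\">"
        + "<option value=\"1\" />"
        + "<option value=\"2\" selected=\"selected\" status=\"rejected\" />"
        + "</select>"
        + "<select name=\"quantity\">"
        + "<option value=\"5\" selected=\"selected\" status=\"accepted\" />"
        + "<option value=\"6\" status=\"accepted\" />"
        + "</select>"
        + "</html>";

    // The extended expression: both conditions must hold on the same <option>.
    static final String EXTENDED_EXPR =
        "//select[@name=\"quantity\"]/option[@selected=\"selected\" and @status=\"accepted\"]/@value";

    // Hypothetical helper: evaluates the expression and collects the attribute values.
    static List<String> selectValues(String markup, String expression)
            throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression, new InputSource(new StringReader(markup)), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Only the option with value 5 is both selected and accepted.
        System.out.println(selectValues(HTML, EXTENDED_EXPR)); // prints [5]
    }
}
```

The option with value 2 is selected but rejected, and the option with value 6 is accepted but not selected, so neither matches.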
The XPath expression is easy to extend, easy to review, easy to prove what it is doing.
Now what kind of RegEx monster would you have to create for the added condition? Hard to write, even harder to maintain. How can a code-reviewer tell what the complex (cf bobble bubble's answer) regular expression is doing? How do you prove that the regular expression is actually doing what it is supposed to do?
You can of course document the regular expression, something you should always do for regular expressions. But that doesn't prove anything.
My advice: Stay away from regular expressions unless there is absolutely no other way.
For fun, I made a snippet showing the basics of this way of working:
import java.io.StringReader;
import javax.xml.xpath.*;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ReadElementsFromHtmlUsingXPath {

    private static final String html =
        "<html>Read more about XPath <a href=\"www.w3schools.com/xsl/xpath_intro.asp\">here</a>..." +
        "<select attr=\"other stuff\" name=\"quantity\">" +
        "<option value=\"1\" />" +
        "<option value=\"2\" selected=\"selected\" />" +
        "</select>" +
        "<i><b>Oh and here's the second element</b></i>" +
        "<select name=\"quantity\" attr=\"other stuff\">" +
        "<option selected=\"selected\" value=\"5\" />" +
        "<option value=\"6\" />" +
        "</select>" +
        "And that's all folks</html>";

    private static final String xpathExpr =
        "//select[@name=\"quantity\"]/option[@selected=\"selected\"]/@value";

    public static void main(String[] args) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            XPathExpression expr = xpath.compile(xpathExpr);
            NodeList nodeList = (NodeList) expr.evaluate(
                new InputSource(new StringReader(html)), XPathConstants.NODESET);
            for (int i = 0; i != nodeList.getLength(); ++i) {
                System.out.println(nodeList.item(i).getNodeValue());
            }
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
    }
}
This results in the output:
2
5