
I need some information to optimize my XSLT.

In my template I access a child element multiple times, for example:

<xsl:template match="user">
 <h1><xsl:value-of select="address/country"/></h1>
 <p><xsl:value-of select="address/country"/></p>
 <p><xsl:value-of select="address/country"/></p>
  ... more and more...
 <p><xsl:value-of select="address/country"/></p>
</xsl:template>

Would it be better to store the content of the child element in a variable and reference the variable directly, to avoid parsing the tree every time:

<xsl:template match="user">
 <xsl:variable name="country" select="address/country"/>
 <h1><xsl:value-of select="$country"/></h1>
 <p><xsl:value-of select="$country"/></p>
 <p><xsl:value-of select="$country"/></p>
  ... more and more...
 <p><xsl:value-of select="$country"/></p>
</xsl:template>

Or will the use of a variable consume more resources than parsing the tree multiple times?

Mathias Müller
ylerjen

3 Answers


Usually, an XML file is parsed as a whole and held in memory as XDM. So, I guess that by

than parsing the tree multiple times

you actually meant accessing the internal representation of the XML input multiple times. The figure below illustrates this; we are talking about the source tree:

[Figure omitted: diagram showing the source tree, taken from Michael Kay's XSLT 2.0 and XPath 2.0 Programmer's Reference, page 43]

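Likewise, the value of an xsl:variable is held in memory and needs to be accessed, too. Note that a variable defined with a select attribute simply refers to the selected nodes of the source tree; only a variable whose value is built from content (without select) creates a new node, or more precisely a temporary document.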
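To make the two ways of binding a variable concrete, here is a minimal XSLT 2.0 sketch. It assumes an input with a single `user/address/country` element, as in the question; the variable names `byReference` and `byCopy` and the output element names are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <!-- Bound with select: refers to the existing element in the source tree -->
      <xsl:variable name="byReference" select="address/country"/>
      <!-- Bound by content: builds a temporary document holding a copy -->
      <xsl:variable name="byCopy">
         <xsl:copy-of select="address/country"/>
      </xsl:variable>
      <result>
         <!-- Compares the root of each variable's value with the root of the source tree -->
         <same-tree-as-source><xsl:value-of select="root($byReference) is root(.)"/></same-tree-as-source>
         <separate-temporary-tree><xsl:value-of select="not(root($byCopy) is root(.))"/></separate-temporary-tree>
      </result>
   </xsl:template>

</xsl:stylesheet>
```

The `is` operator tests node identity, so the first test checks that `$byReference` still lives in the source document, while the second checks that `$byCopy` has a root of its own. Whether either form costs measurably more is still something only a measurement on your processor can tell.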

Now, what exactly do you mean by optimisation? Do you mean the time it takes to perform the transformation or CPU and memory usage (as you mention "resources" in your question)?

Also, performance depends on the implementation of your XSLT processor of course. The only reliable way of finding out is to actually test this.

Write two stylesheets that differ only in this regard, that is, are identical otherwise. Then, let both of them transform the same input XML and measure the time they take.

My guess is that accessing a variable is faster and it is also more convenient to repeat a variable name than repeating full paths as you write code (this is sometimes called "convenience variables").


EDIT: Replaced with something more appropriate, as a response to your comment.

If you actually test this, write two stylesheets:

Stylesheet with variable

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/root">
      <xsl:copy>
         <xsl:variable name="var" select="node/subnode"/>
         <subnode nr="1">
            <xsl:value-of select="$var"/>
         </subnode>
         <subnode nr="2">
            <xsl:value-of select="$var"/>
         </subnode>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Stylesheet without variable

<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="/root">
      <xsl:copy>
         <subnode nr="1">
            <xsl:value-of select="node/subnode"/>
         </subnode>
         <subnode nr="2">
            <xsl:value-of select="node/subnode"/>
         </subnode>
      </xsl:copy>
   </xsl:template>

</xsl:stylesheet>

Applied to the following input XML:

<root>
   <node>
      <subnode>helloworld</subnode>
   </node>
</root>

EDIT: As suggested by @Michael Kay, I measured the average time taken in 100 runs ("-t and -repeat:100 on the Saxon command line"):

with variable: 9 ms
without variable: 9 ms

This does not imply that the result is the same with your XSLT processor.

Mathias Müller
  • Thanks for your reply. By "optimisation" I meant both the time to perform the transformation and the memory usage. At first I wanted to remove the variables to avoid storing content that I can reach easily. But then I wondered whether removing the variables would affect the time needed for the transformation (if so, I won't do that). Concerning the second part of your answer: this case is a fictitious example to explain my problem. In reality my template is much more complex than that. The "country" node is mixed with other elements, and there is no benefit in putting it in a specific template. – ylerjen Feb 20 '14 at 12:35
  • You're welcome. I assumed this is only a sample: "I take it that this is a sample XSLT template". Still, you'll only know for sure once you actually measured the performance. – Mathias Müller Feb 20 '14 at 12:40
  • @Miam84 I have edited my question to make it more informative. Please have a look. – Mathias Müller Feb 20 '14 at 12:51
  • Measuring time at the command line is useless; it only measures how long it takes to start the Java VM. Try using -t and -repeat:100 on the Saxon command line. – Michael Kay Feb 20 '14 at 15:17
  • I was not aware of this. Thanks for pointing this out. I have corrected my answer. – Mathias Müller Feb 20 '14 at 15:26

For all performance questions, the answer is: it depends.

  • It depends what XSLT processor you are using, and on the optimizations it performs.

  • It's very likely to depend on how many children have to be searched to find the ones you are looking for.

The only way to find out is to measure it, and to measure it very carefully.

Personally, I would use a variable if there is a complex predicate involved, but not if I'm just looking for children by name.
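As a rough illustration of that rule of thumb, the sketch below binds a path with predicates to a variable and leaves a plain child lookup inline. The `type` and `obsolete` attributes and the `city` and `name` elements are assumptions invented for this example; they do not appear in the question.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <!-- Path with predicates: written once and bound to a variable -->
      <xsl:variable name="shipping"
                    select="address[@type = 'shipping'][not(@obsolete = 'true')][1]"/>
      <section>
         <h1><xsl:value-of select="$shipping/country"/></h1>
         <p><xsl:value-of select="$shipping/city"/></p>
         <!-- Simple child lookup by name: left inline -->
         <p><xsl:value-of select="name"/></p>
      </section>
   </xsl:template>

</xsl:stylesheet>
```

Besides any possible speed difference, the variable keeps the filter logic in one place, so a later change to the predicate cannot be applied to one occurrence and forgotten in another.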

In nearly all cases, even if it makes a difference, it is very unlikely to make a difference to the bottom line of your business. If you are interested in improving the bottom line of your business, there are probably better ways to employ your intellect.

Michael Kay

Edit: Having been invited to re-evaluate my answer, I learned that your own suggestion is probably quite suitable for what you are trying to do. Unless you wrap the variable's select value in additional single quotes (which makes it a string constant), the variable will contain the selected element. Instead of inserting the element's text content, you can even copy its entire subtree with <xsl:copy-of select="$country"/> if you wish.
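A small sketch of these three cases, assuming the `user/address/country` structure from the question; the variable name `literal` and the wrapper elements `entry`, `as-text`, `as-copy` and `as-constant` are invented for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <!-- Path expression: the variable holds the selected country element -->
      <xsl:variable name="country" select="address/country"/>
      <!-- Extra single quotes: the variable holds the string 'address/country' -->
      <xsl:variable name="literal" select="'address/country'"/>
      <entry>
         <!-- Text content of the selected element -->
         <as-text><xsl:value-of select="$country"/></as-text>
         <!-- Copy of the element itself, including its subtree -->
         <as-copy><xsl:copy-of select="$country"/></as-copy>
         <!-- The string constant, not anything from the input -->
         <as-constant><xsl:value-of select="$literal"/></as-constant>
      </entry>
   </xsl:template>

</xsl:stylesheet>
```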

For even less repetitive source, why not apply a dedicated template to the element in question:

<xsl:apply-templates select="address/country"/>
[...]
<xsl:template match="address/country">
   <h1><xsl:value-of select="."/></h1>
   <p><xsl:value-of select="."/></p>
   [...]
</xsl:template>

As @Mathias_Müller suggested, there are also ways to express your '...more and more...' behaviour without copying and pasting the same lines over and over. XSLT 2.0 accepts an integer range (the "to" operator) in the select expression of xsl:for-each:

<xsl:for-each select="1 to 100">
  <p><xsl:value-of select="."/></p>
</xsl:for-each>
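One caveat worth sketching: inside a loop over `1 to 100` the context item is an integer, so a relative path such as `address/country` can no longer be evaluated there and raises a type error. Binding the element to a variable before the loop avoids this. The following is a minimal XSLT 2.0 sketch that assumes the `user/address/country` structure from the question; the repeat count of 3 and the `div` wrapper are arbitrary choices for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <!-- Bind the element while the context item is still the user element -->
      <xsl:variable name="country" select="address/country"/>
      <div>
         <xsl:for-each select="1 to 3">
            <!-- Here "." is an integer, so the variable is the way to reach the element -->
            <p><xsl:value-of select="$country"/></p>
         </xsl:for-each>
      </div>
   </xsl:template>

</xsl:stylesheet>
```

In this situation the variable is needed for correctness, not as an optimisation.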

If XSLT 2.0 or later is not available, a slightly more complex solution is to call a named template recursively with xsl:call-template, passing a counter parameter and splitting it in half on each call (a divide-and-conquer approach that keeps the recursion depth low and protects the stack):

<xsl:call-template name="ntimes">
  <xsl:with-param name="counter" select="100"/>
</xsl:call-template>
[...]
<xsl:template name="ntimes">
  <xsl:param name="counter" select="0"/>
  <xsl:if test="$counter > 0">
    <xsl:choose>
      <xsl:when test="$counter = 1">
        <xsl:apply-templates select="address/country"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:variable name="half" select="floor($counter div 2)"/>
        <xsl:call-template name="ntimes">
          <xsl:with-param name="counter" select="$half"/>
        </xsl:call-template>
        <xsl:call-template name="ntimes">
          <xsl:with-param name="counter" select="$counter - $half"/>
        </xsl:call-template>
       [...]

Go here and here for explanation.
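Since the snippet above is abbreviated, here is one possible complete XSLT 1.0 version of the same idea. It is only a sketch under assumptions: the input is assumed to contain `user/address/country` as in the question, the `div` wrapper and the repeat count of 100 are arbitrary, and the outer `xsl:if` has been folded into the `xsl:choose`.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:output method="xml" indent="yes"/>

   <xsl:template match="user">
      <div>
         <xsl:call-template name="ntimes">
            <xsl:with-param name="counter" select="100"/>
         </xsl:call-template>
      </div>
   </xsl:template>

   <!-- Applies templates to address/country $counter times -->
   <xsl:template name="ntimes">
      <xsl:param name="counter" select="0"/>
      <xsl:choose>
         <xsl:when test="$counter = 1">
            <!-- call-template keeps the context node, so this path is relative to user -->
            <xsl:apply-templates select="address/country"/>
         </xsl:when>
         <xsl:when test="$counter &gt; 1">
            <!-- Split the work in two halves; recursion depth grows with log2 of the counter -->
            <xsl:variable name="half" select="floor($counter div 2)"/>
            <xsl:call-template name="ntimes">
               <xsl:with-param name="counter" select="$half"/>
            </xsl:call-template>
            <xsl:call-template name="ntimes">
               <xsl:with-param name="counter" select="$counter - $half"/>
            </xsl:call-template>
         </xsl:when>
         <!-- A counter of 0 or less produces nothing -->
      </xsl:choose>
   </xsl:template>

   <xsl:template match="address/country">
      <p><xsl:value-of select="."/></p>
   </xsl:template>

</xsl:stylesheet>
```

Halving the counter means that 100 repetitions need a call depth of only about seven, whereas a template that decrements the counter by one per call would nest 100 levels deep.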

To be honest, I know nothing about performance and optimization in XSLT. I never considered it worth the effort, given that most of the time I use XSLT processors written in Java: of what use is it to have well-optimized input files when there is still an entire JVM, consuming several hundred MB of RAM, to start up?

J. Katzwinkel
  • Not really, it _is_ the content of `address/country` that is stored as the content of the variable. Having `` inside `xsl:variable` is just more code to the same end. – Mathias Müller Feb 20 '14 at 10:49