web analytics

How to Use Recursion Effectively in XSLT?

Options

codeling 1595 - 6639
@2016-02-02 12:41:57

Two common scenarios for using recursion in XSLT are:

  • When you have a set of repeating values in the source XML and you want the transformation result to reflect something about all of those values. For example, if you have a catalog of items in XML and want to present those items along with the price for all of the items, you would have to find that total price using a recursive template.
  • When the source XML contains a number x in a tag, for example <countTo number="5"/>, and you want to present some information that same x number of times in the transformation output.

Suppose you have a product catalog:

<Products>
  <Product>
    <Name>Gadget</Name>
    <Price>$10.00</Price>
  </Product>
  <Product>
    <Name>Gizmo</Name>
    <Price>$7.50</Price>
  </Product>
  ...
 </Products>

The goal is to add the first price in the list of products to the sum of all of the other prices until you get to the last price. You can do this with the XSL shown below:

<xsl:template name="priceSum">
  <xsl:param name="productList"/>
    <xsl:choose>
      <xsl:when test="$productList">
        <xsl:variable name="recursive_result">
          <xsl:call-template name="priceSum">
            <xsl:with-param name="productList"
              select="$productList[position() > 1]"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:value-of
          select="number(substring-after($productList[1]/Price,'$'))
                  + $recursive_result"/>
      </xsl:when>
      <xsl:otherwise><xsl:value-of select="0"/></xsl:otherwise>
  </xsl:choose>
</xsl:template>

In this solution you are calling the function on successively smaller lists of products, each time adding the first item in the list to the recursive call on the rest of the list. Once the list is empty, the termination condition is reached and you add zero to the result, which leaves the sum unchanged. This technique of working on node lists of smaller and smaller size can be useful in many settings.

The use of recursion in many XSLT engines has one major drawback when working with large datasets. It is all too easy to get a Stack Overflow or Out of Memory exception with recursive functions that reach a depth on the order of 10,000. If you are trying to sum prices on a catalog with more than 10,000 items, you may run into trouble.

Fortunately, there are ways to optimize recursive functions to reduce or even eliminate the depth of recursion incurred in a given template. The following postings introduce four methods to optimize how you write your recursive solutions in XSLT.

@2016-02-02 12:42:49

Recursion Optimization 1: Divide and Conquer

The divide and conquer approach is to divide the list into multiple smaller parts and do so recursively until you had divided each part down into a single product. The simplest way to envision this would be by dividing in half the list of products passed in to each recursive call, and then making separate recursive calls on the two resulting lists. This would allow all of the work to complete on one half of the list before any work could be done on the other half.

<xsl:template name="priceSumDivideAndConquer">
  <xsl:param name="productList"/>
    <xsl:choose>
      <xsl:when test="count($productList) = 1">
        <xsl:value-of
          select="number(substring-after($productList[1]/Price,'$'))"/>
      </xsl:when>
      <xsl:when test="$productList">
        <xsl:variable name="halfIndex"
          select="floor(count($productList) div 2)"/>
        <xsl:variable name="recursive_result1">
          <xsl:call-template name="priceSumDivideAndConquer">
            <xsl:with-param name="productList"
              select="$productList[position() <= $halfIndex]"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:variable name="recursive_result2">
          <xsl:call-template name="priceSumDivideAndConquer">
            <xsl:with-param name="productList"
              select="$productList[position() > $halfIndex]"/>
          </xsl:call-template>
        </xsl:variable>
      <xsl:value-of select="$recursive_result1 + $recursive_result2"/>
    </xsl:when>
    <xsl:otherwise><xsl:value-of select="0"/></xsl:otherwise>
  </xsl:choose>
</xsl:template>

The addition operations take place each time a halved list reaches length one (this being the termination condition) and occur before any processing happens on the second half of the list. Through this technique, the recursion depth never exceeds the log2 of the number of items in the list, which of course increases exponentially the possible number of product prices to count.

@2016-02-02 20:26:14

Recursion Optimization 2: Segmentation

Despite the incredible savings in recursion depth provided by the Divide and Conquer approach to recursion, it does have its drawbacks. For lists whose length is not an exact power of 2, the method introduces unnecessary overhead at the tails of the recursion tree.

An alternate approach that still reduces the recursion depth works by iterating across segments of nodes of a predefined length rather than dividing the nodes in half each time. This approach builds on the basic recursion technique, but introduces into it an extra template that acts as a segmentation manager.

The role of the segmentation manager is to break the large problem down into smaller problems that can be handled without concern for running out of memory. The manager keeps track of the results of these smaller tasks, and once they are all completed performs the necessary operation on those results.

<xsl:template name="priceSumSegmented">
  <xsl:param name="productList"/>
  <xsl:param name="segmentLength" select="5"/>
  <xsl:choose>
    <xsl:when test="count($productList) > 0">
      <xsl:variable name="recursive_result1">
        <xsl:call-template name="priceSum">
          <xsl:with-param name="productList"
            select="$productList[position() <= $segmentLength]"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:variable name="recursive_result2">
        <xsl:call-template name="priceSumSegmented">
          <xsl:with-param name="productList"
            select="$productList[position() > $segmentLength]"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="$recursive_result1 + $recursive_result2"/>
    </xsl:when>
    <xsl:otherwise><xsl:value-of select="0"/></xsl:otherwise>
  </xsl:choose> 
</xsl:template>

Using a Segmentation approach rather than Divide and Conquer reduces overhead in most cases, but does not offer nearly the savings in recursion depth. Which technique is more appropriate for a given problem varies considerably depending on the problem faced. Which one should you use? The answer, as you will see in the next optimization, is both.

@2016-02-02 20:38:22

Recursion Optimization 3: Combining Divide and Conquer with Segmentation

You can combine the techniques of Divide and Conquer and Segmentation to:

  • Maximize the savings in reduction of recursion depth (including the processing overhead associated with each addition level of recursion used)
  • Avoid creating overhead at the leaf level of the recursion tree

In this approach, a threshold level is passed to the recursive template along with the node list (or other data to be worked on. Here, the Divide and Conquer method is modified to act as the segmentation manager. If the size of the list passed in is greater than the threshold level, the list is split in half and the template is recursively called on both halves. Otherwise, you use the basic recursion template to compute the result, just as in Segmentation.

The following code demonstrates this technique in the following XSL template:

<xsl:template name="priceSumCombination">
  <xsl:param name="productList"/>
  <xsl:param name="threshold" select="5"/>
  <xsl:choose>
    <xsl:when test="count($productList) > $threshold">
      <xsl:variable name="halfIndex"
        select="floor(count($productList) div 2)"/>
      <xsl:variable name="recursive_result1">
        <xsl:call-template name="priceSumCombination">
          <xsl:with-param name="productList"
            select="$productList[position() <= $halfIndex]"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:variable name="recursive_result2">
        <xsl:call-template name="priceSumCombination">
          <xsl:with-param name="productList"
            select="$productList[position() > $halfIndex]"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="$recursive_result1 + $recursive_result2"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:call-template name="priceSum">
        <xsl:with-param name="productList" select="$productList"/>
      </xsl:call-template>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Comments

You must Sign In to comment on this topic.


© 2024 Digcode.com