xml-general mailing list archives

From Steve Cohen <Ste...@ignitemedia.com>
Subject RE: More on problem of transform that is taking too long.
Date Thu, 02 Nov 2000 14:59:43 GMT
I have solved the problem.  The problem was a variable that was being
instantiated this way:
<xsl:variable name="months"
    select="game[not(year_and_month=preceding-sibling::game/year_and_month)]/year_and_month"/>

Instantiating it thus solves the problem:
<xsl:variable name="months"
    select="game[not(year_and_month=preceding-sibling::game[1]/year_and_month)]/year_and_month"/>

By the way, this construction was from the book XSLT Programmer's Reference
by Michael Kay, which
I recommend heartily.  (If only I'd read it carefully the first time!)

I have now learned that the former construction checks ALL preceding
siblings while the latter checks
only the immediately preceding sibling.  Since my data is sorted, that works
for me.
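
A minimal, self-contained sketch of the faster form, for anyone who wants to
try it.  The root element name "schedule" and the sample data in the comment
are assumptions made for illustration; only the game, year_and_month and
date_of_game names come from the stylesheet above.

```xslt
<?xml version="1.0"?>
<!-- Assumed sample input, sorted by month:
     <schedule>
       <game><year_and_month>2000-11</year_and_month></game>
       <game><year_and_month>2000-11</year_and_month></game>
       <game><year_and_month>2000-12</year_and_month></game>
     </schedule>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/schedule">
    <!-- Keep a game only when its month differs from the month of the
         immediately preceding game.  The first game has no preceding
         sibling, so the comparison is false and the game is kept. -->
    <xsl:for-each
        select="game[not(year_and_month=preceding-sibling::game[1]/year_and_month)]/year_and_month">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

On unsorted input this form would emit a month again each time a new run of
that month begins, which is why it depends on the data being sorted.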

Making this change allows me to process the entire schedule in 30 seconds
(acceptable) as opposed to
at least an hour and a half.
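
A rough count of why the gap is so large, assuming n game elements, a
processor that visits every preceding sibling of each game for the first
form, and one that visits only a single sibling for the [1] form:

```latex
\[
\underbrace{\sum_{i=1}^{n} (i-1)}_{\text{all preceding siblings}}
  = \frac{n(n-1)}{2}
\qquad\text{versus}\qquad
\underbrace{\sum_{i=2}^{n} 1}_{\text{immediately preceding sibling only}}
  = n-1
\]
```

For a hypothetical n = 2000 (an illustrative figure, not the size of the
actual schedule), that is 1,999,000 sibling visits against 1,999.  Real
timings also depend on how the processor implements the axis.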

Thanks to all who helped me.

Wish list item:
A more intuitive way to handle grouping.  This trick worked here, but I
still had problems with three-level grouping, i.e. grouping by month and
then by day.  With that, the big delays were back again.

The solution I came up with (which wouldn't have worked if I hadn't had full
control of the incoming data) was first of all to have explicit elements for
year-month (e.g. "2000-11") as well as for date ("2000-11-02") in the input.
I first navigate through all the input nodes and pull distinct months and
distinct dates into variables.  
 
<xsl:variable name="dates"
    select="game[not(date_of_game=preceding-sibling::game[1]/date_of_game)]/date_of_game"/>

<xsl:variable name="months"
    select="game[not(year_and_month=preceding-sibling::game[1]/year_and_month)]/year_and_month"/>

I then walk these variables instead of the input nodes:

<xsl:for-each select="$months">
    <xsl:variable name="moyr" select="."/>

    <!-- this is the crucial step: you must take a subselect of the input
         nodes here, otherwise you will scan the entire list at every
         subgroup, for a huge performance drain -->
    <xsl:variable name="thismonthsgames"
        select="//game[year_and_month=current()]"/>

    <xsl:element name="monthly">
        <xsl:attribute .../>
        <xsl:attribute .../>
        <xsl:for-each select="$dates">
            <xsl:variable name="thisdate" select="."/>
            <!-- only handle dates that fall in the current month -->
            <xsl:if test="substring($thisdate,1,7)=$moyr">
                <xsl:element name="daily">
                    <xsl:attribute ... />
                    <xsl:attribute ... />
                    <xsl:for-each
                        select="$thismonthsgames[date_of_game=current()]">
                        <xsl:element name="game">
                            <!-- emit data here -->
                        </xsl:element>
                    </xsl:for-each>
                </xsl:element>
            </xsl:if>
        </xsl:for-each>
    </xsl:element>
</xsl:for-each>


Obviously, a simpler means of doing this would be a godsend.
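
One candidate is key-based ("Muenchian") grouping.  This is a different
technique from the one above, sketched here under several assumptions: the
processor supports xsl:key and generate-id() from XSLT 1.0, the root element
is named "schedule", and the key names and the "month" and "date" attribute
names are made up for illustration.  It has not been timed against the real
data.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every game by its month and by its date (hypothetical key names) -->
  <xsl:key name="games-by-month" match="game" use="year_and_month"/>
  <xsl:key name="games-by-date"  match="game" use="date_of_game"/>

  <xsl:template match="/schedule">
    <schedule>
      <!-- Visit only the first game of each distinct month -->
      <xsl:for-each
          select="game[generate-id() =
                       generate-id(key('games-by-month', year_and_month)[1])]">
        <monthly month="{year_and_month}">
          <!-- Within this month, visit only the first game of each date -->
          <xsl:for-each
              select="key('games-by-month', year_and_month)
                        [generate-id() =
                         generate-id(key('games-by-date', date_of_game)[1])]">
            <daily date="{date_of_game}">
              <!-- All games played on this date -->
              <xsl:for-each select="key('games-by-date', date_of_game)">
                <game>
                  <!-- emit data here -->
                </game>
              </xsl:for-each>
            </daily>
          </xsl:for-each>
        </monthly>
      </xsl:for-each>
    </schedule>
  </xsl:template>
</xsl:stylesheet>
```

Because the keys are looked up by value rather than by position, this sketch
does not rely on the input being sorted, and it needs no separate year-month
test inside the date loop.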

----------------------------------------------
Steve Cohen
Sr. Software Engineer
Ignite Sports Media, LLC
stevec@ignitemedia.com

-----Original Message-----
From: Scott_Boag@lotus.com [mailto:Scott_Boag@lotus.com]
Sent: Wednesday, November 01, 2000 8:28 PM
To: Steve Cohen
Cc: general@xml.apache.org
Subject: Re: More on problem of transform that is taking too long.



Steve Cohen <SteveC@ignitemedia.com> wrote:
> This is the line that appears to be the performance-killer:
>
>         <xsl:variable name="months"
>             select="game[not(year_and_month=preceding-sibling::game/year_and_month)]/year_and_month" />

Steve, backwards traversal in XalanJ1's DTM is not very good.  It was a
design compromise we made, which is biting you.  In addition, in Xalan1,
this particular expression (preceding-sibling::game/year_and_month) has to
go backwards all the way, and search each child list, in order to discover
the non-existence.  In XalanJ1, it will search all the way no matter what,
in XalanJ2 it can stop if it finds one.  And it has to do it for each
"game" element, so the results will be rather nasty in terms of performance
(n-squared at least).

It would be interesting for you to try and run the same transformation with
the Xalan2 alpha.  I'm not sure what the results will be, they might be
better or worse.  XalanJ2 is much better at backwards traversal, but the
preceding-sibling axis isn't necessarily optimized yet.

I can't think of a way to re-code this off the top of my head, though
someone else may have a good idea.

If you want to pass me the stylesheet and xml, I would be happy to do some
performance analysis in XalanJ2 next week (this week is booked), and try
and make this fast.  It's an interesting case.  I would rather not look for
optimization solutions in XalanJ1 at this point.

-scott


