Message-ID: <419DCD65.9090004@upaya.co.uk>
Date: Fri, 19 Nov 2004 10:39:33 +0000
From: Upayavira
To: users@cocoon.apache.org
Subject: Re: Large XML transformations in Cocoon.

Derek Hohls wrote:

> OK, I'll bite here, as my curiosity is aroused (and, let's face it,
> as XML gets wider use, it's likely that file sizes will get larger)
>
> You say "XSLT isn't that appropriate for that sort of thing";
> I thought XSLT was *the* preferred way for processing XML?!

Because XSLT can, in various circumstances, build an in-memory version
of your XML. If your XML is large, you will consume a lot of memory,
which could break things. STX (which I have never used) is intended to
be a streaming process, which means that you don't hold the whole
document in memory, and you can thus stream as much XML as your
recipient can handle.

> Second; what are the advantages/disadvantages of STX compared with
> XSLT?

Advantage of STX? It is streamed. Disadvantage of STX? It is streamed.
As I say, I've never used it, but I suspect there are some things that
require 'knowledge' of different parts of the XML structure that could
be difficult to implement in STX. But then, I might be wrong.

Regards, Upayavira

> >>> uv@upaya.co.uk 2004/11/19 09:09:18 AM >>>
> Tom Bloomfield wrote:
>
>> I'm planning to do xml -> text transformations (for tab-delimited
>> output) and xml -> FOP on large XML datasets. The XML I will be
>> processing will be 10-12 MB in size, and will grow from there. Based
>> on planning, the XSL will contain around 50 node traversals and will
>> iterate over my XML dataset around 46,000 times. Previous to this, my
>> Cocoon transformations haven't been nearly this big.
>>
>> The amount of JVM memory I have to deal with is limited (<256M). This
>> transformation will need to run in real-time.
>> Does anyone have experience dealing with large datasets like this?
>
> That sounds like quite a challenge. XSLT isn't that appropriate for
> that sort of thing. Firstly, in XSLT, avoid arbitrary wanders around
> your XML tree - stay as close to the context node as you can.
>
> Alternatively, look at STX (there is an STX block). See if you can
> manage your transformations with that. It does "streaming"
> transformations of XML, i.e. it is designed for streaming, and thus
> should be able to handle large datasets.
>
> Regards, Upayavira
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
> For additional commands, e-mail: users-help@cocoon.apache.org
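
To make the streaming point in the reply above concrete, here is a
minimal sketch of an xml -> tab-delimited conversion that never holds
the whole document in memory. It is written in Python with the standard
library's iterparse, not in STX or XSLT, and it is not Cocoon code; it
only illustrates the general idea. The element names "record", "id" and
"name" and the function stream_to_tsv are hypothetical, chosen for the
example.

```python
import io
import xml.etree.ElementTree as ET


def stream_to_tsv(source, out, record_tag="record", fields=("id", "name")):
    """Write one tab-delimited line per record element, streaming the input.

    record_tag and fields are hypothetical names used for illustration only.
    Returns the number of records written.
    """
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # The record is complete here, so its children can be read.
            row = [elem.findtext(f, default="") for f in fields]
            out.write("\t".join(row) + "\n")
            count += 1
            # Discard records already written so memory use stays roughly
            # flat instead of growing with the size of the document.
            root.clear()
    return count


if __name__ == "__main__":
    sample = (
        b"<data>"
        b"<record><id>1</id><name>a</name></record>"
        b"<record><id>2</id><name>b</name></record>"
        b"</data>"
    )
    buf = io.StringIO()
    stream_to_tsv(io.BytesIO(sample), buf)
    print(buf.getvalue(), end="")
```

The trade-off mentioned in the reply shows up here too: each record is
only visible once it has been read, so anything that needs to look at
other parts of the document (totals, cross-references between records)
would have to be collected by hand as the stream goes by.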