From: "Roy T. Fielding"
Subject: Re: Workspace.importXML() and Memory
Date: Fri, 28 Apr 2006 07:56:17 -0700
To: users@jackrabbit.apache.org

On Apr 28, 2006, at 5:52 AM, Simon Edwards wrote:

> I've been testing out Jackrabbit this week, mostly trying to find out
> how scalable it is. One thing I tried doing was importing a 45 MB XML
> file containing about 300,000 XML nodes.
>
> Using Session.importXML() blew up the JVM's heap, of course, as it
> tried to read everything before writing it to the DB. So I tried
> Workspace.importXML(), expecting it to write the nodes directly through
> to the DB. Unfortunately, it seems to act like Session.importXML() and
> tries to read everything in first before writing to the DB. Of course,
> this also blew up the JVM (-X512m).
>
> Now, my question is: is it the correct behaviour for
> Workspace.importXML() to cache everything first in memory before
> writing?

Nope, that is definitely a bug. Sounds like a good project.

....Roy
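
For anyone picking this up, here is a rough sketch of the write-through behaviour Simon was expecting, using only the JDK's SAX parser. It is not Jackrabbit code and not how Workspace.importXML() is implemented. NodeSink, BatchingHandler, importStreaming and the batch size are all made-up names for illustration, with NodeSink standing in for whatever actually persists nodes. The point is only that a streaming importer can hand nodes to the store in bounded batches instead of holding the whole document in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class StreamingImportSketch {

    /** Hypothetical stand-in for the persistence layer; not a Jackrabbit type. */
    interface NodeSink {
        void writeBatch(List<String> nodePaths);
    }

    /** SAX handler that passes node paths to the sink in fixed-size batches. */
    static class BatchingHandler extends DefaultHandler {
        private final NodeSink sink;
        private final int batchSize;
        private final List<String> pending = new ArrayList<>();
        private final StringBuilder path = new StringBuilder();
        int total = 0;       // elements seen so far
        int maxPending = 0;  // largest number of nodes held in memory at once

        BatchingHandler(NodeSink sink, int batchSize) {
            this.sink = sink;
            this.batchSize = batchSize;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            path.append('/').append(qName);
            pending.add(path.toString());
            total++;
            maxPending = Math.max(maxPending, pending.size());
            if (pending.size() >= batchSize) {
                flush();
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Drop the last path segment when leaving an element.
            path.setLength(path.lastIndexOf("/"));
        }

        @Override
        public void endDocument() {
            flush(); // write whatever is left over
        }

        private void flush() {
            if (!pending.isEmpty()) {
                sink.writeBatch(new ArrayList<>(pending));
                pending.clear();
            }
        }
    }

    /** Parses the stream and writes nodes through the sink batch by batch. */
    static BatchingHandler importStreaming(InputStream in, NodeSink sink, int batchSize)
            throws Exception {
        BatchingHandler handler = new BatchingHandler(sink, batchSize);
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler;
    }

    public static void main(String[] args) throws Exception {
        StringBuilder xml = new StringBuilder("<root>");
        for (int i = 0; i < 1000; i++) {
            xml.append("<item/>");
        }
        xml.append("</root>");

        int[] batches = {0};
        BatchingHandler h = importStreaming(
                new ByteArrayInputStream(xml.toString().getBytes(StandardCharsets.UTF_8)),
                batch -> batches[0]++,
                100);
        System.out.println("nodes=" + h.total + " batches=" + batches[0]
                + " maxPending=" + h.maxPending);
    }
}
```

With this shape, memory held by the importer is bounded by the batch size rather than by the size of the XML file. Whether batches could be committed independently in a real repository (references between nodes, transactional behaviour) is a separate question that this sketch does not address.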