jackrabbit-users mailing list archives

From Florent Guillaume ...@nuxeo.com>
Subject Re: persistance
Date Thu, 13 Sep 2007 23:53:12 GMT
If you import a file that big, you should import it directly into the 
workspace rather than into the session. That way the content does not go 
through the transient space, which is what uses so much memory.
So use Workspace.getImportContentHandler or Workspace.importXML, not the 
Session methods. Read the JSR-170 specification for the benefits.
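
A minimal sketch of the workspace-level import, assuming an already 
logged-in javax.jcr.Session. The class name WorkspaceImportSketch, the 
method name importDirect, and the choice of IMPORT_UUID_CREATE_NEW are 
illustrative assumptions, not something from the original thread:

```java
import java.io.IOException;
import java.io.InputStream;

import javax.jcr.ImportUUIDBehavior;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import javax.jcr.Workspace;

// Hypothetical helper class, named here only for illustration.
public class WorkspaceImportSketch {

    /**
     * Imports an XML stream below parentAbsPath using the Workspace API
     * instead of the Session API.
     */
    public static void importDirect(Session session, String parentAbsPath, InputStream xml)
            throws IOException, RepositoryException {
        Workspace workspace = session.getWorkspace();

        // Workspace.importXML persists the imported items directly, so no
        // Session.save() (or Node.save()) call is needed afterwards.
        // The UUID behavior is an assumption; pick the one that fits your data.
        workspace.importXML(parentAbsPath, xml, ImportUUIDBehavior.IMPORT_UUID_CREATE_NEW);
    }
}
```

If you already drive the import from a SAX parser, 
Workspace.getImportContentHandler(parentAbsPath, uuidBehavior) returns an 
org.xml.sax.ContentHandler that can be plugged in the same way. How much 
memory this actually saves depends on the repository implementation, so 
it is worth measuring on the real file.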


chewy_fruit_loop wrote:
> I'm currently trying to import an XML file into a bog standard empty
> repository.
> The problem is the file is 72.5 MB, containing around 200,000 elements (yes,
> they are all required).  This currently takes about 90 minutes (give or
> take) to get into derby, and that's with indexing off.
> The time wouldn't be such an issue if it didn't use 1.7 GB of RAM.
> I've decorated a ContentHandler so it calls:
> root.update(<workspace name>)
> root.save()
> where root is the root node from the tree.
> This is being called after every 500 start elements.  The save just doesn't
> seem to flush the contents that have been parsed to the persistent store. 
> This is the same if I use derby or Oracle as storage.  The only time things
> seem to start to be persisted is when the endDocument is hit.
> Have I missed something blindingly obvious here?  I really don't mind
> everyone having a bit of a chuckle at me, I just want to get this sorted
> out.
> Thanks

Florent Guillaume, Director of R&D, Nuxeo
Open Source Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
