jackrabbit-users mailing list archives

From Rakesh Vidyadharan <rak...@sptci.com>
Subject Re: Batch Import from mysql?
Date Wed, 29 Sep 2010 16:31:47 GMT
Using the Workspace importXML method, I need about 2.5 GB of heap to restore from a 140 MB system
view export file (the import runs in a few minutes).  Importing the same data directly from MS SQL Server
takes about 30 minutes but runs comfortably with a 512 MB heap (it may run with less, but I have
not tested that).  Note that I am calling session.save after every single node is created
(about 25K nodes in all), so the import time could probably be cut significantly
by saving in batches.
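
A minimal sketch of the batch-save idea, not the code used for the import above. To keep it runnable without a repository, it uses a hypothetical NodeStore interface as a stand-in for the two JCR calls involved (Node.addNode and Session.save); the names NodeStore, CountingStore, importAll and the /import/nodeN paths are all assumptions made for illustration.

```java
public class BatchSaveSketch {

    /** Hypothetical stand-in for the two JCR calls that matter here. */
    interface NodeStore {
        void addNode(String path); // with JCR this would be parent.addNode(name)
        void save();               // with JCR this would be session.save()
    }

    /** In-memory store that only counts calls, so the sketch runs without a repository. */
    static class CountingStore implements NodeStore {
        int nodes = 0;
        int saves = 0;

        @Override
        public void addNode(String path) {
            nodes++;
        }

        @Override
        public void save() {
            saves++;
        }
    }

    /** Adds totalNodes nodes and saves once every batchSize nodes. */
    static void importAll(int totalNodes, int batchSize, NodeStore store) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        int sinceLastSave = 0;
        for (int i = 0; i < totalNodes; i++) {
            store.addNode("/import/node" + i);
            sinceLastSave++;
            if (sinceLastSave == batchSize) {
                store.save();
                sinceLastSave = 0;
            }
        }
        // Flush whatever is left over from the last, partial batch.
        if (sinceLastSave > 0) {
            store.save();
        }
    }

    public static void main(String[] args) {
        CountingStore perNode = new CountingStore();
        importAll(25_000, 1, perNode);

        CountingStore batched = new CountingStore();
        importAll(25_000, 1_000, batched);

        System.out.println("save after every node: " + perNode.saves + " saves");
        System.out.println("save every 1000 nodes: " + batched.saves + " saves");
    }
}
```

With 25,000 nodes, saving after every node means 25,000 saves, while a batch size of 1,000 means 25. The batch size is a trade-off: unsaved changes are held in the session, so larger batches should need more heap, and the right value would have to be found by testing.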

Rakesh

On 29 Sep 2010, at 10:35, Justin Edelson wrote:

> I think it depends upon how you do it.
> 
> Using the import methods on the Workspace is a workspace write, so it
> shouldn't consume much memory. The import methods on the Session could
> consume a significant amount of heap, depending upon the size of your
> import. However, Session.importXML / Session.getImportContentHandler
> should not consume significantly more memory than you would consume by
> creating those same nodes manually.
> 
> But yes, node structure is important.
> 
> Justin
> 
> On 9/29/10 10:10 AM, Rakesh Vidyadharan wrote:
>> In my experience importing from the system view XML requires huge amounts of memory.
>> If the process requires hours to import from MySQL, I would think that heap space constraints
>> would limit the import from XML option.
>> 
>> To the OP, make sure that your node structure is partitioned.  If not, write performance
>> will continue to degrade leading to very poor import performance.
>> 
>> Rakesh
>> 
>> On 28 Sep 2010, at 20:18, Justin Edelson wrote:
>> 
>>> You can generate a system view XML file from your MySQL data and then
>>> import that in one shot.
>>> 
>>> Justin
>>> 
>>> On 9/28/10 9:14 PM, sam lee wrote:
>>>> Hey,
>>>> 
>>>> I need to migrate data stored in mysql to jackrabbit.
>>>> I'm using the JCR API (Node.addNode(), etc.), and it takes many hours to do
>>>> so.
>>>> 
>>>> Is there a way to speed things up?
>>>> 
>>>> I was wondering if there is a way to directly output JCR files and copy
>>>> those to jackrabbit repo, instead of using API.
>>>> 
>>>> Thanks.
>>>> Sam
>>>> 
>>> 
>> 
>> Rakesh Vidyadharan
>> President & CEO
>> Sans Pareil Technologies, Inc.
>> http://www.sptci.com/
>> 
>> 
>> | 100 W. Chestnut, Suite 1305 | Chicago, IL 60610-3296 USA |
>> | Ph: +1 (312) 212-3933 | Mobile: +1 (312) 315-1596 | Fax: +1 (312) 276-4410 | E-mail: rakesh@sptci.com
>> 
>> 
>> 
> 
