jackrabbit-users mailing list archives

From Marcel Bruch <marcel.br...@gmail.com>
Subject Re: Using Jackrabbit/JCR as IDE workspace data backend
Date Mon, 26 Sep 2011 17:56:20 GMT
Hi Stefan,

On 26.09.2011, at 18:13, Stefan Guggisberg wrote:

>> I wrote a fairly ad-hoc dump of the 5900 data files into Jackrabbit.
>> Storing ~240 MB took roughly 3 minutes. Is this the expected time such
>> an operation takes? Is it possible to improve the performance somehow?
> 
> the performance seems rather poor. it's hard to tell what's wrong
> without having the test data. i noticed that you're storing the
> content of the .json files as string properties. why aren't you
> storing the json data as nodes & properties?

I had no code available for serializing the data as JCR nodes. Is there a simple snippet
available somewhere?
As a first baseline, though, I thought storing the file content as string properties would work.


> anyway, i quickly ran an adapted ad hoc test on my machine
> (macbook pro 2.66 ghz, standard harddisk). the test imports
> an 'svn export' of jackrabbit/trunk.
> 
> importing ~6500 files takes ~30s which is IMO decent.

Thanks for writing your test against your local files!

I ran your code and compared the execution times. Unfortunately, it's not performing faster
:(
The small delta might be caused by differences in file traversal or by the additional nodes/properties
created in your code.

However, the overall performance is still a bit low (2:24-3:05 minutes in a clean repository).
Any idea how the performance could be improved? Am I doing something conceptually wrong?
I'm assuming that there is no big difference between creating hundreds of nodes and properties
and dumping a file's content into Jackrabbit. Is this correct?

Thanks,
Marcel

=== Performance results of the experiments ===


Jackrabbit First Hops code adapted:

0:00:08.522: 500 units persisted.  data 17 MB 
0:00:17.057: 1000 units persisted.  data 33 MB 
0:00:31.763: 1500 units persisted.  data 53 MB 
0:00:41.404: 2000 units persisted.  data 72 MB 
0:00:53.140: 2500 units persisted.  data 97 MB 
0:01:02.988: 3000 units persisted.  data 113 MB 
0:01:16.314: 3500 units persisted.  data 133 MB 
0:01:35.171: 4000 units persisted.  data 143 MB 
0:01:49.414: 4500 units persisted.  data 173 MB 
0:02:04.617: 5000 units persisted.  data 204 MB 
0:02:12.593: 5500 units persisted.  data 221 MB 
Mon Sep 26 19:54:58 CEST 2011: 5927 units persisted
Run took 0:02:24.505


Mailing List proposal:

0:00:14.853: 500 units persisted. data  17 MB
0:00:26.353: 1000 units persisted. data  33 MB
0:00:36.114: 1500 units persisted. data  53 MB
0:00:53.274: 2000 units persisted. data  72 MB
0:01:06.643: 2500 units persisted. data  97 MB
0:01:18.230: 3000 units persisted. data  113 MB
0:01:36.765: 3500 units persisted. data  133 MB
0:01:44.245: 4000 units persisted. data  143 MB
0:02:04.026: 4500 units persisted. data  173 MB
0:02:37.533: 5000 units persisted. data  204 MB
0:02:48.089: 5500 units persisted. data  221 MB
Run took 0:03:08.458
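
To make the two runs easier to compare, here is a minimal sketch that turns the last progress line of each run (5500 units, 221 MB) into an average throughput. The class and method names are made up for illustration, and it assumes the "MB" figures in the logs are plain megabytes of file content.

```java
import java.util.Locale;

public class ImportThroughput {

    /** Parses an elapsed time of the form H:MM:SS.mmm (as printed in the logs) into seconds. */
    public static double toSeconds(String elapsed) {
        String[] parts = elapsed.split(":");
        return Integer.parseInt(parts[0]) * 3600
                + Integer.parseInt(parts[1]) * 60
                + Double.parseDouble(parts[2]);
    }

    /** Average throughput in MB/s for the given amount of data and elapsed time. */
    public static double mbPerSecond(double megabytes, String elapsed) {
        return megabytes / toSeconds(elapsed);
    }

    public static void main(String[] args) {
        // Values copied from the 5500-unit line of each run above.
        System.out.printf(Locale.ROOT, "First Hops code adapted: %.2f MB/s%n",
                mbPerSecond(221, "0:02:12.593"));
        System.out.printf(Locale.ROOT, "Mailing list proposal:   %.2f MB/s%n",
                mbPerSecond(221, "0:02:48.089"));
    }
}
```

With the numbers above this works out to roughly 1.67 MB/s for the adapted First Hops code and 1.31 MB/s for the mailing list proposal.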


