jackrabbit-users mailing list archives

From Anton Bachevsky <...@ciklum.com>
Subject Re: AW: AW: Jackrabbit indexing in a separate thread
Date Fri, 24 Feb 2012 09:38:26 GMT
Hi Claus,

I think we have found the reason for the low performance. We created the
following stress test:
1. Upload 10 identical PDF files with different names in 10
threads. Each PDF file is 100 MB.
2. Delete the PDF files.
3. Repeat steps 1 and 2 in an infinite loop.
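
The stress test steps above can be sketched roughly as follows. This is only an illustration: BinaryStore and InMemoryStore are hypothetical stand-ins (not Jackrabbit APIs) so that the sketch runs without a repository, the payload is shrunk from 100 MB to 1 MB, and the loop is bounded to three rounds instead of running forever. In a real test the upload and delete calls would go to the repository.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class UploadDeleteStress {

    /** Hypothetical stand-in for the repository; NOT a Jackrabbit API. */
    interface BinaryStore {
        void upload(String name, byte[] content);
        void delete(String name);
    }

    /** In-memory placeholder so the sketch runs without a repository. */
    static class InMemoryStore implements BinaryStore {
        final Map<String, byte[]> files = new ConcurrentHashMap<>();

        @Override
        public void upload(String name, byte[] content) {
            files.put(name, content);
        }

        @Override
        public void delete(String name) {
            files.remove(name);
        }
    }

    /** Step 1 and 2: upload the same content under N names in N threads, then delete. */
    static long runRound(BinaryStore store, byte[] content, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();

        // Step 1: identical content, different names, one upload per thread.
        for (int i = 0; i < threads; i++) {
            final String name = "file-" + i + ".pdf";
            pool.submit(() -> store.upload(name, content));
        }
        pool.shutdown();
        if (!pool.awaitTermination(10, TimeUnit.MINUTES)) {
            throw new IllegalStateException("uploads did not finish in time");
        }

        // Step 2: delete everything that was uploaded in this round.
        for (int i = 0; i < threads; i++) {
            store.delete("file-" + i + ".pdf");
        }
        return (System.nanoTime() - start) / 1_000_000; // elapsed milliseconds
    }

    public static void main(String[] args) throws InterruptedException {
        InMemoryStore store = new InMemoryStore();
        byte[] content = new byte[1024 * 1024]; // placeholder; the original test used 100 MB PDFs

        // Step 3: repeat; bounded here, infinite in the original test.
        for (int round = 1; round <= 3; round++) {
            long ms = runRound(store, content, 10);
            System.out.println("round " + round + ": " + ms + " ms");
        }
    }
}
```

Logging the elapsed time per round makes it easier to see whether later rounds get slower than the first one, which is what the scenario described below would predict.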

Analysis of the thread dump showed that Jackrabbit intensively "merges" PDF
files with each other during each upload or save to the
repository. As far as we understand, Jackrabbit merges files that have
different names but similar binary content in order to save disk space.
We think it saves the original PDF file and a difference (some delta) for
the second similar PDF file. I may be wrong, but this is our impression.
Moreover, when a file is deleted, Jackrabbit does not delete it
physically but only marks it as 'deleted'. The real delete operation is
performed later by the Jackrabbit Garbage Collector. So the situation could be
the following:
1. The test uploads PDF files.
2. The test deletes PDF files.
3. The test uploads PDF files in the second loop.
4. Jackrabbit merges the PDF files with the already "deleted" ones.

When we ran the same stress test with small files (100 KB),
the performance was much better, because there was no intensive merging.

Is there a way to "tune" merging or even switch it off, even if we lose
the disk space savings?

> Hi Anton,
>> In our case we have 400 Gb repository, the average simultaneous amount
>> of people using Jackrabbit is 25. And the configuration of server is the
> I think with that configuration you should handle 25 users in any case :-)
>> Do you think it is enough, if not maybe that is one of possible reasons
>> why sometimes the application does not respond for long periods of time?
> Hmm it's extremely hard to say what's going on in your application ...
> If it hangs you could create a thread dump to analyse where your application is waiting
> greets
> claus
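
For reference, a thread dump like the one Claus suggests can be taken from outside the process with the JDK's jstack tool (jstack <pid>), or from inside the JVM. Below is a minimal sketch of the in-process variant using the standard java.lang.management API; the class name ThreadDumpSketch is made up for illustration, and the output format is simplified compared to what jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    /** Returns the name, state and stack of every live thread as text. */
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();

        // false, false: skip lock details, which not every JVM supports.
        for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

Taking several dumps a few seconds apart while the application is unresponsive shows which threads stay in the same stack frames, which is usually where the time is going.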
