jackrabbit-users mailing list archives

From Clay Ferguson <wcl...@gmail.com>
Subject Re: Memory usage
Date Tue, 24 Nov 2015 04:41:31 GMT
Kevin,
That word "generation" just means the most recent, limited set of buffers.
Don't worry, Lucene doesn't hold its entire index in memory. I'm certain of
that. It buffers using as little memory as possible, just like
database engines, etc. As I said with my list of guesses, I'd say it's 99%
likely that your memory problem is not related to JCR or Lucene, but is just a
leak you should be able to find.

Best regards,
Clay Ferguson
wclayf@gmail.com


On Mon, Nov 23, 2015 at 9:16 PM, Roll, Kevin <Kevin-Roll@idexx.com> wrote:

> Hi, Ben. I was referring to the following page:
>
> https://jackrabbit.apache.org/jcr/search-implementation.html
>
> "The most recent generation of the search index is held completely in
> memory."
>
> Perhaps I am misreading this, or perhaps it is wrong, but I interpreted
> that to mean that the size of the index in memory would be proportional to
> the repository size. I hope this is not true!
>
> I am currently trying to get information from our QA team about the
> approximate number of nodes in the repository. We are not currently setting
> an explicit heap size; in the dumps I've examined it seems to run out at
> around 240 MB. I'm pushing to set something explicit, but I'm now hearing
> that older hardware has only 1 GB of memory, which gives us practically
> nowhere to go.
>
> The queries that I'm doing are not very fancy... for example: "select *
> from [nt:resource] where [jcr:mimeType] like 'image%%'". I'm actually
> rewriting that task so the query will be even simpler.
>
> Thanks for the help!
>
>
> -----Original Message-----
> From: Ben Frisoni [mailto:frisonib@gmail.com]
> Sent: Monday, November 23, 2015 5:21 PM
> To: users@jackrabbit.apache.org
> Subject: Re: Memory usage
>
> It is a good idea to turn off supportHighlighting especially if you aren't
> using the functionality. It takes up a lot of extra space within the index.
> I am not sure where you heard that the Lucene Index is kept in memory but I
> am pretty certain that is wrong. Can you point me to the documentation
> saying this?
>
> Also, what data set sizes are you querying against (10k nodes? 100k nodes?
> 1 million nodes?)
> What heap size do you have set on the JVM?
> Reducing the resultFetchSize should help reduce the memory footprint on
> queries.
> I am assuming you are using the QueryManager to retrieve nodes. Can you
> give an example query that you are using?
>
> I have developed a patch to improve query performance on large data sets
> with Jackrabbit 2.x. I should be done soon if I can gather together a few
> hours to finish up my work. If you would like, you can give that a try once
> I finish.
>
> Some other repository settings you might want to look at are:
>  <PersistenceManager class="org.apache.jackrabbit.core.persistence.pool.DerbyPersistenceManager">
>       <param name="bundleCacheSize" value="256"/>
>  </PersistenceManager>
>  <ISMLocking class="org.apache.jackrabbit.core.state.FineGrainedISMLocking"/>
>
>
> Hope this helps.
>
>
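
For reference, the supportHighlighting and resultFetchSize settings mentioned in the quoted message are parameters of the SearchIndex element in a Jackrabbit 2.x workspace configuration. The fragment below is only a sketch of how they might be set. The index path and the resultFetchSize value of 100 are illustrative assumptions, not values taken from this thread, and should be tuned against the actual repository.

```xml
<!-- Sketch only: goes inside the <Workspace> element of repository.xml / workspace.xml -->
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <!-- Assumed index location; keep whatever path the existing configuration uses -->
  <param name="path" value="${wsp.home}/index"/>
  <!-- Disable highlighting support if excerpts are not used, to keep the index smaller -->
  <param name="supportHighlighting" value="false"/>
  <!-- Illustrative value: limits how many results are fetched initially per query -->
  <param name="resultFetchSize" value="100"/>
</SearchIndex>
```

A smaller resultFetchSize should lower the memory used per query, at the cost of extra fetches when a caller iterates past the initial batch.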
