lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Cabansag, Ronald-Alvin R" <>
Subject RE: lucene norms cached twice
Date Fri, 29 Oct 2010 19:32:00 GMT
We create a single instance of IndexReader using MMapDirectory(file))and
a single instance of IndexSearcher using this index reader.

Searches in our application are done with two operations. The first operation gets the hit
count. We use a QueryWrapperFilter.getDocIdSet(indexReader) to get the DocIdSet and compute
the hit count using its iterator. This is where we see the ReadOnlyDirectoryReader caching
its own copy of norms.

The second operation is where we actually do the search using TermQuery(...),
filter, collector) method. This is where the sub-reader caches its own copy of norms.


-----Original Message-----
From: Michael McCandless [] 
Sent: Friday, October 29, 2010 1:27 PM
Subject: Re: lucene norms cached twice

Norms should not normally be loaded twice.

Since 2.9, searching is done at the sub-reader level, and so norms
should never be loaded for the main reader.

But can you describe how you're using Lucene?


On Fri, Oct 29, 2010 at 9:27 AM, Cabansag, Ronald-Alvin R
<> wrote:
> We are working with a large readonly lucene index(single segment) with large number of
fields and documents and are running into memory usage problems.
> We found that when using a ReadOnlyDirectoryReader and IndexSearcher created using the
same reader, the norms are cached twice - first by the reader itself and second by the reader's
subreaders. Is there an easy way to avoid having the norms cached twice when we only have
a single subreader?
> We thought of the following options:
> 1.) pass in the main reader as a subreader when creating the  IndexSearcher?  ( e.g.
new IndexSearcher(mainReader, IndexReader[] {mainReader}, int[] {0} )
> 2.) override ReadOnlyDirectoryReader.getSequentialSubReaders() method and return null.
This tells the IndexSearcher to use the main reader- ReadOnlyDirectoryReader.
> 3.) use SegmentReader.get(boolean, SegmentInfo, int) to create a ReadOnlySegmentReader
that we use as our main reader instead.
> Are there any negative implications to the above approaches? Or are there better approaches
to the problem?
> Thanks in advance for any help.
> Alvin Cab
> Cengage Learning
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message