lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Nilesh Bansal" <nileshban...@gmail.com>
Subject Re: Re[2]: Out of memory exception for big indexes
Date Sun, 08 Apr 2007 18:58:32 GMT
On 4/8/07, Artem <abublic@gmail.com> wrote:
> I must note that my patch only helps in lucene-OOM situations related to
> _sorted_ queries. If this is your case than I think yes it will help.
Probably a newbie question, but can you please explain what sorted
queries mean? Is simple keyword search a sorted query?

> In my app currently index is not so big, only 1mln docs. With the patch applied
> sample query giving first 30 of 120,000 sorted results made memory consumption
> jump from 18M to 20M according to jconsole.
>
> NB> It seems that there are some issues with this patch and that was the
> NB> reason it is not yet in the main source tree. Can someone please
> NB> summerize what are the downsides of using such an approach. It will be
> NB> really good if Lucene had it in main source tree and a flag to turn ON
> NB> or OFF this feature.
>
> First there's performance cost (for second and further queries with the
> same IndexSearcher). In default implementation all the index values of sorted
> field are cached during the first sorted search - this takes memory and time;
> but next queries run fast if there still some memory left. My implementation
> doesn't cache field values but loads them from respective documents on the fly -
> so it's slower but takes less memory. The query mentioned took about 3s (with
> rather small sorted fields values - about 20-100 chars).
> There's a limitation also - my implementation requires sorted field to be
> "stored" in index (Field.Store.YES in doc.add())
>
> NB> Bublic, can you tell me what exactly I need to do if I want to use this patch?
>
> You can include StoredFieldSortFactory class source file into your sources and
> then use StoredFieldSortFactory.create(sortFieldName, sortDescending) to get
> Sort object for sorting query.
> StoredFieldSortFactory source file can be extracted from LUCENE-769 patch or
> from sharehound sources: http://sharehound.cvs.sourceforge.net/*checkout*/sharehound/jNetCrawler/src/java/org/apache/lucene/search/StoredFieldSortFactory.java
>
> Regards,
> Artem
>
> NB> thanks
> NB> Nilesh
>
> NB> On 4/6/07, Bublic Online <abublic@gmail.com> wrote:
> >> Hi Ivan, Chris and all!
> >>
> >> I'm that contributor of LUCENE-769 and I recommend it too :)
> >> OutOfMemory error was one of main reasons for me to make it.
> >>
> >> Regards,
> >> Artem Vasiliev
> >>
> >> On 4/6/07, Chris Hostetter <hossman_lucene@fucit.org> wrote:
> >> >
> >> >
> >> > : The problem I suspect is the sorting. As I understand, Lucene
> >> > : builds internal caches for sorting and I suspect that this is the root
> >> > : of your problem. You can test this by trying your problem queries
> >> > : without sorting.
> >> >
> >> > if Sorting really is the cause of your problems, you may want to try out
> >> > this patch...
> >> >
> >> > https://issues.apache.org/jira/browse/LUCENE-769
> >> >
> >> > ...it *may* be advantageous in situations where memory is your most
> >> > constrained resource, and you are willing to sacrifice speed for sorting
> >> > ... it looks promising to me, but there haven't been any convincing
> >> > usecases/benchmarks of people finding it beneficial (other then the
> >> > original contributor)
> >> >
> >> > if you do try it, please post your comments in the issue.
> >> >
> >> >
> >> >
> >> > -Hoss
> >> >
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >
> >> >
> >>
>
>
>
>
>
> --
> Best regards,
>  Artem                            mailto:abublic@gmail.com
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Nilesh Bansal.
http://queens.db.toronto.edu/~nilesh/

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message