lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Pagination and Sorting
Date Thu, 01 Oct 2009 11:40:10 GMT
Hello Chris,

You are using TopDocs incorrectly. Normally you do *not* pass Integer.MAX_VALUE
as the number of documents, but the upper bound of your pagination window. So if
the user wants to display documents 90 to 100, just set the number to 100 docs.
If the user then goes to docs 100 to 110, just re-execute the query with a
larger value. This is exactly how search engines work: the user gets the most
relevant docs presented first and then pages forward, but most users will not
go beyond the first 5 or 10 pages. So an initial max value of 100 is good. If
somebody goes further, just raise the value and re-execute the query.
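
A minimal sketch of the window arithmetic described above, in plain Java. The
class PageWindow and its method names are hypothetical helpers for
illustration, not part of Lucene; the only Lucene call referred to (in the
comment) is the search(query, null, n, sort) call from the original question.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Lucene) for "request only up to the end of the page". */
final class PageWindow {
    private PageWindow() {}

    /** Number of top hits to request so that the given 0-based page is fully covered. */
    static int hitsToRequest(int page, int pageSize) {
        if (page < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("page must be >= 0 and pageSize > 0");
        }
        // multiplyExact/addExact throw instead of silently overflowing for huge pages.
        return Math.multiplyExact(Math.addExact(page, 1), pageSize);
    }

    /**
     * Cuts the requested page out of the doc ids returned by the search.
     * The result is shorter (or empty) when the query has fewer hits than requested.
     *
     * Assumed usage with the call from the question:
     *   TopDocs topDocs = indexSearcher.search(query, null, hitsToRequest(page, pageSize), sort);
     *   // copy topDocs.scoreDocs[i].doc into an int[] and pass it to slice(...)
     */
    static int[] slice(int[] docIds, int page, int pageSize) {
        int end = Math.min(hitsToRequest(page, pageSize), docIds.length);
        int start = Math.min(page * pageSize, end);
        return Arrays.copyOfRange(docIds, start, end);
    }
}
```

For documents 90 to 100 with a page size of 10, this is page 9 (0-based), so
hitsToRequest(9, 10) asks for 100 hits and slice(...) keeps the last ten.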

If you want to collect all hits, do not use TopDocs; use a Collector
instead.

This is the correct way both with and without sorting; the only difference is
that with sorting the memory usage is bigger. Internally an array of size
Integer.MAX_VALUE is allocated, which leads to the OOM.
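
To illustrate why the requested number matters, here is a small plain-Java
sketch of bounded top-N selection. It is not Lucene's actual implementation
(the class TopN is made up for this example, and java.util.PriorityQueue grows
lazily rather than being allocated up front), but it shows the general idea:
the structure only has to hold as many entries as you ask for, so asking for
100 is cheap and asking for Integer.MAX_VALUE is not.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of top-N selection; not Lucene code. */
final class TopN {
    private TopN() {}

    /** Returns the n largest values in descending order, keeping at most n entries in memory. */
    static List<Integer> largest(int[] values, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be > 0");
        }
        // Min-heap: the smallest retained value is on top and is evicted first.
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int v : values) {
            if (heap.size() < n) {
                heap.add(v);
            } else if (v > heap.peek()) {
                heap.poll();   // drop the current weakest entry
                heap.add(v);   // keep the better one
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }
}
```

The heap never holds more than n entries, however many values are scanned.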

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Christian Robert [mailto:cr_usenet@arcor.de]
> Sent: Thursday, October 01, 2009 1:17 PM
> To: java-user@lucene.apache.org
> Subject: Pagination and Sorting
> 
> Hello everybody,
> 
> I'm looking at quite an interesting challenge right now, so I
> hope that somebody out there will be able to assist me.
> 
> What I'm trying to do is return search results both sorted and
> paginated. So far I haven't been able to come up with a working solution.
> 
> Pagination without sorting is no problem - simply looping through the
> document identifiers and grabbing the documents within my "pagination
> window".
> 
> Sorting is also no problem - creating the Sort object and executing the
> search.
> 
> However, when using a Sort I always get a TopDocs object in which all
> the identifiers of the documents are contained. No problem when dealing
> with a small index, but I have some 700,000 documents indexed, and each
> time I try to call
> 
>   TopDocs topDocs =
>     indexSearcher.search(query, null, Integer.MAX_VALUE, sort);
> 
> I'm ending up with an OutOfMemoryError.
> 
> So it seems that Lucene needs all the documents loaded into memory for
> the sorting to work, whereas I only want to load the documents
> to be displayed in the current pagination window.
> 
> Am I lost here and need to find another way, or is there a working
> solution to combine sorting and pagination?
> 
> Thanks in advance!
> Chris
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
