lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doron Cohen <>
Subject Re: Keep hits in results
Date Wed, 06 Sep 2006 06:06:42 GMT
Hits is not really a simple container - it references a certain searcher -
that same searcher that was used to find these hits. When a request for a
result document is made, the Hits object delegates this request to the
searcher. So in order to "page through" the results using an existing Hits
object, that searcher has to remain open. This has a cost - in terms of
memory, etc. But LIA describes an additional penalty of this approach - for
a Web search application, keeping the Hits object alive for the case that
the user would like to see the next "page of results", means "maintaining a
user state on the server", While this is possible, it is a more complicated
server side logic to implement, much more complicated than the stateless
approach where a query is resubmitted for each new page of results. I think
what LIA really means is - here and also in other cases - let's not
complicate things (e.g. statefull solution) unless this is really necessary
(e.g. performance wise in a certain application it is out of the question
to resubmit a query for every page, etc.)

Hope this helps,

"jacky" <> wrote on 05/09/2006 21:56:53:

> hi,
>   The following words are quoted from "lucene in action":
>   "There are a couple of implementation approaches:
>  1. Keep the original Hits and IndexSearcher instances available while
> user is navigating the search results.
>  2. Requery each time the user navigates to a new page.
> It turns out that requerying is most often the best solution.
> Requerying eliminates
> the need to store per-user state. In a web application, staying stateless
> HTTP session) is often desirable. Requerying at first glance seems
awaste, but
> Lucene’s blazing speed more than compensates. "
>    I am confused about this paragraph. Since Hits is just a simple
> container of pointers
> to ranked search results, it doesn't load from the index all
> documents that match a query,
> but only a small portion of them at a time. If we requery, we will
> get a new hits, why not
> just keeping the orginal Hits which will not waste much memory.
>      Best Regards.
>        jacky
View raw message