lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Confused about boolean query and how an IndexReader is associated with Hits
Date Wed, 07 Dec 2005 15:10:24 GMT

On Dec 7, 2005, at 9:56 AM, Alan Chandler wrote:

> Erik Hatcher writes:
>> On Dec 7, 2005, at 7:06 AM, Alan Chandler wrote:
>>> Erik Hatcher writes:
>>>> On Dec 7, 2005, at 2:38 AM, Alan Chandler wrote:
>>>>> Worse than that, when I attempt to access Hits.doc(0) I am getting an
>>>>> immediate IOException with the message "Bad file descriptor". I think
>>> ...
>>>> You must keep your IndexSearcher instance alive and well when working
>>>> with Hits. Hits internally uses the searcher to page through results
>>>> - it does not keep all results in memory. I'm not sure why you aren't
>>>> seeing all the documents you expect, but if you package it up as a
>>>> simple RAMDirectory-using JUnit TestCase then I'd be happy to run it
>>>> and see.
>>> mmm!  Going to have to rethink my "Database" interface, so that I
>>> actually get the page of results I need coupled with the search.
>>> How do you get over the fact that the hits may be on several web
>>> pages and the user may go away between getting the hits and
>>> actually retrieving a document in detail?  Do you have to serialize
>>> the searcher and put it into a session?
>> One option is to just carry along the key to the document, generally
>> something unique like an "id" field. When a request comes in for a
>> document, it would pass its document id, not the hit number. Then
>> simply search using a TermQuery for that document, or use IndexReader
>> to navigate to it.
>
> Yes, I am doing all of that for the majority of what I am doing.
> The tricky one is getting an index of all my documents stored.
> This could be more than one page's worth.
> But I found a copy of chapter 3 of your book on the web, where you
> recommend re-doing the query for each page (I assume by querying and
> then calculating n from the page number and items_per_page for the
> call hits.doc(n)).

Ah, sorry, my initial response was not about paging through search  
results.  Yes, re-searching to page through searches is most commonly  
the best first approach.  And I've never had to choose a different  
approach as it has always proven plenty fast enough.  It isn't  
necessary to use the same IndexSearcher instance for all requests in  
this case, but it is wise to keep it around if the index hasn't changed.

The Luuuuuuucene navigation at the bottom of the results from  
lucenebook.com uses this technique of just multiplying the page  
number requested by the number of items per page and using that as an  
offset for the Hits enumeration for each page.
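
As a rough illustration of that offset arithmetic (a sketch only, not the
lucenebook.com code): the class HitPager and its method names are made up
for this example, the page number is assumed to be zero-based, and the
total hit count is passed in as a plain int so the snippet runs without
Lucene on the classpath. In a real application the total would come from
the search result and each returned n would be handed to hits.doc(n).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: works out which hit numbers belong on one results page. */
public class HitPager {

    /** Offset of the first hit on a zero-based page: page * itemsPerPage. */
    public static int firstHit(int page, int itemsPerPage) {
        return page * itemsPerPage;
    }

    /** One past the last hit on the page, clamped to the total number of hits. */
    public static int endHit(int page, int itemsPerPage, int totalHits) {
        return Math.min(firstHit(page, itemsPerPage) + itemsPerPage, totalHits);
    }

    /** Hit numbers n for this page; empty if the page lies past the last hit. */
    public static List<Integer> hitNumbers(int page, int itemsPerPage, int totalHits) {
        List<Integer> result = new ArrayList<>();
        int end = endHit(page, itemsPerPage, totalHits);
        for (int n = firstHit(page, itemsPerPage); n < end; n++) {
            result.add(n); // in a real app: fetch the document with hits.doc(n)
        }
        return result;
    }

    public static void main(String[] args) {
        // 25 hits, 10 per page: the third page (page 2) holds hits 20..24.
        System.out.println(hitNumbers(2, 10, 25)); // prints [20, 21, 22, 23, 24]
    }
}
```

Clamping the end of the range to the total matters on the last page, which
is usually shorter than items-per-page, and it makes a request for a page
past the end return nothing rather than fail.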

	Erik

