lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Friedland, Zachary (EDS - Strategy)" <zachary_friedl...@ml.com>
Subject RE: Ideal Index Fragmentation
Date Fri, 02 Sep 2005 13:31:26 GMT
Follow-up questions below denoted with "--"

Thanks,
Zach

-----Original Message-----
From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
Sent: Wednesday, August 31, 2005 12:25 PM
To: java-user@lucene.apache.org
Subject: Re: Ideal Index Fragmentation


On Aug 30, 2005, at 9:53 PM, Friedland, Zachary (EDS - Strategy) wrote:
>> More assorted questions:
>>
>> *    I have been reading the posts on using Filter vs.  
>> BooleanQuery.  To implement a search-within-a-search, it seems the  
>> Filter is advantageous due to its cacheability, but are there other  
>> pros or cons that should be considered (memory, speed, etc).

>It's only advantageous when the Filters are long-lived over multiple  
>searches.  It's not really recommended for search-within-search when  
>the initial search is transient.  Combining queries within a  
>BooleanQuery is more recommended in that case.

--Great!  This one is closed down...

>> *    I'm interested in implementing a "dynamic filter" component  
>> that will walk through the hits[] object and pull out distinct  
>> values for certain fields to display as search-within-a-search  
>> options (all of them will return at least one result since they are  
>> in the hits[]).  Has anyone implemented something like this -- how  
>> did it work out?

>Walking all Hits and extracting a field is an expensive operation, so  
>be forewarned on that.

--OK, is there a preferred strategy for generating lists of distinct
attributes in the hit[]?  I've seen Hoss' post about using QueryFilters,
but that assumes that you know what values you want to count; but I
won't know the domain of values to expect in every field...  Can I get
creative with the hitsCollector to solve this one?

>> *    When using a ParallelMultiSearcher, is there an easy way to  
>> know how many matches came from each index searched?  I'd like to  
>> be able to display how many of each object are in the combined hits 
>> [].  Since I'm storing one object type per index, the count of hits  
>> from each index will give me that number.

>I'm not sure if you can get at each indexes results in that way, but  
>you can tell which index an individual hit came from (see API docs  
>for details on that).

--Has anyone else hacked the ParallelMultisearcher to do something like
a hits.length before recombining them?

>> *    Does anyone know when 1.9/2.0 will be released?

>Your guess is as good as any at this point.  I'm not sure what is  
>still left to be done for a 1.9 release.  I haven't heard any prods  
>to release it any time soon, but you're welcome to push us to  
>consider it if need be.

>     Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
--------------------------------------------------------

If you are not an intended recipient of this e-mail, please notify the sender, delete it and
do not read, act upon, print, disclose, copy, retain or redistribute it. Click here for important
additional terms relating to this e-mail.     http://www.ml.com/email_terms/
--------------------------------------------------------

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message