lucene-java-user mailing list archives

From "Bhavin Pandya" <bhav...@rediff.co.in>
Subject Re: is there any way to find unique records ?
Date Wed, 22 Nov 2006 12:28:36 GMT
Hi Erick,

Thanks for your help.
I have successfully implemented it using a custom HitCollector.

- Bhavin pandya
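
A minimal sketch of the kind of custom HitCollector discussed in this thread; it is not the poster's actual code. So that it compiles without the Lucene jar, it declares a stand-in HitCollector with the same collect(int doc, float score) signature as Lucene 2.0's org.apache.lucene.search.HitCollector. The class name UniqueCategoryCollector and the categoryByDoc cache are assumptions made purely for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Stand-in for org.apache.lucene.search.HitCollector (Lucene 2.0), declared
// here only so the sketch compiles on its own. With Lucene on the classpath,
// delete this class and import the real one instead.
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Gathers the distinct "category" value of every document the query matches.
// categoryByDoc is a hypothetical cache indexed by document id, assumed to be
// filled once per IndexReader (for example by walking TermEnum/TermDocs for
// the "category" field) so that collect() never loads a stored document.
class UniqueCategoryCollector extends HitCollector {
    private final String[] categoryByDoc;
    private final Set<String> categories = new LinkedHashSet<String>();

    UniqueCategoryCollector(String[] categoryByDoc) {
        this.categoryByDoc = categoryByDoc;
    }

    @Override
    public void collect(int doc, float score) {
        // Called once per matching document; the score is ignored because
        // only the set of distinct values matters here.
        String category = categoryByDoc[doc];
        if (category != null) {
            categories.add(category);
        }
    }

    // Distinct categories in the order they were first seen.
    Set<String> getCategories() {
        return categories;
    }
}
```

With real Lucene 2.0, the collector would be passed to searcher.search(query, collector) in place of the Hits-returning call. The per-document cache is one way to avoid calling IndexReader.document(id) for every hit, which is the cost Erick cautions about below.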

----- Original Message ----- 
From: "Erick Erickson" <erickerickson@gmail.com>
To: <java-user@lucene.apache.org>; "Bhavin Pandya" <bhavinp@rediff.co.in>
Sent: Tuesday, November 21, 2006 8:58 PM
Subject: Re: is there any way to find unique records ?


> Ok, I think I get it now. You're right that you probably don't want to
> iterate the Hits object since that has performance issues once you get
> beyond 100 docs or so. Although, I don't know how big your result sets
> are. If they are guaranteed to be small, this may not matter.
>
> I'm guessing you want to implement a custom HitCollector. That has its
> own cautions about calling, say, IndexReader.document(id) for each hit,
> so you probably want to use a TermDocs object. seek() and skipTo() and
> doc() are your friends. Although I'd try the simple way of just calling
> IndexReader.document(id) first, just to see if the performance is
> acceptable. Be sure you're looking at a truly representative data set
> though <G>...
>
> Hope this helps
> Erick
>
> On 11/21/06, Bhavin Pandya <bhavinp@rediff.co.in> wrote:
>>
>> Hi Erick,
>>
>> > If you're asking for a list of all the unique values for a particular
>> > field, see TermDocs and/or TermEnum, which will allow you to look at,
>> > say, all the values stored for some field. A trick here is to seek
>> > (new Term("field", ""));. By putting nothing in the value, you
>> > effectively enumerate them all, something that I didn't find obvious.
>>
>> I think your solution above is very close to what I am looking for,
>> but in a slightly different way. Here is what I am planning to do.
>>
>> Suppose my index has four fields: "product-title", "product-desc",
>> "category" and "FLAG". (The FLAG field has the value "true" for every
>> doc in the index; it was added just for iteration purposes.)
>>
>> At search time:
>> query =  +(product-title:nokia) +(product-desc:nokia)
>> Hits hits = searcher.search(query);
>> I want to fetch the unique "category" values from the above hits object,
>> but I don't want to iterate through the Hits object.
>>
>> Now, as per your suggestion, I can do something like this:
>> TermDocs termDocs = indexReader.termDocs(new Term("FLAG", "true"));
>> But that will return an enumeration of all the documents in the index,
>> whereas I want an enumeration of only the documents relevant to
>> "nokia". How can I do that?
>>
>> Thanks
>> - Bhavin pandya
>>
>>
>> ----- Original Message -----
>> From: "Erick Erickson" <erickerickson@gmail.com>
>> To: <java-user@lucene.apache.org>; "Bhavin Pandya" <bhavinp@rediff.co.in>
>> Sent: Tuesday, November 21, 2006 7:01 PM
>> Subject: Re: is there any way to find unique records ?
>>
>>
>> > I don't think I understand what "only unique records from a single
>> > field" means. If it's a unique value in a field, there'll only be one
>> > document in the hits object and there's no cost to iterating, so I
>> > doubt that's what you mean.
>> >
>> > If you're asking for a list of all the unique values for a particular
>> > field, see TermDocs and/or TermEnum, which will allow you to look at,
>> > say, all the values stored for some field. A trick here is to seek
>> > (new Term("field", ""));. By putting nothing in the value, you
>> > effectively enumerate them all, something that I didn't find obvious.
>> >
>> > If neither of these is close to the mark, perhaps you could provide
>> > more detail.
>> >
>> > Best
>> > Erick
>> >
>> > On 11/21/06, Bhavin Pandya <bhavinp@rediff.co.in> wrote:
>> >>
>> >> Hi,
>> >> In Lucene, is there any way to find only the unique records from a
>> >> single field?
>> >>
>> >> Otherwise I have to iterate through Hits unnecessarily to find the
>> >> unique ones.
>> >>
>> >> Please help.
>> >>
>> >> - Bhavin pandya
>> >>
>> >
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
> 



