lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: How to get unique Hits using Multisearcher
Date Tue, 29 Jun 2004 14:23:21 GMT
Ah, you wanted all Hits at once, without iterating, I see.
You;ll have to iterate either using Hits object and its methods, or by
executing the search and passing HitCollector instance to the search
method, depending on what makes sense in your application.

Otis

--- steve <steve@browsermedia.com> wrote:
> Thanks for the quick reply.
> 
> I have played with this just a bit, but it seems there is no quick
> way to
> get the Hits.hitsDocs vector in one swoop. The Hits class is final so
> I
> cannot extend it, and there are no public accessors to get/set the
> hitDocs.
> It seems I will need to loop through Hits.doc(int) and extract the
> hits one
> at a time. Do you know another way to do this?
> 
> 
> ----- Original Message ----- 
> From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Tuesday, June 29, 2004 9:29 AM
> Subject: Re: How to get unique Hits using Multisearcher
> 
> 
> > Steve,
> >
> > Answer to your last question: that's right.
> > Lucene does not know what field you use as your PK (unique document
> > identifier), so it cannot merge hits and remove duplicates.
> >
> > As for converting Vector to Set.... HashSet set = new HashSet();
> > set.addAll(YourVectorInstance);
> > Does that work for you?  Vector isA Collection.
> >
> > Otis
> >
> >
> >
> > --- steve <steve@browsermedia.com> wrote:
> > > I saw a similar - but not identical - question asked earlier in
> the
> > > archive
> > > but no answer.
> > >
> > > I have 2 (or more)  indexes of web url's with intersecting hits.
> The
> > > url's
> > > are defined as keys in case that makes a difference. I am using
> > > MultiSearcher to search multiple indexes, but I get hits repeated
> if
> > > they
> > > exist in both indexes. I am trying to get a set of all unique
> url's
> > > among
> > > the indexes.
> > >
> > > Can MultiSearcher be told not to repeat hits with duplicate "key"
> > > values? Or
> > > does it already do this indicating my Doc's are not defined
> properly?
> > > As a
> > > last resort, can someone recommend an efficient method to convert
> the
> > > Vector
> > > of hitDocs into a Set after the fact?
> > >
> > > FYI - as a test, I used MultiSearcher to search one index and it
> > > found 45
> > > hits. I then gave MultiSearher 2 Searchers pointing to the same
> > > index, and
> > > it found 90 hits. From this I concluded that MultiSearher merely
> adds
> > > hits
> > > to the Vector rather than looking for duplicates. Is that right?
> > >
> > > TIA,
> > >
> > > Steve B.
> > >
> > >
> > >
> > >
> ---------------------------------------------------------------------
> > > To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> > >
> > >
> >
> >
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> >
> >
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message