lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: How to get unique Hits using Multisearcher
Date Tue, 29 Jun 2004 13:29:02 GMT
Steve,

Answer to your last question: that's right.
Lucene does not know what field you use as your PK (unique document
identifier), so it cannot merge hits and remove duplicates.
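Since the key field is something only the application knows about, the duplicates have to be dropped in application code after the merged hits come back. Here is a rough sketch of that idea, not anything Lucene provides. UrlHit and uniqueByUrl are made-up names for illustration; with real Lucene documents the key would presumably be read from the stored field (something like doc.get("url"), assuming the field is stored under that name).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one search hit. It is not a Lucene class;
// it only carries the key (the URL) and a score.
class UrlHit {
    final String url;
    final float score;

    UrlHit(String url, float score) {
        this.url = url;
        this.score = score;
    }
}

class UniqueHitsSketch {
    // Keeps the first hit seen for each URL and preserves the incoming
    // order, so if the merged list is sorted by score the best-scoring
    // copy of each URL survives.
    static List<UrlHit> uniqueByUrl(List<UrlHit> merged) {
        Map<String, UrlHit> byUrl = new LinkedHashMap<String, UrlHit>();
        for (UrlHit hit : merged) {
            if (!byUrl.containsKey(hit.url)) {
                byUrl.put(hit.url, hit);
            }
        }
        return new ArrayList<UrlHit>(byUrl.values());
    }

    public static void main(String[] args) {
        // Made-up sample data: the same URL returned by two indexes.
        List<UrlHit> merged = Arrays.asList(
            new UrlHit("http://example.com/a", 0.9f),
            new UrlHit("http://example.com/b", 0.7f),
            new UrlHit("http://example.com/a", 0.6f));
        for (UrlHit hit : uniqueByUrl(merged)) {
            System.out.println(hit.url + " " + hit.score);
        }
    }
}
```

The LinkedHashMap is only there to keep the original ranking order; a plain HashMap would also remove the duplicates but would scramble the order.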

As for converting a Vector to a Set:
HashSet set = new HashSet();
set.addAll(yourVectorInstance);
Does that work for you?  Vector is a Collection, so addAll() accepts it.
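A slightly fuller sketch of the same conversion, assuming the Vector holds the URL strings themselves. One caveat: a Set only removes elements that are equal according to equals() and hashCode(). As far as I know, Lucene's Document does not override those, so a Vector of Document objects would need the key pulled out first. VectorToSetSketch and toUniqueSet are made-up names, and LinkedHashSet is swapped in for HashSet here only to keep the original order.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.Vector;

class VectorToSetSketch {
    // Copies the Vector into a Set. Duplicate strings collapse into one
    // entry, and LinkedHashSet keeps first-seen order.
    static Set<String> toUniqueSet(Vector<String> urls) {
        return new LinkedHashSet<String>(urls);
    }

    public static void main(String[] args) {
        // Made-up sample data with one repeated URL.
        Vector<String> urls = new Vector<String>();
        urls.add("http://example.com/a");
        urls.add("http://example.com/b");
        urls.add("http://example.com/a");
        for (String url : toUniqueSet(urls)) {
            System.out.println(url);
        }
    }
}
```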

Otis



--- steve <steve@browsermedia.com> wrote:
> I saw a similar - but not identical - question asked earlier in the
> archive but no answer.
> 
> I have 2 (or more) indexes of web URLs with intersecting hits. The
> URLs are defined as keys in case that makes a difference. I am using
> MultiSearcher to search multiple indexes, but I get hits repeated if
> they exist in both indexes. I am trying to get a set of all unique
> URLs among the indexes.
> 
> Can MultiSearcher be told not to repeat hits with duplicate "key"
> values? Or does it already do this, which would indicate my Docs are
> not defined properly? As a last resort, can someone recommend an
> efficient method to convert the Vector of hitDocs into a Set after
> the fact?
> 
> FYI - as a test, I used MultiSearcher to search one index and it
> found 45 hits. I then gave MultiSearcher 2 Searchers pointing to the
> same index, and it found 90 hits. From this I concluded that
> MultiSearcher merely adds hits to the Vector rather than looking for
> duplicates. Is that right?
> 
> TIA,
> 
> Steve B.
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 

