lucene-java-user mailing list archives

From Otis Gospodnetic
Subject Re: How to get unique Hits using Multisearcher
Date Tue, 29 Jun 2004 13:29:02 GMT

Answer to your last question: that's right.
Lucene does not know what field you use as your PK (unique document
identifier), so it cannot merge hits and remove duplicates.

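As for converting a Vector to a Set: HashSet set = new HashSet(hitDocs);
Does that work for you?  Vector is a Collection, so the HashSet constructor
accepts it directly. Note that this only drops duplicates if the elements
define equals() and hashCode() in terms of your key.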
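
If the hit objects do not compare equal on the key field, copying them into a
HashSet will not remove anything. A minimal sketch of deduplicating by key
after the search follows. It is an illustration, not something MultiSearcher
does for you. The Hit class and the uniqueByUrl method are hypothetical
stand-ins: with real Lucene results you would presumably read the key from
each Document, e.g. doc.get("url"), assuming "url" is the name of your stored
key field. It keeps the highest-scoring hit per URL and preserves first-seen
order; it also uses generics, so it needs Java 5 or later.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UniqueHits {

    /** Hypothetical stand-in for one search hit: key field value plus score. */
    static final class Hit {
        final String url;
        final float score;

        Hit(String url, float score) {
            this.url = url;
            this.score = score;
        }
    }

    /**
     * Returns one hit per distinct URL. When a URL occurs more than once,
     * the hit with the highest score wins. The order of the result follows
     * the first occurrence of each URL in the input.
     */
    static List<Hit> uniqueByUrl(List<Hit> hits) {
        // LinkedHashMap keeps keys in first-insertion order, even when the
        // value for an existing key is later replaced.
        Map<String, Hit> bestByUrl = new LinkedHashMap<>();
        for (Hit hit : hits) {
            Hit seen = bestByUrl.get(hit.url);
            if (seen == null || hit.score > seen.score) {
                bestByUrl.put(hit.url, hit);
            }
        }
        return new ArrayList<>(bestByUrl.values());
    }
}
```

Keying a map on the URL string sidesteps the question of whether the hit
objects themselves implement equals() and hashCode(), since String already
does.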


--- steve wrote:
> I saw a similar, but not identical, question asked earlier in the
> archive, but it had no answer.
>
> I have 2 (or more) indexes of web URLs with intersecting hits. The URLs
> are defined as keys, in case that makes a difference. I am using
> MultiSearcher to search multiple indexes, but I get hits repeated if
> they exist in both indexes. I am trying to get a set of all unique URLs
> among the indexes.
>
> Can MultiSearcher be told not to repeat hits with duplicate "key"
> values? Or does it already do this, indicating my Docs are not defined
> properly? As a last resort, can someone recommend an efficient method to
> convert the Vector of hitDocs into a Set after the fact?
>
> FYI - as a test, I used MultiSearcher to search one index and it found
> 45 hits. I then gave MultiSearcher 2 Searchers pointing to the same
> index, and it found 90 hits. From this I concluded that MultiSearcher
> merely adds hits to the Vector rather than looking for duplicates. Is
> that right?
>
> TIA,
> Steve B.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
