From: "steve"
To: "lucene-user"
Subject: How to get unique Hits using Multisearcher
Date: Tue, 29 Jun 2004 09:02:36 -0400

I saw a similar, but not identical, question asked earlier in the archive, but it received no answer. I have 2 (or more) indexes of web URLs with intersecting hits. The URLs are defined as keys, in case that makes a difference.
I am using MultiSearcher to search multiple indexes, but hits are repeated if they exist in more than one index. I am trying to get the set of all unique URLs across the indexes.

Can MultiSearcher be told not to repeat hits with duplicate "key" values? Or does it already do this, which would indicate that my Docs are not defined properly? As a last resort, can someone recommend an efficient method to convert the Vector of hitDocs into a Set after the fact?

FYI: as a test, I used MultiSearcher to search one index and it found 45 hits. I then gave MultiSearcher 2 Searchers pointing to the same index, and it found 90 hits. From this I concluded that MultiSearcher merely adds hits to the Vector rather than looking for duplicates. Is that right?

TIA,
Steve B.
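
For the "last resort" route of removing duplicates after the fact, a minimal sketch in plain Java might look like the following. It does not use the Lucene API: UrlHit and UniqueHits are hypothetical stand-ins, and the sketch assumes each merged hit has already been reduced to its "key" URL plus a score. A LinkedHashMap keeps the first hit seen for each URL and preserves the original ordering.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one merged search hit; this is not a Lucene class.
final class UrlHit {
    final String url;   // value of the unique "key" field
    final float score;  // relevance score reported for this hit

    UrlHit(String url, float score) {
        this.url = url;
        this.score = score;
    }
}

final class UniqueHits {
    // Keeps the first hit seen for each URL and preserves input order.
    // If the input is sorted by descending score, the best-scoring copy survives.
    static List<UrlHit> dedupeByUrl(List<UrlHit> hits) {
        Map<String, UrlHit> byUrl = new LinkedHashMap<>();
        for (UrlHit hit : hits) {
            byUrl.putIfAbsent(hit.url, hit);
        }
        return new ArrayList<>(byUrl.values());
    }

    public static void main(String[] args) {
        List<UrlHit> merged = new ArrayList<>();
        merged.add(new UrlHit("http://example.com/a", 0.9f));
        merged.add(new UrlHit("http://example.com/b", 0.7f));
        merged.add(new UrlHit("http://example.com/a", 0.6f)); // same URL from a second index
        for (UrlHit hit : dedupeByUrl(merged)) {
            System.out.println(hit.url + " " + hit.score);
        }
    }
}
```

This runs in a single pass over the merged hits with one map lookup per hit. Whether the first copy of a duplicate is the right one to keep depends on how the merged results are ordered, which is an assumption here.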