From: Michael McCandless
Date: Sat, 12 May 2012 16:12:31 -0400
Subject: Re: Lucene's internal doc ID space
To: java-user@lucene.apache.org

On Sat, May 12, 2012 at 9:12 AM, Valeriy Felberg wrote:
>> the Document IDs in Lucene are per segment. ie. they are always
>> segment based.
>
> @Simon I'm just wondering: If the document IDs are per segment how
> does it work if I call Searcher.search(Query, int) and get TopDocs
> referencing ScoreDocs which contain document IDs? What happens if
> there are two matching documents in different segments? How does
> Lucene know which segment is meant if I call Searcher.doc(docId) with
> some docId from the search result?

The per-segment docIDs are "rebased" before Searcher.search returns,
i.e. turned into global docIDs against the top reader.

Also: when a merge runs, it removes any deleted docIDs (thus renumbering
all non-deleted docIDs)...

Mike McCandless

http://blog.mikemccandless.com
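
The rebasing and renumbering described in the reply can be illustrated with a small standalone sketch. This is not Lucene's code and uses no Lucene classes: the class `DocIdRebase` and all of its method names are hypothetical, and it assumes that each segment's base offset is the sum of the document counts of the segments before it, that no segment is empty, and that a merge keeps the surviving documents in their original order.

```java
import java.util.Arrays;

/** Illustrative only: a plain-Java model of docID rebasing, not Lucene's implementation. */
final class DocIdRebase {

    /** Base offset of each segment: the total size of all earlier segments. */
    static int[] docBases(int[] segmentMaxDocs) {
        int[] bases = new int[segmentMaxDocs.length];
        int next = 0;
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            bases[i] = next;
            next += segmentMaxDocs[i];
        }
        return bases;
    }

    /** Per-segment docID -> global docID against the top-level reader. */
    static int toGlobal(int[] bases, int segment, int localDocId) {
        return bases[segment] + localDocId;
    }

    /** Global docID -> index of the segment that holds it (assumes no empty segments). */
    static int segmentOf(int[] bases, int globalDocId) {
        int pos = Arrays.binarySearch(bases, globalDocId);
        // Not found: binarySearch returns -(insertionPoint) - 1; we want insertionPoint - 1.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Global docID -> docID local to its segment. */
    static int toLocal(int[] bases, int globalDocId) {
        return globalDocId - bases[segmentOf(bases, globalDocId)];
    }

    /** Old docID -> new docID after deleted docs are dropped; -1 marks a deleted doc. */
    static int[] renumberAfterMerge(boolean[] deleted) {
        int[] newIds = new int[deleted.length];
        int next = 0;
        for (int i = 0; i < deleted.length; i++) {
            newIds[i] = deleted[i] ? -1 : next++;
        }
        return newIds;
    }

    public static void main(String[] args) {
        int[] bases = docBases(new int[] {5, 3, 4});   // three segments
        int global = toGlobal(bases, 1, 2);            // doc 2 of segment 1
        System.out.println("global docID: " + global);
        System.out.println("segment: " + segmentOf(bases, global)
                + ", local docID: " + toLocal(bases, global));

        boolean[] deleted = {false, true, false, false, true, false};
        System.out.println("after merge: " + Arrays.toString(renumberAfterMerge(deleted)));
    }
}
```

In this model a search result carries the global docID, and a later lookup by that docID finds the owning segment by searching the base offsets. The renumbering step also shows why a docID should not be kept as a stable identifier across merges: surviving documents can receive different numbers once deleted ones are dropped.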