Date: Wed, 28 Oct 2009 14:50:58 +0200
Message-ID: <2ffb6d060910280550v2469e2b3v89355d39c0196613@mail.gmail.com>
In-Reply-To: <26093125.post@talk.nabble.com>
Subject: Re: Adding segments to an optimized index
From: Danil ŢORIN
To: java-user@lucene.apache.org

There is no such thing in Lucene as a "unique" document.
Documents may be unique from your application's point of view (for example,
they have some ID that is unique).
From Lucene's point of view it is perfectly fine to have duplicate documents.

So the "deleted" documents in the combined index are coming from your second
index.
Even more: if you search your combined index you will see duplicate documents
that came from the first index and were not deleted.

That is because Lucene simply adds to the combined index all documents that
aren't marked as deleted.

Remember that a document is (kind of) opaque to Lucene: it doesn't have
(and doesn't need) any logic to handle such situations. These should be
handled by your application.

On Wed, Oct 28, 2009 at 13:36, Marc Sturlese wrote:
>
> I am doing some tests with optimize and adding segments, and I am wondering if
> someone knows whether what I am doing can cause document inconsistency.
> I have 2 folders with one index each. One has a non-optimized index1 with 1
> million docs and mergeFactor=10. The other one, index2, has the same index
> optimized with compound file. I add and delete some documents in the
> non-optimized index1, and a few segments disappear and some new ones are created.
> I now copy the newly created files into the optimized index2 and optimize it again.
> I get no errors doing that but... will the documents be the same in index1 and
> index2? I am asking because when I added some docs and deleted others in
> index1 some segments disappeared, and index2 is supposed to still have those
> segments optimized with the others... or doesn't it work this way?
>
> What I am trying to explain is:
>
> index1:
> seg1,seg2,seg3,seg4,seg5
> index2: (index1 optimized with compound)
> seg8
>
> adding and deleting docs in index1 will give:
> seg1,seg2,seg3,seg6 (seg4 and seg5 have disappeared and seg6 has been
> created)
> now I do in index2:
> seg8+seg6+optimize=seg9 (but seg8 is supposed to still contain seg4 and seg5)
>
> The question is: will index1 (seg1,seg2,seg3,seg6) and index2 (seg9) contain
> the same docs?
>
> Thanks in advance, and please let me know if I wasn't clear in my
> explanation.
> --
> View this message in context: http://www.nabble.com/Adding-segments-to-an-optimized-index-tp26093125p26093125.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
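
To make the point of the reply concrete, here is a rough sketch in plain Java.
It is only a toy model: the names MergeSketch, Doc, merge and
deleteSupersededIds are invented for illustration and are not Lucene's API.
It assumes a merge copies every document that is not marked deleted and never
compares IDs, which is the behaviour described in the reply.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: hypothetical classes, NOT Lucene's API. */
public class MergeSketch {

    /** A stored document: an application-level id plus a "deleted" mark. */
    static final class Doc {
        final String id;
        boolean deleted;
        Doc(String id) { this.id = id; }
    }

    /** Mimics a merge: copies every doc not marked deleted; ids are never compared. */
    static List<Doc> merge(List<Doc> target, List<Doc> incoming) {
        List<Doc> merged = new ArrayList<>();
        for (Doc d : target)   if (!d.deleted) merged.add(new Doc(d.id));
        for (Doc d : incoming) if (!d.deleted) merged.add(new Doc(d.id));
        return merged;
    }

    /** Application-level step: mark target docs deleted when the incoming side has the same id. */
    static void deleteSupersededIds(List<Doc> target, List<Doc> incoming) {
        Set<String> incomingIds = new HashSet<>();
        for (Doc d : incoming) if (!d.deleted) incomingIds.add(d.id);
        for (Doc d : target) if (incomingIds.contains(d.id)) d.deleted = true;
    }

    /** Builds a small in-memory "index" from a list of ids. */
    static List<Doc> docs(String... ids) {
        List<Doc> index = new ArrayList<>();
        for (String id : ids) index.add(new Doc(id));
        return index;
    }

    /** Counts live documents carrying the given id. */
    static long count(List<Doc> index, String id) {
        return index.stream().filter(d -> !d.deleted && d.id.equals(id)).count();
    }

    public static void main(String[] args) {
        List<Doc> optimized = docs("a", "b", "c"); // stands in for the optimized index
        List<Doc> newSegment = docs("b", "d");     // "b" was re-added in the other index

        // Naive merge: both copies of "b" survive.
        System.out.println(count(merge(optimized, newSegment), "b")); // prints 2

        // Delete by id in the target first, then merge: one copy of "b" remains.
        deleteSupersededIds(optimized, newSegment);
        System.out.println(count(merge(optimized, newSegment), "b")); // prints 1
    }
}
```

In a real application the delete-by-id step would presumably be done with
something like IndexWriter.deleteDocuments(Term) or
IndexWriter.updateDocument(Term, Document) on a unique ID field. Note that the
sketch covers only re-added documents; deletions made in the first index
against documents that already live in the optimized index would likewise
have to be replayed by the application.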