Subject: Re: optimize status
From: Summer Shire
Date: Mon, 29 Jun 2015 20:48:27 -0700
To: solr-user@lucene.apache.org

Hi Upayavira and Erick,

There are two things we are talking about here.

First: Why am I optimizing? If I don't, our SEARCH (NOT INDEXING) performance is 100% worse.
The problem lies in the total number of segments.
We have to have max segments of 1 or 2.
I have done intensive performance-related tests around the number of segments, the merge factor, and changing the merge policy.

Second: Solr does not perform better for me without an optimize. So now that I have to optimize, the second issue
is updating concurrently during an optimize. If I update while an optimize is happening, the optimize takes 5 times as long as a normal optimize.

So is there any way other than creating a postOptimize hook, writing the status to a file, and somehow making it available to the indexer?
All of this just sounds traumatic :)

Thanks
Summer

> On Jun 29, 2015, at 5:40 AM, Erick Erickson wrote:
>
> Steven:
>
> Yes, but....
>
> First, here's Mike McCandless's excellent blog on segment merging:
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> I think the third animation is the TieredMergePolicy. In short, yes an
> optimize will reclaim disk space. But as you update, this is done for
> you anyway. About the only time optimizing is at all beneficial is
> when you have a relatively static index. If you're continually
> updating documents, and by that I mean replacing some existing
> documents, then you'll immediately start generating "holes" in your
> index.
>
> And if you _do_ optimize, you wind up with a huge segment. And since
> the default policy tries to merge segments of roughly the same size,
> it accumulates deletes for quite a while before they're merged away.
>
> And if you don't update existing docs or delete docs, then there's no
> wasted space anyway.
>
> Summer:
>
> First off, why do you care about not updating during optimizing?
> There's no good reason you have to worry about that, you can freely
> update while optimizing.
>
> But frankly I have to agree with Upayavira that on the face of it
> you're doing a lot of extra work.
> See above, but you optimize while
> indexing, so immediately you're rather defeating the purpose.
> Personally I'd only optimize relatively static indexes and, by
> definition, your index isn't static since the second process is just
> waiting to modify it.
>
> Best,
> Erick
>
> On Mon, Jun 29, 2015 at 8:15 AM, Steven White wrote:
>> Hi Upayavira,
>>
>> This is news to me that we should not optimize and index.
>>
>> What about disk space saving, isn't optimization to reclaim disk space or
>> does Solr somehow do that? Where can I read more about this?
>>
>> I'm on Solr 5.1.0 (may switch to 5.2.1)
>>
>> Thanks
>>
>> Steve
>>
>> On Mon, Jun 29, 2015 at 4:16 AM, Upayavira wrote:
>>
>>> I'm afraid I don't understand. You're saying that optimising is causing
>>> performance issues?
>>>
>>> Simple solution: DO NOT OPTIMIZE!
>>>
>>> Optimisation is very badly named. What it does is squash all segments
>>> in your index into one segment, removing all deleted documents. It is
>>> good to get rid of deletes - in that sense the index is "optimized".
>>> However, future merges become very expensive. The best way to handle
>>> this topic is to leave it to Lucene/Solr to do it for you. Pretend the
>>> "optimize" option never existed.
>>>
>>> This is, of course, assuming you are using something like Solr 3.5+.
>>>
>>> Upayavira
>>>
>>> On Mon, Jun 29, 2015, at 08:08 AM, Summer Shire wrote:
>>>>
>>>> Have to, because of performance issues.
>>>> Just want to know if there is a way to tap into the status.
>>>>
>>>>> On Jun 28, 2015, at 11:37 PM, Upayavira wrote:
>>>>>
>>>>> Bigger question, why are you optimizing? Since 3.6 or so, it generally
>>>>> hasn't been required; it is even a bad thing.
>>>>>
>>>>> Upayavira
>>>>>
>>>>>> On Sun, Jun 28, 2015, at 09:37 PM, Summer Shire wrote:
>>>>>> Hi All,
>>>>>>
>>>>>> I have two indexers (independent processes) writing to a common Solr
>>>>>> core.
>>>>>> If one indexer process issues an optimize on the core,
>>>>>> I want the second indexer to wait before adding docs until the optimize has
>>>>>> finished.
>>>>>>
>>>>>> Are there ways I can do this programmatically?
>>>>>> Pinging the core while the optimize is happening returns OK because technically
>>>>>> Solr allows you to update while an optimize is happening.
>>>>>>
>>>>>> Any suggestions?
>>>>>>
>>>>>> Thanks,
>>>>>> Summer
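
For the original question quoted above (making the second indexer wait while the first one optimizes), here is a minimal sketch of the "status in a file" idea, done on the indexer side rather than in a postOptimize hook. It is not something Solr provides. It assumes both indexers run where they can see the same filesystem, and that the optimize request sent by the first indexer blocks until the merge has finished. LOCK_PATH and every function name are hypothetical, and the actual Solr calls are passed in as plain callables (send_optimize, send_docs) so that no client API is guessed at.

```python
import os
import time
from contextlib import contextmanager

# Hypothetical path; any location both indexer processes can reach would do.
LOCK_PATH = "/tmp/solr-optimize.lock"


@contextmanager
def optimize_lock(path=LOCK_PATH):
    """Hold an exclusive lock file for the duration of an optimize."""
    # O_CREAT | O_EXCL makes creation atomic: it raises FileExistsError
    # if another process already holds the lock.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(path)


def wait_until_unlocked(path=LOCK_PATH, poll_seconds=1.0, timeout=None):
    """Block until the lock file is gone. Returns False if the timeout expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while os.path.exists(path):
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    return True


def run_optimize(send_optimize, path=LOCK_PATH):
    """Indexer A: run the (assumed blocking) optimize request under the lock."""
    with optimize_lock(path):
        send_optimize()


def add_docs(send_docs, path=LOCK_PATH, **wait_kwargs):
    """Indexer B: wait for any running optimize to finish, then send documents."""
    if not wait_until_unlocked(path, **wait_kwargs):
        raise TimeoutError("optimize still running")
    send_docs()
```

Indexer A would wrap its optimize request in run_optimize(...) and indexer B would route its updates through add_docs(...). This only narrows the overlap: an optimize can still start just after indexer B has checked for the file, and a crashed indexer A can leave a stale lock behind, so a real setup would need a timeout and some cleanup policy.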