From: Erick Erickson
Date: Sat, 29 Oct 2016 09:21:20 -0700
Subject: Re: Questions about Disk space Usage
To: solr-user

I would also expect a totally empty segment to be merged very quickly,
as the percentage of deleted documents weighs heavily when determining
whether to merge a segment... but that's based on principle, not deep
code knowledge.

Best,
Erick

On Fri, Oct 28, 2016 at 6:02 PM, Walter Underwood wrote:
> After the merge. That is what merges do, clean up segments.
>
> I expect it is very rare for a segment to be 100% deleted docs, so it isn’t
> worth handling that case.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
>
>> On Oct 28, 2016, at 5:54 PM, Alexandre Rafalovitch wrote:
>>
>> Doesn't the segment that only has deleted documents just get dropped?
>> Or does it get dropped _after_ the merge and therefore still sit
>> around?
>>
>> Regards,
>> Alex.
>> ----
>> Solr Example reading group is starting November 2016, join us at
>> http://j.mp/SolrERG
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 29 October 2016 at 08:53, Walter Underwood wrote:
>>> It is normal for disk usage to double. Under controlled circumstances,
>>> it can triple, but that probably won’t happen.
>>>
>>> This is the second time today that I’ve sent this information to the list.
>>>
>>> It can use nearly 2X the space whenever the largest segment(s) are
>>> merged, especially if there are only a few smaller segments.
>>>
>>> In order to use 3X the space, you need to:
>>>
>>> 1. Disable merging.
>>> 2. Delete all the documents.
>>> 3. Add all the documents.
>>> 4. Enable merging.
>>>
>>> This causes one complete set of segments that are 100% deletes,
>>> one set that is 0% deletes, then the merge creates another set that
>>> is 0% deletes. During the merge, the old files remain while the
>>> new one is created.
>>>
>>> wunder
>>> Walter Underwood
>>> wunder@wunderwood.org
>>> http://observer.wunderwood.org/ (my blog)
>>>
>>>
>>>> On Oct 28, 2016, at 2:41 PM, Alexandre Rafalovitch wrote:
>>>>
>>>> 2) Is probably a merge operation. Lucene index segments are not
>>>> rewritable in place, so the merge creates a new file, does everything
>>>> to it, then switches to it.
>>>>
>>>> I remember the number was that the space could temporarily triple
>>>> (?!?) though that may have been before the tiered merge policy.
>>>>
>>>> 3) It should be safe to delete old log files. It is standard log4j stuff.
>>>>
>>>> ----
>>>> Solr Example reading group is starting November 2016, join us at
>>>> http://j.mp/SolrERG
>>>> Newsletter and resources for Solr beginners and intermediates:
>>>> http://www.solr-start.com/
>>>>
>>>>
>>>> On 29 October 2016 at 06:55, Jamal, Sarfaraz
>>>> wrote:
>>>>> Hi Guys,
>>>>>
>>>>> I am currently investigating an instance of Solr's disk space usage and I had a few questions I thought you guys might be able to help answer.
>>>>>
>>>>> First Question
>>>>> * There is 30 GB worth of autosuggest data in the /tmp folder. Each file is half of a gigabyte.
>>>>> Is it safe to delete those files?
>>>>>
>>>>> Second Question
>>>>> Also, we notice that at times the disk runs down to only having a few gigabytes available, and then goes back to having more space (the index file literally grows and then shrinks).
>>>>>
>>>>> Third Question
>>>>> Is it also safe to delete the log files?
>>>>>
>>>>> We run a database indexer on a set interval, perhaps that is relevant to this discussion.
>>>>>
>>>>> Sas
>>>
>
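
The 2X and 3X figures in Walter Underwood's quoted reply can be checked with some back-of-the-envelope arithmetic. The sketch below is only a simplified model of that reasoning, not Lucene's or Solr's actual accounting: the function name `peak_disk_usage` and its parameters are hypothetical, and it assumes the old segment files stay on disk until the merge finishes writing the new ones, as the thread describes.

```python
def peak_disk_usage(live_gb, merged_fraction=1.0, stale_copies=0):
    """Rough peak on-disk index size (GB) while a merge is running.

    Hypothetical helper for illustration only.

    live_gb         -- size of the index holding only live (non-deleted) docs
    merged_fraction -- share of the live index rewritten by the merge (0..1)
    stale_copies    -- full sets of 100%-deleted segments still on disk
    """
    if not 0.0 <= merged_fraction <= 1.0:
        raise ValueError("merged_fraction must be between 0 and 1")
    # Old files are kept until the merge completes: the live segments
    # plus any segment sets that consist entirely of deleted documents.
    old_segments = live_gb * (1 + stale_copies)
    # The merge writes a fresh copy of the live documents it covers;
    # deleted documents contribute nothing to the new segments.
    new_segments = live_gb * merged_fraction
    return old_segments + new_segments


# Merging the largest segment(s), covering nearly the whole index: about 2X.
normal_peak = peak_disk_usage(10)

# Merging disabled, everything deleted and re-added, merging re-enabled:
# one 100%-deleted set + one live set + the newly merged set, about 3X.
worst_peak = peak_disk_usage(10, stale_copies=1)
```

With a 10 GB index, `normal_peak` comes out at 20 GB and `worst_peak` at 30 GB, which matches the "disk runs down, then recovers" pattern in the original question once the merge completes and the old files are removed.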