lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Questions about Disk space Usage
Date Sat, 29 Oct 2016 16:21:20 GMT
I would also expect a totally empty segment to be merged very quickly,
as the percentage of deleted documents weighs heavily when determining
whether to merge a segment... but that's based on principle, not deep
code knowledge.

Best,
Erick

On Fri, Oct 28, 2016 at 6:02 PM, Walter Underwood <wunder@wunderwood.org> wrote:
> After the merge. That is what merges do, clean up segments.
>
> I expect it is very rare for a segment to be 100% deleted docs, so it isn’t
> worth handling that case.
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
>> On Oct 28, 2016, at 5:54 PM, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
>>
>> Doesn't a segment that only has deleted documents just get dropped?
>> Or does it get dropped _after_ the merge and therefore still sit
>> around?
>>
>> Regards,
>>   Alex.
>> ----
>> Solr Example reading group is starting November 2016, join us at
>> http://j.mp/SolrERG
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 29 October 2016 at 08:53, Walter Underwood <wunder@wunderwood.org> wrote:
>>> It is normal for disk usage to double. Under controlled circumstances,
>>> it can triple, but that probably won’t happen.
>>>
>>> This is the second time today that I’ve sent this information to the list.
>>>
>>> It can use nearly 2X the space whenever the largest segment(s) are
>>> merged, especially if there are only a few smaller segments.
>>>
>>> In order to use 3X the space, you need to:
>>>
>>> 1. Disable merging.
>>> 2. Delete all the documents.
>>> 3. Add all the documents.
>>> 4. Enable merging.
>>>
>>> This causes one complete set of segments that are 100% deletes,
>>> one set that is 0% deletes, then the merge creates another set that
>>> is 0% deletes. During the merge, the old files remain while the
>>> new one is created.
>>>
>>> wunder
>>> Walter Underwood
>>> wunder@wunderwood.org
>>> http://observer.wunderwood.org/  (my blog)
>>>
>>>
>>>> On Oct 28, 2016, at 2:41 PM, Alexandre Rafalovitch <arafalov@gmail.com> wrote:
>>>>
>>>> 2) Is probably a merge operation. Lucene index segments are not
>>>> rewritable in place, so the merge writes a new segment, and the index
>>>> switches to it once it is complete.
>>>>
>>>> I remember the number was that the space could temporarily triple
>>>> (?!?) though that may have been before the tiered merge policy.
>>>>
>>>> 3) It should be safe to delete old log files. It is standard log4j stuff.
>>>>
>>>> ----
>>>> Solr Example reading group is starting November 2016, join us at
>>>> http://j.mp/SolrERG
>>>> Newsletter and resources for Solr beginners and intermediates:
>>>> http://www.solr-start.com/
>>>>
>>>>
>>>> On 29 October 2016 at 06:55, Jamal, Sarfaraz
>>>> <Sarfaraz.Jamal@verizonwireless.com.invalid> wrote:
>>>>> Hi Guys,
>>>>>
>>>>> I am currently investigating an instance of Solr's Disk space usage and
>>>>> I had a few questions I thought you guys might be able to help answer.
>>>>>
>>>>> First Question
>>>>> * There is 30 GB worth of autosuggest data in the /tmp folder. Each
>>>>> file is half a gigabyte.
>>>>> Is it safe to delete those files?
>>>>>
>>>>> Second Question
>>>>> Also, we notice that at times the disk runs down to only having a few
>>>>> gigabytes available, and then goes back to having more space (the index
>>>>> file literally grows and then shrinks).
>>>>>
>>>>> Third Question
>>>>> Is it also safe to delete the log files?
>>>>>
>>>>> We run a database indexer on a set interval; perhaps that is relevant
>>>>> to this discussion.
>>>>>
>>>>> Sas
>>>
>
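The 2X and 3X figures in Walter's message can be checked with a little arithmetic. The sketch below is only an illustration of that reasoning, not anything taken from Lucene or Solr: the function name `peak_disk_usage` and the 10 GB index size are hypothetical, and it assumes the merge rewrites the whole index into output as large as the live input.

```python
def peak_disk_usage(live_index_gb, fully_deleted_gb=0.0):
    """Estimate peak disk usage (GB) while a merge rewrites the whole index.

    live_index_gb: size of the segments that hold live documents.
    fully_deleted_gb: size of segments that hold only deleted documents
        and have not been cleaned up yet.
    Both parameters are hypothetical inputs for this sketch.
    """
    # Old segments stay on disk until the merged segment is complete,
    # so the merge output is counted on top of everything already there.
    # Assumption: the output is as large as the live input.
    merged_output_gb = live_index_gb
    return fully_deleted_gb + live_index_gb + merged_output_gb


# Usual case: merging the largest segments roughly doubles usage.
normal = peak_disk_usage(live_index_gb=10.0)

# Case from the thread: merging disabled, all documents deleted and
# re-added, then merging re-enabled.
worst = peak_disk_usage(live_index_gb=10.0, fully_deleted_gb=10.0)

print(f"normal peak: {normal} GB, worst-case peak: {worst} GB")
```

For a 10 GB index this gives a 20 GB peak in the usual case and 30 GB in the delete-everything-and-re-add case, which matches the "nearly 2X" and "3X" estimates quoted above.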
