lucene-solr-user mailing list archives

From Summer Shire <shiresum...@gmail.com>
Subject Re: optimize status
Date Tue, 30 Jun 2015 03:48:27 GMT
Hi Upayavira and Erick,

There are two things we are talking about here.

First: Why am I optimizing? If I don’t, our SEARCH (NOT INDEXING) performance is 100% worse.

The problem lies in the total number of segments. We have to keep the index at a maximum of 1 or 2 segments.
I have run intensive performance tests around the number of segments, the merge factor, and
changing the merge policy.

Second: Solr does not perform better for me without an optimize. So now that I have to optimize,
the second issue is updating concurrently during an optimize. If I update while an optimize is
happening, the optimize takes 5 times as long as a normal optimize.

So is there any way other than creating a postOptimize hook, writing the status to a file,
and somehow making it available to the indexer?
All of this just sounds traumatic :)

Thanks
Summer


> On Jun 29, 2015, at 5:40 AM, Erick Erickson <erickerickson@gmail.com> wrote:
> 
> Steven:
> 
> Yes, but....
> 
> First, here's Mike McCandless' excellent blog post on segment merging:
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
> 
> I think the third animation is the TieredMergePolicy. In short, yes an
> optimize will reclaim disk space. But as you update, this is done for
> you anyway. About the only time optimizing is at all beneficial is
> when you have a relatively static index. If you're continually
> updating documents, and by that I mean replacing some existing
> documents, then you'll immediately start generating "holes" in your
> index.
> 
> And if you _do_ optimize, you wind up with a huge segment. And since
> the default policy tries to merge segments of roughly the same size,
> that segment accumulates deletes for quite a while before they are merged away.
> 
> And if you don't update existing docs or delete docs, then there's no
> wasted space anyway.
> 
> Summer:
> 
> First off, why do you care about not updating during optimizing?
> There's no good reason you have to worry about that, you can freely
> update while optimizing.
> 
> But frankly I have to agree with Upayavira that on the face of it
> you're doing a lot of extra work. See above, but you optimize while
> indexing, so immediately you're rather defeating the purpose.
> Personally I'd only optimize relatively static indexes and, by
> definition, your index isn't static since the second process is just
> waiting to modify it.
> 
> Best,
> Erick
> 
> On Mon, Jun 29, 2015 at 8:15 AM, Steven White <swhite4141@gmail.com> wrote:
>> Hi Upayavira,
>> 
>> This is news to me that we should not optimize and index.
>> 
>> What about saving disk space? Isn't optimization meant to reclaim disk space, or
>> does Solr somehow do that itself?  Where can I read more about this?
>> 
>> I'm on Solr 5.1.0 (may switch to 5.2.1)
>> 
>> Thanks
>> 
>> Steve
>> 
>> On Mon, Jun 29, 2015 at 4:16 AM, Upayavira <uv@odoko.co.uk> wrote:
>> 
>>> I'm afraid I don't understand. You're saying that optimising is causing
>>> performance issues?
>>> 
>>> Simple solution: DO NOT OPTIMIZE!
>>> 
>>> Optimisation is very badly named. What it does is squash all segments
>>> in your index into one segment, removing all deleted documents. It is
>>> good to get rid of deletes - in that sense the index is "optimized".
>>> However, future merges become very expensive. The best way to handle
>>> this topic is to leave it to Lucene/Solr to do it for you. Pretend the
>>> "optimize" option never existed.
>>> 
>>> This is, of course, assuming you are using something like Solr 3.5+.
>>> 
>>> Upayavira
>>> 
>>> On Mon, Jun 29, 2015, at 08:08 AM, Summer Shire wrote:
>>>> 
>>>> Have to, because of performance issues.
>>>> Just want to know if there is a way to tap into the status.
>>>> 
>>>>> On Jun 28, 2015, at 11:37 PM, Upayavira <uv@odoko.co.uk> wrote:
>>>>> 
>>>>> Bigger question, why are you optimizing? Since 3.6 or so, it generally
>>>>> hasn't been required and, if anything, is a bad thing.
>>>>> 
>>>>> Upayavira
>>>>> 
>>>>>> On Sun, Jun 28, 2015, at 09:37 PM, Summer Shire wrote:
>>>>>> Hi All,
>>>>>> 
>>>>>> I have two indexers (independent processes) writing to a common solr
>>>>>> core.
>>>>>> If one indexer process issues an optimize on the core,
>>>>>> I want the second indexer to wait before adding docs until the optimize has
>>>>>> finished.
>>>>>> 
>>>>>> Are there ways I can do this programmatically?
>>>>>> pinging the core when the optimize is happening is returning OK because
>>>>>> technically
>>>>>> solr allows you to update when an optimize is happening.
>>>>>> 
>>>>>> any suggestions ?
>>>>>> 
>>>>>> thanks,
>>>>>> Summer
>>> 

