lucene-solr-user mailing list archives

From Steven White <swhite4...@gmail.com>
Subject Re: optimize status
Date Mon, 29 Jun 2015 18:17:40 GMT
Thank you guys, this was very helpful.  I was always under the impression
that the index needs to be optimized periodically to reclaim disk space,
otherwise the index will just keep on growing and growing (was that the
case in Lucene 2.x and prior days?).

I agree with Walter: renaming "optimize" to something else, even "force
merge", would be better.  However, make sure it has proper documentation
explaining what it does and why it's not suitable for live data.
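
As a rough sketch of what such documentation could show, here is how a full
"optimize" request differs from a lighter commit that only purges deletes.
The host and core name are made up, and update_url is a hypothetical helper,
not part of any Solr client. The optimize, maxSegments, commit and
expungeDeletes parameters are, to my knowledge, the ones the Solr update
handler accepts.

```python
from urllib.parse import urlencode

# Hypothetical base URL and core name, for illustration only.
SOLR_CORE_URL = "http://localhost:8983/solr/mycore"


def update_url(base_url, **params):
    """Build a Solr /update request URL from keyword parameters."""
    # Booleans are lower-cased because Solr expects "true"/"false".
    encoded = urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    )
    return f"{base_url}/update?{encoded}"


# Full forced merge down to one segment (what "optimize" does).
force_merge = update_url(SOLR_CORE_URL, optimize=True, maxSegments=1)

# Lighter alternative: commit and ask Solr to merge away deleted documents.
expunge = update_url(SOLR_CORE_URL, commit=True, expungeDeletes=True)

print(force_merge)
print(expunge)
```

Nothing is sent here; the script only prints the two URLs so the difference
between the requests is visible side by side.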

Steve

On Mon, Jun 29, 2015 at 12:37 PM, Reitzel, Charles <
Charles.Reitzel@tiaa-cref.org> wrote:

> Is there really a good reason to consolidate down to a single segment?
>
> Any incremental query performance benefit is tiny compared to the loss of
> manageability.
>
> I.e. shouldn't segments _always_ be kept small enough to facilitate
> re-balancing data across shards?   Even in non-cloud instances this is
> true.  When a collection grows, you may want to shard/split an existing index
> by adding a node and moving some segments around.    Isn't this the
> direction Solr is going?   With many, smaller segments, this is feasible.
> With "one big segment", the collection must always be reindexed.
>
> Thus, "optimize" would mean, "get rid of all deleted records" and would,
> in fact, optimize queries by eliminating wasted I/O.   Perhaps worth it for
> slowly changing indexes.   Seems like the Tiered merge policy is 90% there
> ...    Or am I all wet (again)?
>
> -----Original Message-----
> From: Walter Underwood [mailto:wunder@wunderwood.org]
> Sent: Monday, June 29, 2015 10:39 AM
> To: solr-user@lucene.apache.org
> Subject: Re: optimize status
>
> "Optimize" is a manual full merge.
>
> Solr automatically merges segments as needed. This also expunges deleted
> documents.
>
> We really need to rename "optimize" to "force merge". Is there a Jira for
> that?
>
> wunder
> Walter Underwood
> wunder@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
> On Jun 29, 2015, at 5:15 AM, Steven White <swhite4141@gmail.com> wrote:
>
> > Hi Upayavira,
> >
> > This is news to me that we should not optimize an index.
> >
> > What about disk space savings? Isn't optimization meant to reclaim disk
> > space, or does Solr somehow do that itself?  Where can I read more about this?
> >
> > I'm on Solr 5.1.0 (may switch to 5.2.1)
> >
> > Thanks
> >
> > Steve
> >
> > On Mon, Jun 29, 2015 at 4:16 AM, Upayavira <uv@odoko.co.uk> wrote:
> >
> >> I'm afraid I don't understand. You're saying that optimising is
> >> causing performance issues?
> >>
> >> Simple solution: DO NOT OPTIMIZE!
> >>
> >> Optimisation is very badly named. What it does is squash all
> >> segments in your index into one segment, removing all deleted
> >> documents. It is good to get rid of deletes - in that sense the index
> >> is "optimized".
> >> However, future merges become very expensive. The best way to handle
> >> this topic is to leave it to Lucene/Solr to do it for you. Pretend
> >> the "optimize" option never existed.
> >>
> >> This is, of course, assuming you are using something like Solr 3.5+.
> >>
> >> Upayavira
> >>
> >> On Mon, Jun 29, 2015, at 08:08 AM, Summer Shire wrote:
> >>>
> >>> Have to because of performance issues.
> >>> Just want to know if there is a way to tap into the status.
> >>>
> >>>> On Jun 28, 2015, at 11:37 PM, Upayavira <uv@odoko.co.uk> wrote:
> >>>>
> >>>> Bigger question, why are you optimizing? Since 3.6 or so, it
> >>>> generally hasn't been required, and is even a bad thing.
> >>>>
> >>>> Upayavira
> >>>>
> >>>>> On Sun, Jun 28, 2015, at 09:37 PM, Summer Shire wrote:
> >>>>> Hi All,
> >>>>>
> >>>>> I have two indexers (independent processes) writing to a common
> >>>>> Solr core.
> >>>>> If one indexer process issues an optimize on the core, I want the
> >>>>> second indexer to wait before adding docs until the optimize has
> >>>>> finished.
> >>>>>
> >>>>> Are there ways I can do this programmatically?
> >>>>> Pinging the core while the optimize is happening returns OK because,
> >>>>> technically, Solr allows you to update while an optimize is happening.
> >>>>>
> >>>>> any suggestions ?
> >>>>>
> >>>>> thanks,
> >>>>> Summer
> >>
