lucene-solr-user mailing list archives

From Brendan Grainger <brendan.grain...@gmail.com>
Subject Re: 'Optimizing' Solr Index Size
Date Wed, 07 Aug 2013 14:43:26 GMT
Thanks Erick, our index is relatively static. I think the deletes must be
coming from 'reindexing' the same documents, so it's definitely handy to
recover the space. I've seen that video before. Definitely very interesting.

Brendan


On Wed, Aug 7, 2013 at 8:04 AM, Erick Erickson <erickerickson@gmail.com> wrote:

> The general advice is to not merge (optimize) unless your
> index is relatively static. You're quite correct: optimizing
> simply recovers the space from deleted documents. Otherwise
> it won't change much (except for leaving fewer segments).
>
> Here's a _great_ video that Mike McCandless put together:
>
> http://blog.mikemccandless.com/2011/02/visualizing-lucenes-segment-merges.html
>
> But in general _whenever_ segments are merged, the
> resulting segment will have all the data from deleted docs
> removed, and segments are merged continually when
> data is being added to the index.
>
> Here's a quick-n-dirty way to estimate the space savings
> optimize will give you: look at the admin page for the core. The
> ratio of deleted docs to numDocs is roughly the fraction of unused
> space that would be regained by an optimize. From there it's
> your call <G>...
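
A minimal sketch of that estimate. It assumes documents are roughly the same size and divides by live plus deleted docs, so the result is a fraction of the whole index. The function name and the document counts are made up for illustration; only the 200 GB index size comes from this thread.

```python
def estimate_reclaimable_fraction(num_docs: int, deleted_docs: int) -> float:
    """Rough fraction of the index taken up by deleted documents.

    num_docs is the live document count and deleted_docs the deleted count,
    both read by hand from the core's admin page.
    """
    total_docs = num_docs + deleted_docs  # live + deleted
    if total_docs == 0:
        return 0.0
    return deleted_docs / total_docs


# Hypothetical counts, not taken from the thread.
fraction = estimate_reclaimable_fraction(num_docs=3_000_000, deleted_docs=1_000_000)
index_size_gb = 200  # index size mentioned in the thread
print(f"roughly {fraction * index_size_gb:.0f} GB might be reclaimed")
```

This is only a ballpark figure: deleted documents need not be the same size as live ones.
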
>
> Best
> Erick
>
>
> On Tue, Aug 6, 2013 at 12:02 PM, Brendan Grainger <
> brendan.grainger@gmail.com> wrote:
>
> > To maybe answer another one of my questions, about the 50GB recovered when
> > running:
> >
> > curl 'http://localhost:8983/solr/update?optimize=true&maxSegments=10&waitFlush=false'
> >
> > It looks to me like it came from deleted docs being completely removed
> > from the index.
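
For scripting the same request, here is a small sketch that only assembles the URL used in the curl command above. The helper name `build_optimize_url` is hypothetical; the host, path, and the three parameters are the ones from this thread, and nothing is actually sent to Solr.

```python
from urllib.parse import urlencode


def build_optimize_url(base_url: str, max_segments: int, wait_flush: bool = False) -> str:
    """Build the update URL with the optimize parameters quoted in the thread."""
    params = {
        "optimize": "true",
        "maxSegments": str(max_segments),
        "waitFlush": "true" if wait_flush else "false",
    }
    return f"{base_url}/update?{urlencode(params)}"


url = build_optimize_url("http://localhost:8983/solr", max_segments=10)
print(url)
```

The printed URL can be passed to curl as before, or fetched with `urllib.request.urlopen(url)` from the same script.
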
> >
> > Thanks
> >
> >
> >
> > On Tue, Aug 6, 2013 at 11:45 AM, Brendan Grainger <
> > brendan.grainger@gmail.com> wrote:
> >
> > > Well, I guess I can answer one of my questions, which I didn't
> > > explicitly state: how do I force Solr to merge segments down to a
> > > given maximum? I forgot about doing this:
> > >
> > > curl 'http://localhost:8983/solr/update?optimize=true&maxSegments=10&waitFlush=false'
> > >
> > > which reduced the number of segments in my index from 12 to 10.
> > > Amazingly, it also reduced the space used by almost 50GB. Is that even
> > > possible?
> > >
> > > Thanks again
> > > Brendan
> > >
> > >
> > >
> > > On Tue, Aug 6, 2013 at 10:55 AM, Brendan Grainger <
> > > brendan.grainger@gmail.com> wrote:
> > >
> > >> Hi All,
> > >>
> > >> First of all, what I was actually trying to do is get a little space
> > >> back. So if there is a better way to do this by adjusting the
> > >> MergePolicy or something else, please let me know. My index is currently
> > >> 200GB. In the past (Solr 1.4) we've found that optimizing the index will
> > >> double the size of the index temporarily; then, when it's done, we
> > >> usually end up with a smaller index and slightly faster search query
> > >> times.
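
Because an optimize can temporarily double the on-disk size, as described above, it may be worth checking free space first. A rough sketch, assuming the index lives in a local directory; the function names, the `headroom_factor` parameter, and the example path are all hypothetical.

```python
import os
import shutil


def index_size_bytes(index_dir: str) -> int:
    """Sum the sizes of all regular files under index_dir."""
    total = 0
    for root, _dirs, files in os.walk(index_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def has_room_to_optimize(index_dir: str, headroom_factor: float = 1.0) -> bool:
    """True if free space on the index's filesystem is at least
    headroom_factor times the current index size.

    A factor of 1.0 corresponds to the index temporarily doubling.
    """
    needed = index_size_bytes(index_dir) * headroom_factor
    free = shutil.disk_usage(index_dir).free
    return free >= needed


# Example call with a hypothetical path:
# has_room_to_optimize("/var/solr/data/index")
```
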
> > >>
> > >> Should I even bother optimizing? My impression was that with the
> > >> TieredMergePolicy this would be less necessary. Would merging segments
> > >> into larger ones save any space, and if so, is there a way to tell Solr
> > >> to do that?
> > >>
> > >> Thanks
> > >> Brendan
> > >>
> > >
> > >
> > >
> > > --
> > > Brendan Grainger
> > > www.kuripai.com
> > >
> >
> >
> >
> > --
> > Brendan Grainger
> > www.kuripai.com
> >
>



-- 
Brendan Grainger
www.kuripai.com
