lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Solr Segments, Segment Merges,Optimize
Date Sun, 23 Feb 2014 00:36:52 GMT
1> It depends. Soft commits will not add a new segment. Hard commits,
whether openSearcher is true or false, _will_ create a new segment
(assuming documents were indexed since the previous commit).
2> There are metrics for segment merging, but you'll have to dig for them.
3> Well, I'd ask a counter-question. Are you seeing unacceptable
performance? If not, why worry? :)

A better answer is that 24-28 segments is not at all unusual.

By and large, don't bother with optimize/force merge. What I would do is
look at the admin screen and note the percentage of deleted documents.
If it's above some arbitrary number (I typically use 15-20%) and _stays_
there, consider optimizing.
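
As a rough sketch of that rule of thumb: the helper below takes a few
(numDocs, maxDoc) readings noted from the admin screen over time and only
suggests optimizing if the deleted percentage is above the threshold in
every one of them. The function names, the 20% default and the idea of
collecting samples by hand are my own assumptions for illustration, not
anything Solr provides.

```python
def deleted_pct(num_docs: int, max_doc: int) -> float:
    """Percentage of documents in the index that are marked deleted.

    max_doc counts live plus deleted docs; num_docs counts live docs only.
    """
    if max_doc <= 0:
        return 0.0
    return 100.0 * (max_doc - num_docs) / max_doc


def should_consider_optimize(samples, threshold_pct: float = 20.0) -> bool:
    """True only if every (num_docs, max_doc) sample exceeds the threshold.

    Requiring all samples to exceed it approximates "and stays there";
    a single spike that merging later cleans up will not trigger it.
    """
    samples = list(samples)
    if not samples:
        return False
    return all(deleted_pct(n, m) > threshold_pct for n, m in samples)


if __name__ == "__main__":
    # Hypothetical readings taken a few hours apart.
    readings = [(700, 1000), (720, 1000), (750, 1000)]
    print(should_consider_optimize(readings))
```

If normal background merging brings the percentage back under the threshold
in any of the readings, the check returns False and there is nothing to do.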

However! There is a parameter you can set explicitly in solrconfig.xml
(sorry, which one escapes me now) that increases the "weight" given to the
percentage of deleted documents when the merge policy decides which segments
to merge. Upping this number makes the merge policy more aggressive about
merging segments with a higher percentage of deleted docs. But such segments
are already pretty heavily weighted for merging...


Best,
Erick


On Sat, Feb 22, 2014 at 1:23 PM, KNitin <nitin.tnvl@gmail.com> wrote:

> Hi
>
>   I have the following questions
>
>
>    1. I have a job that runs for 3-4 hours continuously committing data to
>    a collection with auto commit of 30 seconds. Does it mean that every 30
>    seconds I would get a new Solr segment?
>    2. My current segment merge policy is set to 10. Will the merger always
>    keep running in the background to reduce the segments? Is there a way
>    to see metrics regarding segment merging from Solr (MBeans or any other
>    way)?
>    3. A few of my collections are very large with around 24-28 segments per
>    shard and around 16 shards. Is it bad to have this many segments for a
>    shard for a collection? Is it a good practice to optimize the index very
>    often or just rely on segment merges alone?
>
>
>
> Thanks for the help in advance
> Nitin
>
