lucene-solr-user mailing list archives

From Ravi Solr <ravis...@gmail.com>
Subject Re: Replication Clarification Please
Date Wed, 11 May 2011 13:25:55 GMT
Mr. Bell,
     Thank you for your help. Yes, the full index is replicated every
1000, 10000, 100000 etc., if mergeFactor is 10, as per its definition.
We index every 5 minutes and replicate every 3 minutes to make sure
consumers have immediate access to the indexed docs.
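
To illustrate the mergeFactor cycling discussed in this thread, here is a
minimal sketch of a simplified merge model. It is not Lucene's actual merge
policy (which also takes segment sizes into account); it assumes each commit
flushes exactly one new segment and that mergeFactor segments at the same
level are merged into one segment at the next level. The function name
simulate_segments is hypothetical. The sketch reports the commit counts at
which the whole index ends up as a single freshly merged segment, which is
the point where every index file would be new to the slave.

```python
def simulate_segments(num_commits, merge_factor=10):
    """Simplified merge model (an assumption, not Lucene's real policy).

    Returns (levels, full_rewrites):
      levels[i]     = number of segments at merge level i after the last commit
      full_rewrites = commit counts at which the index is one merged segment
    """
    levels = [0]
    full_rewrites = []
    for commit in range(1, num_commits + 1):
        # Assumption: every commit flushes exactly one new level-0 segment.
        levels[0] += 1
        i = 0
        # Cascade merges: merge_factor segments at one level become
        # a single segment at the next level.
        while levels[i] == merge_factor:
            levels[i] = 0
            if i + 1 == len(levels):
                levels.append(0)
            levels[i + 1] += 1
            i += 1
        # After a cascade that leaves one segment, all files are newly written.
        if commit > 1 and sum(levels) == 1:
            full_rewrites.append(commit)
    return levels, full_rewrites


if __name__ == "__main__":
    levels, rewrites = simulate_segments(1000, merge_factor=10)
    print("segments per level:", levels)
    print("commits where the index is a single merged segment:", rewrites)
```

Under these assumptions the single-segment points fall on powers of
mergeFactor, counted in commits. Whether such a point actually forces a full
copy in Solr 1.4.1 is the open question in this thread; the model only shows
when no previously replicated segment files would remain.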

Thanks,

Ravi Kiran Bhaskar

On Wednesday, May 11, 2011, Bill Bell <billnbell@gmail.com> wrote:
> OK let me rephrase.
>
> In solrconfig.xml there is a setting called mergeFactor. The default is
> usually 10.
> Practically it means there are 10 segments. If you are doing fast delta
> indexing (adding a couple of documents, then committing),
> you will cycle through all 10 segments pretty fast.
>
> It appears that if you do go past the 10 segments without replicating, the
> only recourse is for the replicator to do a full index replication instead
> of a delta index replication...
>
> Does that help?
>
>
> On 5/9/11 9:24 AM, "Ravi Solr" <ravisolr@gmail.com> wrote:
>
>>Hello Mr. Bell,
>>                   Thank you very much for patiently responding to my
>>questions. We optimize once every 2 days. Could you kindly rephrase
>>your answer? I could not understand - "if the amount of time if > 10
>>segments, I believe that might also trigger a whole index, since you
>>cycled all the segments. In that case I think you might want to
>>increase the mergeFactor."
>>
>>The current index folder details and sizes are given below
>>
>>MASTER
>>--------------
>>   5K   search-data/spellchecker2
>> 480M   search-data/index
>>   5K   search-data/spellchecker1
>>   5K   search-data/spellcheckerFile
>> 480M   search-data
>>
>>SLAVE
>>----------
>>   2K   search-data/index.20110509103950
>> 419M   search-data/index
>> 2.3G   search-data/index.20110429042508  ----> SLAVE is pointing to this directory
>>   5K   search-data/spellchecker1
>>   5K   search-data/spellchecker2
>>   5K   search-data/spellcheckerFile
>> 2.7G   search-data
>>
>>Thanks,
>>
>>Ravi Kiran Bhaskar
>>
>>On Sat, May 7, 2011 at 11:49 PM, Bill Bell <billnbell@gmail.com> wrote:
>>> I did not see answers... I am not an authority, but will tell you what I
>>> think....
>>>
>>> Did you get some answers?
>>>
>>>
>>> On 5/6/11 2:52 PM, "Ravi Solr" <ravisolr@gmail.com> wrote:
>>>
>>>>Hello,
>>>>        Pardon me if this has been already answered somewhere and I
>>>>apologize for a lengthy post. I was wondering if anybody could help me
>>>>understand Replication internals a bit more. We have a single
>>>>master-slave setup (solr 1.4.1) with the configurations as shown
>>>>below. Our environment is quite commit heavy (almost 100s of docs
>>>>every 5 minutes), and all indexing is done on Master and all searches
>>>>go to the Slave. We are seeing that the slave replication performance
>>>>gradually degrades: the speed drops below 1kbps and replication
>>>>ultimately gets backed up. Once we reload the core on the slave it works
>>>>fine for some time, and then it gets backed up again. We have mergeFactor
>>>>set to 10 and ramBufferSizeMB set to 32MB; Solr itself is running
>>>>with 2GB memory and lockType is simple on both master and slave.
>>>
>>> How big is your index? How many rows and GB ?
>>>
>>> Every time you replicate, there are several resets on caching. So if you
>>> are constantly indexing, you need to be careful about how that
>>> performance impact will apply.
>>>
>>>>
>>>>I am hoping that the following questions might help me understand the
>>>>replication performance issue better (Replication Configuration is
>>>>given at the end of the email)
>>>>
>>>>1. Does the Slave get the whole index every time during replication or
>>>>just the delta since the last replication happened ?
>>>
>>>
>>> It depends. If you do an OPTIMIZE every time you index, then you will be
>>> sending the whole index down.
>>> If the amount of time if > 10 segments, I believe that might also trigger
>>> a whole index, since you cycled all the segments.
>>> In that case I think you might want to increase the mergeFactor.
>>>
>>>
>>>>
>>>>2. If there is a huge number of queries being run on the slave, will it
>>>>affect the replication? How can I improve the performance? (See the
>>>>replication details at the bottom of the page.)
>>>
>>> It seems that might be one way that you get the index.* directories. At
>>> least I see it more frequently when there is huge load and you are trying
>>> to replicate.
>>> You could replicate less frequently.
>>>
>>>>
>>>>3. Will the segment names be the same on master and slave after
>
