lucene-solr-user mailing list archives

From Alexander Kanarsky <alexan...@trulia.com>
Subject Re: Replication Clarification Please
Date Wed, 11 May 2011 22:00:10 GMT
Ravi,

if you see what looks like a full replication each time even though the
master's generation is greater than the slave's, try watching the index
directories on both master and slave at the same time to see which files
are getting replicated. You may need to adjust your merge factor, as
Bill mentioned.
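
One way to do that comparison, as a rough sketch: capture a listing of each
index directory (for example with ls) and diff the two sets of file names.
The function names below are hypothetical helpers, not part of Solr.

```python
import os


def list_index(path):
    """Return the sorted file names in an index directory."""
    return sorted(os.listdir(path))


def compare_index_dirs(master_files, slave_files):
    """Diff two listings of index file names.

    missing_on_slave: files the slave would still have to fetch.
    extra_on_slave:   files that exist only on the slave (e.g. stale segments).
    """
    master, slave = set(master_files), set(slave_files)
    return {
        "missing_on_slave": sorted(master - slave),
        "extra_on_slave": sorted(slave - master),
    }
```

If missing_on_slave covers nearly the whole master listing on every poll, the
slave is effectively pulling the full index each time.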

-Alexander



On Tue, 2011-05-10 at 12:45 -0400, Ravi Solr wrote:
> Hello Mr. Kanarsky,
>                 Thank you very much for the detailed explanation,
> probably the best explanation I found regarding replication. Just to
> be sure, I wanted to test Solr 3.1 to see if it alleviates the
> problems... I don't think it helped. The master index version and
> generation are greater than the slave's, yet the slave still replicates
> the entire index from the master (see replication admin screen output
> below). Any idea why it would get the whole index every time even in
> 3.1, or am I misinterpreting the output? However, I must admit that 3.1
> finished the replication, unlike 1.4.1, which would hang and stay
> backed up forever.
> 
> Master 	http://masterurl:post/solr-admin/searchcore/replication
> 	Latest Index Version:null, Generation: null
> 	Replicatable Index Version:1296217097572, Generation: 12726
> 
> Poll Interval 	00:03:00
> 
> Local Index 	Index Version: 1296217097569, Generation: 12725
> 
> 	Location: /data/solr/core/search-data/index
> 	Size: 944.32 MB
> 	Times Replicated Since Startup: 148
> 	Previous Replication Done At: Tue May 10 12:32:42 EDT 2011
> 	Config Files Replicated At: null
> 	Config Files Replicated: null
> 	Times Config Files Replicated Since Startup: null
> 	Next Replication Cycle At: Tue May 10 12:35:41 EDT 2011
> 
> Current Replication Status 	Start Time: Tue May 10 12:32:41 EDT 2011
> 	Files Downloaded: 18 / 108
> 	Downloaded: 317.48 KB / 436.24 MB [0.0%]
> 	Downloading File: _ayu.nrm, Downloaded: 4 bytes / 4 bytes [100.0%]
> 	Time Elapsed: 17s, Estimated Time Remaining: 23902s, Speed: 18.67 KB/s
> 
> 
> Thanks,
> Ravi Kiran Bhaskar
> 
> On Tue, May 10, 2011 at 4:10 AM, Alexander Kanarsky
> <alexander@trulia.com> wrote:
> > Ravi,
> >
> > as far as I remember, this is how the replication logic works (see
> > SnapPuller class, fetchLatestIndex method):
> >
> >> 1. Does the Slave get the whole index every time during replication or
> >> just the delta since the last replication happened ?
> >
> >
> > It looks at the index version AND the index generation. If both the
> > slave's version and generation are the same as on the master, nothing
> > gets replicated. If the master's generation is greater than the
> > slave's, the slave fetches the delta files only (even if a partial
> > merge was done on the master) and puts the new files from the master
> > into the same index folder on the slave (either index or
> > index.<timestamp>, see further explanation). However, if the master's
> > index generation is equal to or less than the slave's, the slave does
> > a full replication: it fetches all files of the master's index and
> > places them into a separate folder on the slave (index.<timestamp>).
> > Then, if the fetch is successful, the slave updates (or creates) the
> > index.properties file and writes the name of the "current" index
> > folder there. In 1.4.x the "old" index.<timestamp> folder(s) are kept,
> > which was treated as a bug (see SOLR-2156) and fixed in 3.1. After
> > this, the slave does a commit or a core reload, depending on whether
> > the config files were replicated. There is another bug in 1.4.x that
> > makes replication fail if the slave needs to do a full replication AND
> > the config files were changed; it is also fixed in 3.1 (see
> > SOLR-1983).
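
The decision described above can be sketched as a small function. This is a
paraphrase of the explanation in this message, not the SnapPuller source; the
names Action and choose_replication_action are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    NOTHING = "nothing"  # slave is already up to date
    DELTA = "delta"      # fetch only the changed files
    FULL = "full"        # fetch everything into index.<timestamp>


def choose_replication_action(master_version, master_generation,
                              slave_version, slave_generation):
    """Pick the replication mode from index version and generation."""
    if (master_version == slave_version
            and master_generation == slave_generation):
        return Action.NOTHING
    if master_generation > slave_generation:
        return Action.DELTA
    # Master generation is equal to or less than the slave's.
    return Action.FULL
```

With the numbers from the admin screen quoted in this thread (master
1296217097572 / 12726, slave 1296217097569 / 12725), this sketch returns
Action.DELTA, so by this description a delta fetch would be expected there.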
> >
> >> 2. If there is a huge number of queries being run on the slave, will
> >> it affect the replication? How can I improve the performance? (see
> >> the replication details at the bottom of the page)
> >
> >
> > In my experience, half of the replication time is spent flushing the
> > transferred data to disk, so the I/O impact is important.
> >
> >> 3. Will the segment names be the same on master and slave after
> >> replication? I see that they are different. Is this correct? If it
> >> is correct, how does the slave know what to fetch the next time, i.e.
> >> the delta?
> >
> >
> > They should be the same. The slave fetches the changed files only (see
> > above); also look at the SnapPuller code.
> >
> >> 4. When and why does the index.<TIMESTAMP> folder get created ? I see
> >> this type of folder getting created only on slave and the slave
> >> instance is pointing to it.
> >
> >
> > See above.
> >
> >> 5. Does the replication process copy both the index and
> >> index.<TIMESTAMP> folders?
> >
> >
> > The index.<timestamp> folder gets created only if a full replication
> > has happened at least once. Otherwise, the slave will use the index
> > folder.
> >
> >> 6. What happens if a replication kicks off before the previous
> >> invocation has completed? Will the 2nd invocation block, or will
> >> it go through, causing more confusion?
> >
> >
> > There is a lock (snapPullLock in ReplicationHandler) that prevents two
> > replications from running simultaneously. If there is no bug, the
> > second call should just return silently. (I personally never had a
> > problem with this, so it looks like there is no bug :)
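
The "return silently if a replication is already running" behaviour can be
illustrated with a non-blocking lock acquire. This is only a sketch of the
pattern in Python; _pull_lock and try_replicate are hypothetical names, and
the actual ReplicationHandler code may differ in detail.

```python
import threading

# Hypothetical stand-in for the snapPullLock mentioned above.
_pull_lock = threading.Lock()


def try_replicate(fetch):
    """Run fetch() unless another replication is already in progress.

    Returns True if fetch() ran, False if the call was skipped.
    """
    if not _pull_lock.acquire(blocking=False):
        return False  # another pull holds the lock; skip without waiting
    try:
        fetch()
        return True
    finally:
        _pull_lock.release()
```

The second caller neither blocks nor queues up; it simply skips the cycle and
the next poll gets another chance.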
> >
> >> 7. If I have to prep a new master-slave combination, is it OK to copy
> >> the respective contents into the new master and slave and start Solr,
> >> or do I have to wipe the new slave and let it replicate from its new
> >> master?
> >
> >
> > If the new master has a different index, the slave will create a new
> > index.<timestamp> folder. There is no need to wipe it.
> >
> >> 8. Doing an 'ls | wc -l' on the index folder of master and slave gave
> >> 194 and 17968 respectively... The slave has a lot of segments_xxx
> >> files. Is this normal?
> >
> >
> > No, it looks like in your case the slave has kept replicating to the
> > same folder for a long time, but the old files are not getting
> > deleted for some reason. Try restarting the slave or doing a core
> > reload on it to see if the old segments go away.
> >
> > -Alexander
> >
> >


