lucene-java-user mailing list archives

From rahul_k123 <vishnudee...@gmail.com>
Subject Re: Replicating Lucene Index with out SOLR
Date Fri, 29 Aug 2008 00:00:25 GMT


Do I need to stop indexing while I rsync a snapshot to the slave?





Otis Gospodnetic wrote:
> 
> Yes, I think you pinpointed what I see over and over with Solr.  The two
> desires pull in opposite directions.  I think Jason Rutherglen is very
> keen to start talking about Lucene clusters and index replication in such
> clusters without using the classic master/slave approach.
> 
> Jason, want to start a thread on java-dev?
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> ----- Original Message ----
>> From: mark harwood <markharw00d@yahoo.co.uk>
>> To: java-user@lucene.apache.org
>> Sent: Thursday, August 28, 2008 6:21:19 AM
>> Subject: Re: Replicating Lucene Index with out SOLR
>> 
>> >> You don't need to copy the whole index every time
>> >> if you do incremental  indexing/updates and don't optimize the index
>> 
>> 
>> But at 5 minute intervals for replication does this not quickly lead to a
>> very fragmented index?
>> 
>> It seems there is a fundamental conflict when building replication
>> systems based entirely on the Lucene file format:
>> * In the interests of good search performance the index should ideally be
>>   a small number of large files (which is what MergePolicy/optimize are all
>>   about maintaining).
>> * However, in the interest of minimising replication network traffic, the
>>   ideal is a large number of small files.
>> 
>> I've previously built replication systems which rely on each server
>> pulling deltas in the form of insert/update/delete records from a database
>> and using IndexWriter locally on each server to apply these sets of changes.
>> Obviously this duplicates the analyzing/indexing effort across replicas but
>> does mean the content being transferred is not restricted by the design of
>> the Lucene file format and therefore uses minimal network traffic and places
>> no restrictions on the IndexWriter merge policies I may choose to use to
>> optimise search speed.
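
A rough sketch of the delta-record idea described above, in plain Python rather than Lucene's Java API. Everything here is hypothetical: the `Change` record, the `Replica` class, and in particular the sequence number used to skip already-applied records (the post does not say how deltas were tracked). A dict stands in for the local index that IndexWriter would maintain.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One delta record pulled from the database (hypothetical shape)."""
    seq: int        # assumed: increasing sequence number assigned by the database
    op: str         # "insert", "update" or "delete"
    doc_id: str
    body: str = ""


class Replica:
    """Stand-in for one search server applying deltas to its own local index."""

    def __init__(self):
        self.docs = {}       # doc_id -> body; plays the role of the local index
        self.last_seq = 0    # highest sequence number applied so far

    def apply(self, changes):
        """Apply unseen changes in sequence order; return how many were applied."""
        applied = 0
        for change in sorted(changes, key=lambda c: c.seq):
            if change.seq <= self.last_seq:
                continue  # already applied on an earlier pull
            if change.op == "delete":
                self.docs.pop(change.doc_id, None)
            elif change.op in ("insert", "update"):
                self.docs[change.doc_id] = change.body
            else:
                raise ValueError(f"unknown op: {change.op}")
            self.last_seq = change.seq
            applied += 1
        return applied
```

Each replica only ever receives the small change records, so what goes over the wire does not depend on how the local index files are merged.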
>> 
>> Keen to explore the pros and cons of these different replication schemes.
>> 
>> Cheers,
>> Mark
>> 
>> 
>> 
>> --- On Thu, 28/8/08, rahul_k123 wrote:
>> 
>> > From: rahul_k123 
>> > Subject: Re: Replicating Lucene Index with out SOLR
>> > To: java-user@lucene.apache.org
>> > Date: Thursday, 28 August, 2008, 6:47 AM
>> > Can I make use of the Solr scripts for this purpose?
>> > 
>> > 
>> > The snapinstaller runs on the slave after a snapshot has been pulled from
>> > the master. This signals the local Solr server to open a new index reader,
>> > then auto-warming of the cache(s) begins (in the new reader), while other
>> > requests continue to be served by the original index reader.
>> > 
>> > How can I achieve the above in my case?
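
The behaviour described above (open a new reader, warm it, keep serving from the old one until the new one is ready) can be sketched without Solr roughly as follows. This is plain Python, not Lucene code; `ToySearcher` and `SearcherHolder` are hypothetical names, and the "cache" is just a dict filled by running warm-up queries.

```python
import threading


class ToySearcher:
    """Hypothetical stand-in for a searcher over one index snapshot."""

    def __init__(self, docs):
        self.docs = list(docs)
        self.cache = {}  # term -> cached result list

    def search(self, term):
        if term not in self.cache:
            self.cache[term] = [d for d in self.docs if term in d]
        return self.cache[term]


class SearcherHolder:
    """Hands out the current searcher and swaps in a warmed replacement."""

    def __init__(self, searcher):
        self._current = searcher
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._current

    def swap(self, open_new, warm_queries=()):
        # Open and warm the new searcher first; get() keeps returning the
        # old searcher during this time, so requests are still served.
        new = open_new()
        for query in warm_queries:
            new.search(query)
        # Only the reference swap itself happens under the lock.
        with self._lock:
            old, self._current = self._current, new
        return old  # caller closes it once in-flight requests have finished
```

After a snapshot has been installed on the slave, something like `holder.swap(lambda: ToySearcher(new_docs), ["common query"])` would bring the new index into service.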
>> > 
>> > 
>> > Otis Gospodnetic wrote:
>> > > 
>> > > You don't need to copy the whole index every time if you do incremental
>> > > indexing/updates and don't optimize the index before copying.  If you use
>> > > rsync for copying the index, only the new/modified files will be copied.
>> > > This is what Solr replication scripts do, too.
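
To illustrate why only new or modified files travel, here is a small sketch of an rsync-style quick check (compare file size and modification time) in plain Python. It is not how rsync is implemented internally and it is not part of the Solr scripts; `files_to_copy` and `sync` are hypothetical helper names, and the index is assumed to be a flat directory of files.

```python
import shutil
import tempfile
from pathlib import Path


def files_to_copy(master: Path, replica: Path) -> list:
    """Names of files in master that are missing on replica or differ in size/mtime."""
    needed = []
    for src in sorted(master.iterdir()):
        if not src.is_file():
            continue
        dst = replica / src.name
        if not dst.exists():
            needed.append(src.name)
            continue
        s, d = src.stat(), dst.stat()
        # Whole-second mtime comparison, to tolerate filesystem precision differences.
        if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
            needed.append(src.name)
    return needed


def sync(master: Path, replica: Path) -> list:
    """Copy only the files the quick check flags; return their names."""
    replica.mkdir(parents=True, exist_ok=True)
    copied = files_to_copy(master, replica)
    for name in copied:
        # copy2 preserves the modification time, so the next check sees a match.
        shutil.copy2(master / name, replica / name)
    return copied
```

With incremental indexing, old segment files stay untouched and only the new ones show up in the list. After an optimize the index is rewritten into new files, so nearly everything would have to be copied again.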
>> > > 
>> > > Otis
>> > > --
>> > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> > > 
>> > > 
>> > > 
>> > > ----- Original Message ----
>> > >> From: rahul_k123 
>> > >> To: general@lucene.apache.org
>> > >> Sent: Wednesday, August 27, 2008 11:36:07 PM
>> > >> Subject: Re: Replicating Lucene Index with out SOLR
>> > >> 
>> > >> 
>> > >> Currently we index at regular intervals on A.
>> > >> 
>> > >> - copy the index
>> > >>      Copying the whole index every time?
>> > >> 
>> > >> Currently I am investigating how I can make use of the Solr replication
>> > >> scripts to achieve this.
>> > >> 
>> > >> 
>> > >> Has anyone done this without Solr before?
>> > >> 
>> > >> 
>> > >> Thanks
>> > >> 
>> > >> 
>> > >> 
>> > >> Otis Gospodnetic wrote:
>> > >> > 
>> > >> > Hi,
>> > >> > 
>> > >> > You may want to ask on the java-user list (more subscribers), which I'm
>> > >> > CC-ing, so we can continue discussion there.
>> > >> > I think you will have to implement your own logic that runs on A and
>> > >> > does something like this:
>> > >> > 
>> > >> > - stop adding new docs
>> > >> > - call commit on the IndexWriter
>> > >> > - copy the index
>> > >> > - resume indexing
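
One way to read the four steps above is as a single lock shared by the code that adds documents and the code that takes the copy. A minimal sketch, in plain Python rather than Lucene's Java API: `ToyWriter` is a hypothetical stand-in for IndexWriter, and its "commit" simply writes buffered docs to a new file in the index directory.

```python
import shutil
import tempfile
import threading
from pathlib import Path


class ToyWriter:
    """Hypothetical stand-in for an index writer: one new file per commit."""

    def __init__(self, index_dir: Path):
        self.index_dir = index_dir
        self.index_dir.mkdir(parents=True, exist_ok=True)
        self._buffer = []
        self._generation = 0
        # Taken by add_doc() and snapshot_to(), so the two never overlap.
        self._lock = threading.Lock()

    def add_doc(self, text: str) -> None:
        with self._lock:
            self._buffer.append(text)

    def _commit(self) -> None:
        # Flush buffered docs into a new write-once "segment" file.
        if self._buffer:
            self._generation += 1
            segment = self.index_dir / f"segment_{self._generation}.txt"
            segment.write_text("\n".join(self._buffer) + "\n")
            self._buffer.clear()

    def snapshot_to(self, target: Path) -> None:
        # stop adding new docs (take the lock), commit, copy the index,
        # then resume indexing (the lock is released on exit).
        with self._lock:
            self._commit()
            shutil.copytree(self.index_dir, target, dirs_exist_ok=True)
```

While `snapshot_to` runs, calls to `add_doc` simply block, which is the "stop adding new docs" step; the copy therefore always sees a committed, consistent set of files.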
>> > >> > 
>> > >> > Otis
>> > >> > --
>> > >> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> > >> > 
>> > >> > 
>> > >> > 
>> > >> > ----- Original Message ----
>> > >> >> From: rahul_k123 
>> > >> >> To: general@lucene.apache.org
>> > >> >> Sent: Thursday, August 28, 2008 1:34:41 AM
>> > >> >> Subject: Replicating Lucene Index with out SOLR
>> > >> >> 
>> > >> >> 
>> > >> >> I have the following requirement:
>> > >> >> 
>> > >> >> Right now we have multiple indexes serving our web application. Our
>> > >> >> indexes are around 30 GB in size.
>> > >> >> 
>> > >> >> We want to replicate the index data so that we can use the replicas to
>> > >> >> distribute the search load.
>> > >> >> 
>> > >> >> This is what we need ideally:
>> > >> >> 
>> > >> >> A - (supports writes and reads)
>> > >> >> 
>> > >> >> A1 - Replicated index (supports reads). We want to synchronize this
>> > >> >> every 5 mins.
>> > >> >> 
>> > >> >> Any help is appreciated. We are not using Solr.
>> > >> >> 
>> > >> >> I am also interested in knowing the best way to scale my application
>> > >> >> by adding more boxes for search if our load increases.
>> > >> >> 
>> > >> >> Thanks.
>> > >> >> 
>> > >> >> -- 
>> > >> >> View this message in context:
>> > >> >> http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19191752p19191752.html
>> > >> >> Sent from the Lucene - General mailing list archive at Nabble.com.
>> > >> > 
>> > >> > 
>> > >> > 
>> > >> 
>> > >> -- 
>> > >> View this message in context:
>> > >> http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19191752p19193670.html
>> > >> Sent from the Lucene - General mailing list archive at Nabble.com.
>> > > 
>> > > 
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > > For additional commands, e-mail: java-user-help@lucene.apache.org
>> > > 
>> > > 
>> > > 
>> > 
>> > -- 
>> > View this message in context:
>> > http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19193696p19194576.html
>> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>> > 
>> > 
>> 
>> 
> 
> 
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Replicating-Lucene-Index-with-out-SOLR-tp19193696p19211497.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

