lucene-java-user mailing list archives

From Nader Henein <...@bayt.net>
Subject Re: Lucene in clustered environment (Tomcat)
Date Fri, 10 Jun 2005 18:01:01 GMT
With all of your servers on one machine, a single memory failure takes the whole thing
down. But you're right: we keep an independent Lucene index next to each of our webservers,
one per machine, and all of them are updated from a central location. A central application
reads our persistent store (an Oracle database) and creates XML files, which are then copied
to each of the Lucene servers and indexed. If the central utility fails, the backup kicks in;
at worst the indices are out of date for as long as it takes to point the webservers to the
Oracle standby.

I wrote a preliminary paper about Lucene strategies in a clustered environment (I'll send it
to you separately because the mailing list doesn't allow attachments). It is about 6 months
old and I've come a long way since; I'm finalizing a newer version, which I hope to publish
as a solid case study for anyone out there taking that step. Once again, this paper is old,
but it should get you going.
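
To make the layout above a bit more concrete, here is a minimal sketch of the idea, not our
actual code. Everything in it is an assumption made for illustration: the class and method
names are hypothetical, a plain map stands in for each webserver's Lucene index, and the
"pending" map stands in for the XML files exported from the database. The only point is the
shape: changes are collected centrally and pushed to every index copy as one batch, so no
webserver ever writes to an index on its own.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: one central updater feeding a separate index copy per webserver. */
public class CentralIndexUpdater {

    /** Stand-in for one webserver's local Lucene index (document id -> indexed text). */
    public static final class NodeIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();

        // Applies one batch of changes; a null text marks a delete.
        void apply(Map<String, String> batch) {
            for (Map.Entry<String, String> change : batch.entrySet()) {
                if (change.getValue() == null) {
                    docs.remove(change.getKey());
                } else {
                    docs.put(change.getKey(), change.getValue());
                }
            }
        }

        public boolean contains(String id) { return docs.containsKey(id); }
        public int size() { return docs.size(); }
    }

    private final List<NodeIndex> nodes = new ArrayList<>();
    // Changes collected since the last batch; stands in for the exported XML files.
    private final Map<String, String> pending = new LinkedHashMap<>();

    /** Registers a new, empty index copy; it only receives batches pushed after this call. */
    public synchronized NodeIndex addNode() {
        NodeIndex node = new NodeIndex();
        nodes.add(node);
        return node;
    }

    /** Records an insert or update; nothing reaches the indexes until pushBatch(). */
    public synchronized void recordUpsert(String id, String text) {
        pending.put(id, text);
    }

    /** Records a delete; a later change to the same id in the same batch replaces it. */
    public synchronized void recordDelete(String id) {
        pending.put(id, null);
    }

    /** Pushes the pending batch to every node and returns the number of changes applied. */
    public synchronized int pushBatch() {
        int changes = pending.size();
        for (NodeIndex node : nodes) {
            node.apply(pending);
        }
        pending.clear();
        return changes;
    }
}
```

In a real deployment pushBatch() would be triggered by a scheduler every couple of minutes,
and apply() would be replaced by whatever copies the batch to each machine and indexes it
there.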

Nader Henein



Ben wrote:

>Wouldn't it defeat the purpose of clustering if you have a single
>server to manage a single index? What would happen if this server
>failed?
>
>Cheers,
>Ben
>
>On 6/8/05, Ben <newreaders@gmail.com> wrote:
>  
>
>>How about using JavaGroups to notify other nodes in the cluster about
>>the changes?
>>
>>Essentially, each node has the same index stored in a different
>>location. When one node updates/deletes a record, other nodes will get
>>a notification about the changes and update their index accordingly?
>>By using this method, I don't have to modify my Lucene code, I just
>>need to add additional code to notify other nodes. I believe this
>>method also scales better.
>>
>>Cheers,
>>Ben
>>
>>
>>On 6/7/05, Nader Henein <nsh@bayt.net> wrote:
>>    
>>
>>>I realize I've already asked you this question, but do you need 100%
>>>real time? You could batch the updates every 2 minutes instead. As for
>>>parallel search, unless you really need it, it's overkill in this case;
>>>a communal index will serve you well and will be much easier to
>>>maintain. You have to weigh requirements vs. complexity/debug time.
>>>
>>>Nader Henein
>>>
>>>Ben wrote:
>>>
>>>
>>>>>When you say your cluster is on a single machine, do you mean that you
>>>>>have multiple webservers on the same machine all of which search a
>>>>>single Lucene index?
>>>>>
>>>>Yes, this is my case.
>>>>
>>>>>Do you use Lucene as your persistent store or do you have a DB back there?
>>>>>
>>>>I use Lucene to search for data stored in a PostgreSQL server.
>>>>
>>>>>what is your current update/delete strategy because real time inserts
>>>>>from the webservers directly to the index will not work because you
>>>>>can't have multiple writers.
>>>>>
>>>>I have to do this in real time; what are the available solutions? My
>>>>application can already do batch updates/deletes to a Lucene index,
>>>>but I would like to do this in real time.
>>>>
>>>>One solution I am considering is to have each cluster node keep its
>>>>own index and use parallel search. This makes my application even
>>>>more complex.
>>>>
>>>>>I strongly recommend Quartz, it's rock solid and really versatile.
>>>>>
>>>>I am using Quartz; it is really great and supports clustering.
>>>>
>>>>Thanks,
>>>>Ben
>>>>
>>>>
>>>>On 6/7/05, Nader Henein <nsh@bayt.net> wrote:
>>>>
>>>>>When you say your cluster is on a single machine, do you mean that you
>>>>>have multiple webservers on the same machine all of which search a
>>>>>single Lucene index? Because if that's the case, your solution is
>>>>>simple, as long as you persist to a single DB and then designate one of
>>>>>your servers (or even another server) to update/delete the index. Do you
>>>>>use Lucene as your persistent store or do you have a DB back there? And
>>>>>what is your current update/delete strategy? Real-time inserts from the
>>>>>webservers directly to the index will not work because you can't have
>>>>>multiple writers. Updating a dirty flag on rows that need to be
>>>>>indexed/deleted, or using a table for this task, and then batching
>>>>>your updates would be ideal. If you're using server-specific
>>>>>scheduling, I strongly recommend Quartz; it's rock solid and really
>>>>>versatile.
>>>>>
>>>>>My two cents.
>>>>>
>>>>>Nader Henein
>>>>>
>>>>>
>>>>>Ben wrote:
>>>>>
>>>>>>My cluster is on a single machine and I am using an FS index.
>>>>>>
>>>>>>I have already integrated Lucene into my web application for use in a
>>>>>>non-clustered environment. I don't know what I need to do to make it
>>>>>>work in a clustered environment.
>>>>>>
>>>>>>Thanks,
>>>>>>Ben
>>>>>>
>>>>>>On 6/7/05, Nader Henein <nsh@bayt.net> wrote:
>>>>>>
>>>>>>>IMHO, issues that you need to consider:
>>>>>>>
>>>>>>>  * Atomicity of updates and deletes if you are using multiple indexes
>>>>>>>    on multiple machines (the case if your cluster is over a wide
>>>>>>>    network)
>>>>>>>  * Scheduled comparison of the indices against the core data, and
>>>>>>>    sanitization (intensive)
>>>>>>>
>>>>>>>This all depends on what the volume of change is on your index and
>>>>>>>whether you'll be using a memory-resident index or an FS index.
>>>>>>>
>>>>>>>This should start the ball rolling. We've been using Lucene successfully
>>>>>>>on a distributed cluster for a while now, and as long as you're aware of
>>>>>>>some basic NDS limitations/constraints you should be fine.
>>>>>>>
>>>>>>>Hope this helps
>>>>>>>
>>>>>>>Nader Henein
>>>>>>>
>>>>>>>Ben wrote:
>>>>>>>
>>>>>>>>Hi
>>>>>>>>
>>>>>>>>I would like to use Lucene in a clustered environment. What are the
>>>>>>>>things that I should consider and do?
>>>>>>>>
>>>>>>>>I would like to use the same ordinary index storage for all the nodes
>>>>>>>>in the cluster. Is that possible?
>>>>>>>>
>>>>>>>>Thanks,
>>>>>>>>Ben
>>>>>>>>
>>>>>>>>---------------------------------------------------------------------
>>>>>>>>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>>>
>>>>>>>--
>>>>>>>
>>>>>>>Nader S. Henein
>>>>>>>Senior Applications Architect
>>>>>>>
>>>>>>>Bayt.com
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

-- 
Nader S. Henein
Senior Applications Developer

Bayt.com





---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

