Jason,
Its predecessor, Lucandra, did. But Solandra is a new approach that manages shards of documents
across the cluster for you and uses Solr's distributed search to query the indexes.
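Roughly, a query against any node fans out the way a plain Solr distributed search does. A sketch with SolrJ (3.x-era API; the host names, core name, and the hand-written shards list are just illustrative - Solandra derives the routing for you):

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class DistributedQuerySketch {
      public static void main(String[] args) throws Exception {
          // Any node can coordinate the request; this URL is made up.
          CommonsHttpSolrServer solr =
              new CommonsHttpSolrServer("http://node1:8983/solandra/books");
          SolrQuery q = new SolrQuery("title:cassandra");
          // With plain Solr you enumerate the shards yourself; Solandra
          // computes the equivalent shard list from the cluster for you.
          q.set("shards", "node1:8983/solandra/books,node2:8983/solandra/books");
          QueryResponse rsp = solr.query(q);
          System.out.println("hits: " + rsp.getResults().getNumFound());
      }
  }
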
Jake
On Mar 9, 2011, at 5:15 PM, Jason Rutherglen <jason.rutherglen@gmail.com> wrote:
> Doesn't Solandra partition by term instead of document?
>
> On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W. <dsmiley@mitre.org> wrote:
>> I was just about to jump in this conversation to mention Solandra and go fig, Solandra's
>> committer comes in. :-) It was nice to meet you at Strata, Jake.
>>
>> I haven't dug into the code yet but Solandra strikes me as a killer way to scale
>> Solr. I'm looking forward to playing with it; particularly looking at disk requirements and
>> performance measurements.
>>
>> ~ David Smiley
>>
>> On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
>>
>>> Hi Otis,
>>>
>>> Have you considered using Solandra with Quorum writes
>>> to achieve master/master with CA semantics?
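>>> The quorum arithmetic is the point: with replication factor N, writing to W
>>> replicas and reading from R replicas with W + R > N means every read overlaps
>>> the latest write. A toy illustration of just that arithmetic (not Solandra code):
>>>
>>>   public class QuorumMath {
>>>       // With N replicas, a write acked by W nodes and a read that consults
>>>       // R nodes must overlap on at least one node when W + R > N, which is
>>>       // what lets you drop the single master without giving up consistency.
>>>       static boolean readOverlapsLatestWrite(int n, int w, int r) {
>>>           return w + r > n;
>>>       }
>>>
>>>       public static void main(String[] args) {
>>>           int n = 3;              // replication factor
>>>           int quorum = n / 2 + 1; // QUORUM = majority = 2 here
>>>           System.out.println(readOverlapsLatestWrite(n, quorum, quorum)); // true
>>>           System.out.println(readOverlapsLatestWrite(n, 1, 1));           // false
>>>       }
>>>   }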
>>>
>>> -Jake
>>>
>>>
>>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic <otis_gospodnetic@yahoo.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> ---- Original Message ----
>>>>
>>>>> From: Robert Petersen <robertpe@buy.com>
>>>>>
>>>>> Can't you skip the SAN and keep the indexes locally? Then you would
>>>>> have two redundant copies of the index and no lock issues.
>>>>
>>>> I could, but then I'd have the issue of keeping them in sync, which seems more
>>>> fragile. I think SAN makes things simpler overall.
>>>>
>>>>> Also, can't master02 just be a slave to master01 (in the master farm and
>>>>> separate from the slave farm) until such time as master01 fails? Then
>>>>
>>>> No, because it wouldn't be in sync. It would always be N minutes behind, and
>>>> when the primary master fails, the secondary would not have all the docs - data loss.
>>>>
>>>>> master02 would start receiving the new documents with an index
>>>>> complete up to the last replication at least and the other slaves would
>>>>> be directed by LB to poll master02 also...
>>>>
>>>> Yeah, "complete up to the last replication" is the problem. It's a data gap
>>>> that now needs to be filled somehow.
>>>>
>>>> Otis
>>>> ----
>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>> Lucene ecosystem search :: http://search-lucene.com/
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com]
>>>>> Sent: Wednesday, March 09, 2011 9:47 AM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: True master-master fail-over without data gaps (choosing CA
>>>>> in CAP)
>>>>>
>>>>> Hi,
>>>>>
>>>>>
>>>>> ----- Original Message ----
>>>>>> From: Walter Underwood <wunder@wunderwood.org>
>>>>>
>>>>>> On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
>>>>>>
>>>>>>> You mean it's not possible to have 2 masters that are in nearly real-time sync?
>>>>>>> How about with DRBD? I know people use DRBD to keep 2 Hadoop NNs (their edit
>>>>>>> logs) in sync to avoid the current NN SPOF, for example, so I'm thinking this
>>>>>>> could be doable with Solr masters, too, no?
>>>>>>
>>>>>> If you add fault tolerance, you run into the CAP Theorem. Consistency,
>>>>>> availability, partition tolerance: choose two. You cannot have it all.
>>>>>
>>>>> Right, so I'll take Consistency and Availability, and I'll put my 2 masters in
>>>>> the same rack (which has redundant switches, power supply, etc.) and thus
>>>>> minimize/avoid partitioning.
>>>>> Assuming the above actually works, I think my Q remains:
>>>>>
>>>>> How do you set up 2 Solr masters so they are in near real-time sync?
>>>>> DRBD?
>>>>>
>>>>> But here is maybe a simpler scenario that more people may be
>>>>> considering:
>>>>>
>>>>> Imagine 2 masters on 2 different servers in 1 rack, pointing to the same index
>>>>> on the shared storage (SAN) that also happens to live in the same rack.
>>>>> The 2 Solr masters are behind 1 LB VIP that the indexer talks to.
>>>>> The VIP is configured so that all requests always get routed to the primary
>>>>> master (because only 1 master can be modifying an index at a time), except when
>>>>> this primary is down, in which case the requests are sent to the secondary
>>>>> master.
>>>>>
>>>>> So in this case my Q is around automation of this, around Lucene index locks,
>>>>> around the need for manual intervention, and such.
>>>>> Concretely, if you have these 2 master instances, the primary master has the
>>>>> Lucene index lock in the index dir. When the secondary master needs to take
>>>>> over (i.e., when it starts receiving documents via the LB), it needs to be able to
>>>>> write to that same index. But what if that lock is still around? One could use
>>>>> the native lock to make the lock disappear if the primary master's JVM exited
>>>>> unexpectedly, and in that case everything *should* work and be completely
>>>>> transparent, right? That is, the secondary will start getting new docs, it will
>>>>> use its IndexWriter to write to that same shared index, which won't be locked
>>>>> for writes because the lock is gone, and everyone will be happy. Did I miss
>>>>> something important here?
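>>>>> For concreteness, I'm picturing each master doing roughly this (Lucene 3.1-style
>>>>> API, just a sketch; the path and class names are made up - in Solr terms it's the
>>>>> native lockType in solrconfig.xml):
>>>>>
>>>>>   import java.io.File;
>>>>>   import java.io.IOException;
>>>>>   import org.apache.lucene.analysis.standard.StandardAnalyzer;
>>>>>   import org.apache.lucene.index.IndexWriter;
>>>>>   import org.apache.lucene.index.IndexWriterConfig;
>>>>>   import org.apache.lucene.store.FSDirectory;
>>>>>   import org.apache.lucene.store.NativeFSLockFactory;
>>>>>   import org.apache.lucene.util.Version;
>>>>>
>>>>>   public class SharedIndexMaster {
>>>>>       // Both masters point at the same directory on the SAN. The native
>>>>>       // (OS-level) lock is released by the OS if the JVM holding it dies,
>>>>>       // so a failed-over secondary can acquire the write lock with no cleanup.
>>>>>       public static IndexWriter openSharedWriter(File sanIndexDir) throws IOException {
>>>>>           FSDirectory dir = FSDirectory.open(sanIndexDir, new NativeFSLockFactory());
>>>>>           IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_31,
>>>>>               new StandardAnalyzer(Version.LUCENE_31));
>>>>>           // Throws LockObtainFailedException if a live writer still holds the lock.
>>>>>           return new IndexWriter(dir, cfg);
>>>>>       }
>>>>>   }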
>>>>>
>>>>> Assuming the above is correct, what if the lock is *not* gone because the
>>>>> primary master's JVM is actually not dead, although maybe unresponsive, so the LB
>>>>> thinks the primary master is dead? Then the LB will route indexing requests to
>>>>> the secondary master, which will attempt to write to the index but be denied
>>>>> because of the lock. So a human needs to jump in, remove the lock, and manually
>>>>> reindex the failed docs if the upstream component doesn't buffer docs that failed to
>>>>> get indexed and doesn't retry indexing them automatically. Is this correct, or
>>>>> is there a way to avoid humans here?
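>>>>> The manual step I have in mind is basically this (again just a sketch on the
>>>>> Lucene 3.x API; only safe once you're sure the primary's JVM is really gone,
>>>>> otherwise two live writers could corrupt the index):
>>>>>
>>>>>   import java.io.File;
>>>>>   import java.io.IOException;
>>>>>   import org.apache.lucene.index.IndexWriter;
>>>>>   import org.apache.lucene.store.FSDirectory;
>>>>>   import org.apache.lucene.store.NativeFSLockFactory;
>>>>>
>>>>>   public class LockCleanup {
>>>>>       // Forcibly clears a stale write lock left behind on the shared index.
>>>>>       public static void clearStaleLock(File sanIndexDir) throws IOException {
>>>>>           FSDirectory dir = FSDirectory.open(sanIndexDir, new NativeFSLockFactory());
>>>>>           if (IndexWriter.isLocked(dir)) {
>>>>>               IndexWriter.unlock(dir);
>>>>>           }
>>>>>       }
>>>>>   }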
>>>>>
>>>>> Thanks,
>>>>> Otis
>>>>> ----
>>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>>> Lucene ecosystem search :: http://search-lucene.com/
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> http://twitter.com/tjake
>>
>>