zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: observers in occasionally disconnected data centers
Date Fri, 06 May 2011 16:32:57 GMT
Mahadev is working with Giri to address this. The jenkins folks are saying
this is a machine administered by Yahoo and the issue needs to be
addressed with them (their admins), but Mahadev/Giri are looking into it
from our (zk) side.

Patrick

On Fri, May 6, 2011 at 4:33 AM, Ketan Gangatirkar <ketan@indeed.com> wrote:
> Hi, Patrick.  Were you able to get any assistance from the hudson
> admins?  Thanks.
>
> On Wed, May 4, 2011 at 12:53 PM, Patrick Hunt <phunt@apache.org> wrote:
>> This is odd, it's failing in the c tests but for a weird reason:
>>
>> in:
>> https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/247/artifact/trunk/build/tmp/zk.log
>>
>> it says:
>> /grid/0/hudson/hudson-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/c/tests/zkServer.sh:
>> line 115: java: command not found
>>
>> I'll ping the hudson admins and see if this is a known issue (also
>> hudson is very slow today for some reason).
>>
>> Once that's addressed we should be good to go.
>>
>> Patrick
>>
>> On Wed, May 4, 2011 at 9:57 AM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>> Got the patch formatted right and applying successfully, now I'll see
>>> if I can figure out the unit test failure.
>>>
>>> On Wed, May 4, 2011 at 11:26 AM, Patrick Hunt <phunt@apache.org> wrote:
>>>> Hi Ketan, the patch is failing to apply
>>>> https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/246//console
>>>>
>>>> Looks like you used git, I usually do something like:
>>>> git diff rev1..rev2 --no-prefix > ZOOKEEPER-784.patch
>>>> can you give it another try?
>>>>
>>>> Patrick
>>>>
>>>> On Tue, May 3, 2011 at 6:42 PM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>>>> I have updated Sergey's patch to:
>>>>>
>>>>> * apply to current trunk
>>>>> * incorporate one trivial output change he made to StatCommand in
>>>>> NettyServerCnxn.java
>>>>> * change log4j references to slf4j
>>>>>
>>>>> I have successfully run ant releaseaudit on the result.  The updated
>>>>> patch is now attached to the issue:
>>>>>
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-784
>>>>>
>>>>> I do *not* make any claim to have understood the contents of this
>>>>> patch; all I did was synch everything and fix the obvious log4j/slf4j
>>>>> change.  Now what?
>>>>>
>>>>>
>>>>> On Tue, May 3, 2011 at 5:46 PM, Patrick Hunt <phunt@apache.org> wrote:
>>>>>> The core tests failed on last hudson, I just kicked off a patch build,
>>>>>> seems recent changes (logging?) have caused the patch to stop
>>>>>> applying:
>>>>>> https://hudson.apache.org/hudson/view/S-Z/view/ZooKeeper/job/PreCommit-ZOOKEEPER-Build/238/console
>>>>>>
>>>>>> Ketan would you like to try updating the patch and resubmit?
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>> On Tue, May 3, 2011 at 3:31 PM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>>>>>> Thanks, Mahadev.  I had seen ZOOKEEPER-892 but not ZOOKEEPER-784.  The
>>>>>>> latter may be what we need.
>>>>>>>
>>>>>>> I read the comments attached to that issue.  The most recent comment
>>>>>>> was a Hudson CI message indicating that the tests against the patch
>>>>>>> failed.  I was not able to find out more as it appears that the
>>>>>>> configuration of the Apache Hudson has changed.  It appears that the
>>>>>>> patch was approved but not merged into trunk, and it's now in limbo.
>>>>>>> What is necessary to get that feature into the next release?  I may be
>>>>>>> able to assist, depending on what's involved.  Thank you.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, May 3, 2011 at 4:17 PM, Mahadev Konar <mahadev@apache.org> wrote:
>>>>>>>> Hi Ketan,
>>>>>>>>  You are correct that observers need a connection to the quorum as well.
>>>>>>>> There have been quite a few discussions on multi colo replication and
>>>>>>>> read only mode of ZooKeeper.
>>>>>>>>
>>>>>>>> Here are the jiras for those:
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-784
>>>>>>>> and
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-892
>>>>>>>>
>>>>>>>> These have been mostly targeted at exactly a use case like yours.
>>>>>>>> Please take a look at them and feel free to contribute/comment on the
>>>>>>>> jiras.
>>>>>>>>
>>>>>>>> --
>>>>>>>> thanks
>>>>>>>> mahadev
>>>>>>>> @mahadevkonar
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 3, 2011 at 2:07 PM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>>>>>>>> Hi.  We're considering ZooKeeper for coordinating operations across
>>>>>>>>> multiple data centers.  These data centers will occasionally be
>>>>>>>>> disconnected.  We were planning on using observers in remote data
>>>>>>>>> centers.  Our applications can survive being unable to *write* to
>>>>>>>>> ZooKeeper, but they do need to be able to read from it, even if the
>>>>>>>>> data were stale.
>>>>>>>>>
>>>>>>>>> On further examination, it looks like observers must always be
>>>>>>>>> connected to the quorum to function at all.  Is this correct?  Does
>>>>>>>>> anyone have suggestions for how to work around this problem?  The
>>>>>>>>> first thing that comes to mind is duplicating the required data in
>>>>>>>>> some other local data store and falling back on that when the DC
>>>>>>>>> becomes disconnected.  I imagine the disadvantages of that are obvious
>>>>>>>>> to everyone.  I hope someone can share some great idea that allows me
>>>>>>>>> to avoid that miserable fate.  Thanks.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Ketan Gangatirkar
>>>>>>>>> ketan@indeed.com
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ketan Gangatirkar
>>>>>>> ketan@indeed.com
>>>>>>> Perishable Developer
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ketan Gangatirkar
>>>>> ketan@indeed.com
>>>>> Perishable Developer
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Ketan Gangatirkar
>>> ketan@indeed.com
>>> Perishable Developer
>>>
>>
>
>
>
> --
> Ketan Gangatirkar
> ketan@indeed.com
> Perishable Developer
>
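
For reference, the workaround Ketan describes in the first message of the
thread (keep a local copy of the required data and fall back on it when the
data center is disconnected) might look roughly like the sketch below. It is
only an illustration under stated assumptions, not anything proposed in
ZOOKEEPER-784 or ZOOKEEPER-892: the class name StaleReadCache and the fetch
callable are hypothetical, no ZooKeeper client API is used, and the "local
data store" is reduced to an in-memory dict where a real deployment would
need something that survives restarts.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class StaleReadCache:
    """Read-through cache that serves the last known value when reads fail."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # ASSUMPTION: `fetch` is a hypothetical callable that reads one path
        # from the coordination service and raises ConnectionError when the
        # data center is disconnected. It is not a real ZooKeeper client call.
        self._fetch = fetch
        # path -> (last value seen, wall-clock time it was stored)
        self._local: Dict[str, Tuple[bytes, float]] = {}

    def read(self, path: str) -> Tuple[bytes, Optional[float]]:
        """Return (value, age_seconds); age is None when the value is fresh."""
        try:
            value = self._fetch(path)
        except ConnectionError:
            if path not in self._local:
                raise  # never read this path before, nothing to fall back on
            value, stored_at = self._local[path]
            return value, time.time() - stored_at
        # Successful read: refresh the local copy before returning.
        self._local[path] = (value, time.time())
        return value, None
```

A caller would treat a non-None age as "possibly stale" and decide for itself
how old is too old. The sketch also shows the disadvantage Ketan alludes to:
the application now owns a second copy of the data and its freshness rules.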
