zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: observers in occasionally disconnected data centers
Date Wed, 04 May 2011 17:53:29 GMT
This is odd. It's failing in the C tests, but for a weird reason:

in:
https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/247/artifact/trunk/build/tmp/zk.log

it says:
/grid/0/hudson/hudson-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/c/tests/zkServer.sh: line 115: java: command not found

I'll ping the hudson admins and see if this is a known issue (also
hudson is very slow today for some reason).

Once that's addressed we should be good to go.

Patrick

On Wed, May 4, 2011 at 9:57 AM, Ketan Gangatirkar <ketan@indeed.com> wrote:
> Got the patch formatted right and applying successfully, now I'll see
> if I can figure out the unit test failure.
>
> On Wed, May 4, 2011 at 11:26 AM, Patrick Hunt <phunt@apache.org> wrote:
>> Hi Ketan, the patch is failing to apply
>> https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/246//console
>>
>> Looks like you used git, I usually do something like:
>> git diff rev1..rev2 --no-prefix > ZOOKEEPER-784.patch
>> can you give it another try?
>>
>> Patrick
>>
>> On Tue, May 3, 2011 at 6:42 PM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>> I have updated Sergey's patch to:
>>>
>>> * apply to current trunk
>>> * incorporate one trivial output change he made to StatCommand in
>>> NettyServerCnxn.java
>>> * change log4j references to slf4j
>>>
>>> I have successfully run ant releaseaudit on the result.  The updated
>>> patch is now attached to the issue:
>>>
>>> https://issues.apache.org/jira/browse/ZOOKEEPER-784
>>>
>>> I do *not* make any claim to have understood the contents of this
>>> patch; all I did was synch everything and fix the obvious log4j/slf4j
>>> change.  Now what?
>>>
>>>
>>> On Tue, May 3, 2011 at 5:46 PM, Patrick Hunt <phunt@apache.org> wrote:
>>>> The core tests failed on last hudson, I just kicked off a patch build,
>>>> seems recent changes (logging?) have caused the patch to stop
>>>> applying:
>>>> https://hudson.apache.org/hudson/view/S-Z/view/ZooKeeper/job/PreCommit-ZOOKEEPER-Build/238/console
>>>>
>>>> Ketan would you like to try updating the patch and resubmit?
>>>>
>>>> Patrick
>>>>
>>>> On Tue, May 3, 2011 at 3:31 PM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>>>> Thanks, Mahadev.  I had seen ZOOKEEPER-892 but not ZOOKEEPER-784.  The
>>>>> latter may be what we need.
>>>>>
>>>>> I read the comments attached to that issue.  The most recent comment
>>>>> was a Hudson CI message indicating that the tests against the patch
>>>>> failed.  I was not able to find out more as it appears that the
>>>>> configuration of the Apache Hudson has changed.  It appears that the
>>>>> patch was approved but not merged into trunk, and it's now in limbo.
>>>>> What is necessary to get that feature into the next release?  I may be
>>>>> able to assist, depending on what's involved.  Thank you.
>>>>>
>>>>>
>>>>> On Tue, May 3, 2011 at 4:17 PM, Mahadev Konar <mahadev@apache.org> wrote:
>>>>>> Hi Ketan,
>>>>>>  You are correct that observers need connection to quorum as well.
>>>>>> There have been quite a few discussions on multi colo replication and
>>>>>> read only mode of ZooKeeper.
>>>>>>
>>>>>> Here are the jiras for those:
>>>>>>
>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-784
>>>>>> and
>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-892
>>>>>>
>>>>>> These have been mostly targeted at exactly a use case like yours.
>>>>>> Please take a look at them and feel free to contribute/comment on the
>>>>>> jiras.
>>>>>>
>>>>>> --
>>>>>> thanks
>>>>>> mahadev
>>>>>> @mahadevkonar
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, May 3, 2011 at 2:07 PM, Ketan Gangatirkar <ketan@indeed.com> wrote:
>>>>>>> Hi.  We're considering ZooKeeper for coordinating operations across
>>>>>>> multiple data centers.  These data centers will occasionally be
>>>>>>> disconnected.  We were planning on using observers in remote data
>>>>>>> centers.  Our applications can survive being unable to *write* to
>>>>>>> ZooKeeper, but they do need to be able to read from it, even if the
>>>>>>> data were stale.
>>>>>>>
>>>>>>> On further examination, it looks like observers must always be
>>>>>>> connected to the quorum to function at all.  Is this correct?  Does
>>>>>>> anyone have suggestions for how to work around this problem?  The
>>>>>>> first thing that comes to mind is duplicating the required data in
>>>>>>> some other local data store and falling back on that when the DC
>>>>>>> becomes disconnected.  I imagine the disadvantages of that are obvious
>>>>>>> to everyone.  I hope someone can share some great idea that allows me
>>>>>>> to avoid that miserable fate.  Thanks.
>>>>>>>
>>>>>>> --
>>>>>>> Ketan Gangatirkar
>>>>>>> ketan@indeed.com
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Ketan Gangatirkar
>>>>> ketan@indeed.com
>>>>> Perishable Developer
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Ketan Gangatirkar
>>> ketan@indeed.com
>>> Perishable Developer
>>>
>>
>
>
>
> --
> Ketan Gangatirkar
> ketan@indeed.com
> Perishable Developer
>
