zookeeper-user mailing list archives

From Benjamin Reed <br...@apache.org>
Subject Re: lost ZK events across datacenters
Date Wed, 08 Jun 2011 21:02:42 GMT
yes, the LogFormatter class will do it for you.

ben
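
For the first scenario quoted below (the change happened before the watch was set), here is a rough sketch of how a one-shot watch can miss a change. It is a self-contained Java model, not ZooKeeper or zkclient code: FakeNode, ChildWatcher and watchOnly are made-up names for illustration, and a real client would register the watch through getChildren with the watch flag set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a znode with one-shot child watches.
// This is NOT the ZooKeeper API; every name here is an assumption.
class WatchRaceSketch {

    interface ChildWatcher {
        void childrenChanged();
    }

    static class FakeNode {
        private final TreeSet<String> children = new TreeSet<>();
        private final List<ChildWatcher> watchers = new ArrayList<>();

        // Plain read: no watch is registered.
        List<String> getChildren() {
            return new ArrayList<>(children);
        }

        // Read and register a one-shot watch in a single step.
        List<String> getChildren(ChildWatcher w) {
            watchers.add(w);
            return new ArrayList<>(children);
        }

        // Register a watch without reading (models the split-call pattern).
        void watchOnly(ChildWatcher w) {
            watchers.add(w);
        }

        void create(String child) {
            if (children.add(child)) fire();
        }

        void delete(String child) {
            if (children.remove(child)) fire();
        }

        // One-shot semantics: registered watchers are cleared, then notified once.
        private void fire() {
            List<ChildWatcher> toNotify = new ArrayList<>(watchers);
            watchers.clear();
            for (ChildWatcher w : toNotify) w.childrenChanged();
        }
    }

    // Read first, set the watch in a later call; the delete lands in between.
    // Returns whether the client was notified of the delete.
    static boolean readThenWatchSeparately() {
        FakeNode node = new FakeNode();
        node.create("a");
        node.create("b");
        boolean[] notified = {false};
        node.getChildren();                        // client sees [a, b]
        node.delete("b");                          // change arrives in the gap
        node.watchOnly(() -> notified[0] = true);  // watch is set too late
        return notified[0];
    }

    // Read and set the watch in one call; a later delete triggers the watch.
    static boolean readAndWatchTogether() {
        FakeNode node = new FakeNode();
        node.create("a");
        node.create("b");
        boolean[] notified = {false};
        node.getChildren(() -> notified[0] = true);
        node.delete("b");
        return notified[0];
    }

    public static void main(String[] args) {
        System.out.println("separate calls notified: " + readThenWatchSeparately());
        System.out.println("single call notified:    " + readAndWatchTogether());
    }
}
```

In this model the separate-call case reports false and the single-call case reports true. Whether that is what happened on the third client is something only the transaction log can show.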

On Wed, Jun 8, 2011 at 1:55 PM, Jun Rao <junrao@gmail.com> wrote:
> Ben,
>
> The log is binary. Is there a log reader? Also, can I just look at the log
> on any zookeeper server?
>
> Thanks,
>
> Jun
>
> On Fri, Jun 3, 2011 at 10:18 AM, Benjamin Reed <breed@apache.org> wrote:
>
>> actually, i think the transaction log could help a lot, and that will
>> always be there. two scenarios i can think of are:
>> 1) the change happened before the watch was set
>> 2) the change never got there
>> you could get an answer to both of those questions by looking at the
>> transaction log.
>>
>> ben
>>
>> On Fri, Jun 3, 2011 at 9:59 AM, Jun Rao <junrao@gmail.com> wrote:
>> > I don't expect that we can discover the problem right now. However, what
>> > are the things that I can do to collect enough tracing should the problem
>> > occur again in the future (e.g., is INFO level logging enough)?
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> > On Fri, Jun 3, 2011 at 9:56 AM, Jun Rao <junrao@gmail.com> wrote:
>> >
>> >> The log doesn't have any state changing entries around the time the
>> >> watcher is triggered, in all clients.
>> >>
>> >> Jun
>> >>
>> >>
>> >> On Fri, Jun 3, 2011 at 9:32 AM, Fournier, Camille F. [Tech] <
>> >> Camille.Fournier@gs.com> wrote:
>> >>
>> >>> Any state changes for the problem client between setting the watch and
>> >>> when you expected it to get called? Do you have logs for that client
>> >>> vs the others that show anything?
>> >>>
>> >>> -----Original Message-----
>> >>> From: Jun Rao [mailto:junrao@gmail.com]
>> >>> Sent: Friday, June 03, 2011 4:40 AM
>> >>> To: user@zookeeper.apache.org
>> >>> Subject: Re: lost ZK events across datacenters
>> >>>
>> >>> Ben,
>> >>>
>> >>> Some details below.
>> >>>
>> >>> The call that sets the watcher simply calls getChildren with the watcher
>> >>> flag set to true. The triggering change is that one of the child nodes
>> >>> (which is ephemeral) is deleted because the creating client is gone.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Jun
>> >>>
>> >>> On Thu, Jun 2, 2011 at 10:49 AM, Benjamin Reed <breed@apache.org> wrote:
>> >>>
>> >>> > can you tell us a bit more about the scenario? what was the call that
>> >>> > set the watch event? and what were the changes that caused the event?
>> >>> >
>> >>> > thanx
>> >>> > ben
>> >>> >
>> >>> > On Wed, Jun 1, 2011 at 3:14 PM, Jun Rao <junrao@gmail.com> wrote:
>> >>> > > All my clients were on different machines. 2 of them got the watcher
>> >>> > > fired about the same time. The third one never got the watcher triggered.
>> >>> > >
>> >>> > > Thanks,
>> >>> > >
>> >>> > > Jun
>> >>> > >
>> >>> > > On Wed, Jun 1, 2011 at 2:18 PM, Fournier, Camille F. [Tech] <Camille.Fournier@gs.com> wrote:
>> >>> > >
>> >>> > >> All clients are in different processes?
>> >>> > >> I've used zkclient and haven't seen any problems, but I haven't
>> >>> > >> hammered it too hard yet. I took a long look at the code and didn't
>> >>> > >> see any errors but there could always be something very subtle.
>> >>> > >>
>> >>> > >> -----Original Message-----
>> >>> > >> From: Jun Rao [mailto:junrao@gmail.com]
>> >>> > >> Sent: Wednesday, June 01, 2011 4:09 PM
>> >>> > >> To: user@zookeeper.apache.org
>> >>> > >> Subject: Re: lost ZK events across datacenters
>> >>> > >>
>> >>> > >> I am using the zkclient package (
>> >>> > >> https://github.com/sgroschupf/zkclient.git).
>> >>> > >> The watcher code seems reasonable. Basically, each watcher event is
>> >>> > >> first added to a queue. A separate event thread dequeues each event and
>> >>> > >> reads the children of a path (which re-registers the watcher) and
>> >>> > >> invokes the registered listener.
>> >>> > >>
>> >>> > >> Does anybody know of any issues in zkclient?
>> >>> > >>
>> >>> > >> Thanks,
>> >>> > >>
>> >>> > >> Jun
>> >>> > >>
>> >>> > >> On Wed, Jun 1, 2011 at 12:04 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>> >>> > >>
>> >>> > >> > This is most commonly due, in my own history of programming errors,
>> >>> > >> > to writing code that has a race window in it.  It is conceivable that
>> >>> > >> > cross data-center operation would make such a race more of a problem.
>> >>> > >> >
>> >>> > >> > Can you say a bit about your code?  Did you make sure to use standard
>> >>> > >> > idioms as opposed to setting the watch in a different call from reading
>> >>> > >> > the data?
>> >>> > >> >
>> >>> > >> > On Wed, Jun 1, 2011 at 11:40 AM, Jun Rao <junrao@gmail.com> wrote:
>> >>> > >> >
>> >>> > >> > > Hi,
>> >>> > >> > >
>> >>> > >> > > I have a setup where multiple ZK clients are sitting in a different
>> >>> > >> > > datacenter from the ZK server. All clients registered the same child
>> >>> > >> > > watcher on a path. However, when the children of the path changed, the
>> >>> > >> > > watcher on 1 of the clients didn't fire. This seems to have happened a
>> >>> > >> > > couple of times to me. I am using ZK 3.3.3. Has anyone used ZK in a
>> >>> > >> > > cross datacenter setup and seen problems like that before?
>> >>> > >> > >
>> >>> > >> > > Thanks,
>> >>> > >> > >
>> >>> > >> > > Jun
>> >>> > >> > >
>> >>> > >> >
>> >>> > >>
>> >>> > >
>> >>> >
>> >>>
>> >>
>> >>
>> >
>>
>
