zookeeper-user mailing list archives

From Jun Rao <jun...@gmail.com>
Subject Re: lost ZK events across datacenters
Date Fri, 10 Jun 2011 06:02:14 GMT
Hmm, those logs are pretty big, there is a 67MB file per hour.

Jun

On Wed, Jun 8, 2011 at 2:02 PM, Benjamin Reed <breed@apache.org> wrote:

> yes, the LogFormatter class will do it for me.
>
> ben
>
> On Wed, Jun 8, 2011 at 1:55 PM, Jun Rao <junrao@gmail.com> wrote:
> > Ben,
> >
> > The log is binary. Is there a log reader? Also, can I just look at the log
> > on any zookeeper server?
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Jun 3, 2011 at 10:18 AM, Benjamin Reed <breed@apache.org> wrote:
> >
> >> actually, i think the transaction log could help a lot, and that will
> >> always be there. two scenarios i can think of are:
> >> 1) the change happened before the watch was set
> >> 2) the change never got there
> >> you could get an answer to both of those questions by looking at the
> >> transaction log.
> >>
> >> ben
> >>
> >> On Fri, Jun 3, 2011 at 9:59 AM, Jun Rao <junrao@gmail.com> wrote:
> >> > I don't expect that we can discover the problem right now. However, what are
> >> > the things that I can do to collect enough tracing should the problem occur
> >> > again in the future (e.g., is INFO level logging enough)?
> >> >
> >> > Thanks,
> >> >
> >> > Jun
> >> >
> >> > On Fri, Jun 3, 2011 at 9:56 AM, Jun Rao <junrao@gmail.com> wrote:
> >> >
> >> >> The log doesn't have any state changing entries around the time the watcher
> >> >> is triggered, in all clients.
> >> >>
> >> >> Jun
> >> >>
> >> >>
> >> >> On Fri, Jun 3, 2011 at 9:32 AM, Fournier, Camille F. [Tech] <
> >> >> Camille.Fournier@gs.com> wrote:
> >> >>
> >> >>> Any state changes for the problem client between setting the watch and
> >> >>> when you expected it to get called? Do you have logs for that client vs the
> >> >>> others that show anything?
> >> >>>
> >> >>> -----Original Message-----
> >> >>> From: Jun Rao [mailto:junrao@gmail.com]
> >> >>> Sent: Friday, June 03, 2011 4:40 AM
> >> >>> To: user@zookeeper.apache.org
> >> >>> Subject: Re: lost ZK events across datacenters
> >> >>>
> >> >>> Ben,
> >> >>>
> >> >>> Some details below.
> >> >>>
> >> >>> The call that sets the watcher simply calls getChildren with the watcher flag
> >> >>> set to true. The triggering change is that one of the child nodes (which is
> >> >>> ephemeral) is deleted because the creating client is gone.
> >> >>>
> >> >>> Thanks,
> >> >>>
> >> >>> Jun
> >> >>>
> >> >>> On Thu, Jun 2, 2011 at 10:49 AM, Benjamin Reed <breed@apache.org> wrote:
> >> >>>
> >> >>> > can you tell us a bit more about the scenario? what was the call that
> >> >>> > set the watch event? and what were the changes that caused the event?
> >> >>> >
> >> >>> > thanx
> >> >>> > ben
> >> >>> >
> >> >>> > On Wed, Jun 1, 2011 at 3:14 PM, Jun Rao <junrao@gmail.com> wrote:
> >> >>> > > All my clients were on different machines. 2 of them got the watcher fired
> >> >>> > > about the same time. The third one never got the watcher triggered.
> >> >>> > >
> >> >>> > > Thanks,
> >> >>> > >
> >> >>> > > Jun
> >> >>> > >
> >> >>> > > On Wed, Jun 1, 2011 at 2:18 PM, Fournier, Camille F. [Tech] <
> >> >>> > > Camille.Fournier@gs.com> wrote:
> >> >>> > >
> >> >>> > >> All clients are in different processes?
> >> >>> > >> I've used zkclient and haven't seen any problems, but I haven't hammered it
> >> >>> > >> too hard yet. I took a long look at the code and didn't see any errors but
> >> >>> > >> there could always be something very subtle.
> >> >>> > >>
> >> >>> > >> -----Original Message-----
> >> >>> > >> From: Jun Rao [mailto:junrao@gmail.com]
> >> >>> > >> Sent: Wednesday, June 01, 2011 4:09 PM
> >> >>> > >> To: user@zookeeper.apache.org
> >> >>> > >> Subject: Re: lost ZK events across datacenters
> >> >>> > >>
> >> >>> > >> I am using the zkclient package (
> >> >>> > >> https://github.com/sgroschupf/zkclient.git).
> >> >>> > >> The watcher code seems reasonable. Basically, each watcher event is first
> >> >>> > >> added to a queue. A separate event thread dequeues each event and reads the
> >> >>> > >> children of a path (which re-registers the watcher) and invokes the
> >> >>> > >> registered listener.
> >> >>> > >>
> >> >>> > >> Does anybody know of any issues in zkclient?
> >> >>> > >>
> >> >>> > >> Thanks,
> >> >>> > >>
> >> >>> > >> Jun
> >> >>> > >>
> >> >>> > >> On Wed, Jun 1, 2011 at 12:04 PM, Ted Dunning <ted.dunning@gmail.com>
> >> >>> > >> wrote:
> >> >>> > >>
> >> >>> > >> > This is most commonly due, in my own history of programming errors, to
> >> >>> > >> > writing code that has a race window in it.  It is conceivable that cross
> >> >>> > >> > data-center operation would make such a race more of a problem.
> >> >>> > >> >
> >> >>> > >> > Can you say a bit about your code?  Did you make sure to use standard
> >> >>> > >> > idioms as opposed to setting the watch in a different call from reading the
> >> >>> > >> > data?
> >> >>> > >> >
> >> >>> > >> > On Wed, Jun 1, 2011 at 11:40 AM, Jun Rao <junrao@gmail.com> wrote:
> >> >>> > >> >
> >> >>> > >> > > Hi,
> >> >>> > >> > >
> >> >>> > >> > > I have a setup where multiple ZK clients are sitting in a different
> >> >>> > >> > > datacenter from the ZK server. All clients registered the same child
> >> >>> > >> > > watcher on a path. However, when the children of the path changed, the
> >> >>> > >> > > watcher on 1 of the clients didn't fire. This seems to have happened a
> >> >>> > >> > > couple of times to me. I am using ZK 3.3.3. Has anyone used ZK in a cross
> >> >>> > >> > > datacenter setup and seen problems like that before?
> >> >>> > >> > >
> >> >>> > >> > > Thanks,
> >> >>> > >> > >
> >> >>> > >> > > Jun
> >> >>> > >> > >
> >> >>> > >> >
> >> >>> > >>
> >> >>> > >
> >> >>> >
> >> >>>
> >> >>
> >> >>
> >> >
> >>
> >
>
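
The first scenario Ben lists in the thread (the change happened before the watch was set) can be sketched with a toy in-memory model of one-shot child watches. This is only an illustration under stated assumptions: it is not the ZooKeeper client API, and `ToyZNode`, `run_scenario`, and the child names `broker-1` and `broker-2` are hypothetical names invented for the sketch.

```python
class ToyZNode:
    """Toy stand-in for a parent znode with one-shot child watches.

    Hypothetical model for illustration only; not the ZooKeeper API.
    """

    def __init__(self, children):
        self.children = set(children)
        self._watchers = []  # callbacks armed by get_children(watch=...)

    def get_children(self, watch=None):
        # Read the children and arm the watch in the same call.
        if watch is not None:
            self._watchers.append(watch)
        return sorted(self.children)

    def delete_child(self, name):
        if name not in self.children:
            return  # nothing changed, so nothing fires
        self.children.remove(name)
        # One-shot semantics: each armed watch fires once and is cleared.
        fired, self._watchers = self._watchers, []
        for callback in fired:
            callback("NodeChildrenChanged")


def run_scenario(delete_before_watch):
    """Return (children seen by the client, events the client received)."""
    node = ToyZNode(["broker-1", "broker-2"])
    events = []
    if delete_before_watch:
        node.delete_child("broker-2")  # change lands before the watch is armed
        seen = node.get_children(watch=events.append)
    else:
        seen = node.get_children(watch=events.append)
        node.delete_child("broker-2")  # change lands after the watch is armed
    return seen, events
```

In this model, when the deletion lands first the client receives no event, but the list returned by the same call that armed the watch already reflects the deletion. That is one reason comparing the transaction log timestamp of the delete against the time the watch was registered can help distinguish the two scenarios.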
