kafka-dev mailing list archives

From Gabriel Ibarra <gabriel.iba...@tallertechnologies.com>
Subject Re: Strange behavior when turn the system clock back
Date Thu, 11 Aug 2016 13:19:50 GMT
Thanks for answering, all help is welcome.

Yes, I tested without changing the clock and it works well.
Actually, both consumers are running in different processes,
so I don't think the case you mention applies here.

I even tested this with two different Kafka clients,
the Java client and edenhill's librdkafka (a C client),
and I got the same results.
That is why I think the problem comes from Kafka.
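
To illustrate what I suspect, here is a minimal sketch. It is only a hypothesis and is not taken from the Kafka source: it assumes that some timeout deadline is computed from the wall clock. `FakeClock`, `remaining_wall`, `remaining_mono` and the 30-second timeout are hypothetical names and values chosen for illustration. If the wall clock is set back one hour, the remaining wait grows by one hour, which would match the stall lasting about as long as the clock shift, while a deadline based on a monotonic clock is not affected.

```python
"""Simulated clocks: how a backward wall-clock jump stretches a deadline."""


class FakeClock:
    """Hypothetical clock with a settable wall time and a monotonic time."""

    def __init__(self, wall_ms: int) -> None:
        self.wall_ms = wall_ms  # can be set back by NTP or an administrator
        self.mono_ms = 0        # only ever moves forward

    def advance(self, ms: int) -> None:
        """Real time passes: both clocks move forward."""
        self.wall_ms += ms
        self.mono_ms += ms

    def set_back(self, ms: int) -> None:
        """The system clock is turned back: only the wall time changes."""
        self.wall_ms -= ms


def remaining_wall(clock: FakeClock, deadline_wall_ms: int) -> int:
    """Time left until a deadline that was computed from the wall clock."""
    return max(0, deadline_wall_ms - clock.wall_ms)


def remaining_mono(clock: FakeClock, deadline_mono_ms: int) -> int:
    """Time left until a deadline that was computed from the monotonic clock."""
    return max(0, deadline_mono_ms - clock.mono_ms)


HOUR_MS = 3_600_000
TIMEOUT_MS = 30_000  # assumed timeout, for illustration only

clock = FakeClock(wall_ms=1_470_000_000_000)
wall_deadline = clock.wall_ms + TIMEOUT_MS
mono_deadline = clock.mono_ms + TIMEOUT_MS

clock.advance(10_000)    # 10 seconds pass normally
clock.set_back(HOUR_MS)  # then the system clock is turned back 1 hour

print(remaining_wall(clock, wall_deadline))  # 3620000 ms: 20 s plus the 1 hour shift
print(remaining_mono(clock, mono_deadline))  # 20000 ms: unaffected by the shift
```

In a real program the monotonic source would be something like `time.monotonic()` in Python or `System.nanoTime()` in Java, instead of the simulated clock used here.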

Gabriel


On Thu, Aug 11, 2016 at 2:20 AM, Gwen Shapira <gwen@confluent.io> wrote:

> I know it sounds silly, but did you check that your test setup works
> when you don't change the clock?
>
> This pattern can happen when two consumers somehow block each other
> (for example, one thread with two consumers) - so one waits for the
> other to join, but the other is blocked, so the first is timed out and
> then the second is unblocked and manages to join but now the first is
> blocked and so on...
>
> Gwen
>
> On Wed, Aug 10, 2016 at 10:29 AM, Gabriel Ibarra
> <gabriel.ibarra@tallertechnologies.com> wrote:
> > Hello guys, I am dealing with an issue when the system clock is turned
> > back (either due to NTP or administrator action). I'm using
> > kafka_2.11-0.10.0.0
> >
> > I follow these steps:
> > - Start a consumer for TOPIC_NAME with group id GROUP_NAME. It will be
> > the owner of all the partitions.
> > - Turn the system clock back, for instance by 1 hour.
> > - Start a new consumer for TOPIC_NAME using the same group id. This will
> > force a rebalance.
> >
> > After these actions the Kafka server constantly logs the messages
> > below, and after
> > a while both consumers stop receiving messages. I saw that this
> > condition lasts at least as long as the time the clock went back (1 hour
> > in this example), and after that time Kafka starts working again.
> >
> > [2016-08-08 11:30:23,023] INFO [GroupCoordinator 0]: Preparing to
> > restabilize group GROUP_NAME with old generation 2 (kafka.coordinator.
> > GroupCoordinator)
> > [2016-08-08 11:30:23,025] INFO [GroupCoordinator 0]: Stabilized group
> > GROUP_NAME generation 3 (kafka.coordinator.GroupCoordinator)
> > [2016-08-08 11:30:23,027] INFO [GroupCoordinator 0]: Preparing to
> > restabilize group GROUP_NAME with old generation 3 (kafka.coordinator.
> > GroupCoordinator)
> > [2016-08-08 11:30:23,029] INFO [GroupCoordinator 0]: Group GROUP_NAME
> > generation 3 is dead and removed (kafka.coordinator.GroupCoordinator)
> > [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Preparing to
> > restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.
> > GroupCoordinator)
> > [2016-08-08 11:30:23,032] INFO [GroupCoordinator 0]: Stabilized group
> > GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> > [2016-08-08 11:30:23,033] INFO [GroupCoordinator 0]: Preparing to
> > restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.
> > GroupCoordinator)
> > [2016-08-08 11:30:23,034] INFO [GroupCoordinator 0]: Group GROUP
> > generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> > [2016-08-08 11:30:23,043] INFO [GroupCoordinator 0]: Preparing to
> > restabilize group GROUP_NAME with old generation 0 (kafka.coordinator.
> > GroupCoordinator)
> > [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Stabilized group
> > GROUP_NAME generation 1 (kafka.coordinator.GroupCoordinator)
> > [2016-08-08 11:30:23,044] INFO [GroupCoordinator 0]: Preparing to
> > restabilize group GROUP_NAME with old generation 1 (kafka.coordinator.
> > GroupCoordinator)
> > [2016-08-08 11:30:23,045] INFO [GroupCoordinator 0]: Group GROUP_NAME
> > generation 1 is dead and removed (kafka.coordinator.GroupCoordinator)
> >
> > IMHO, Kafka's consumers should keep working after any change of the
> > system clock, but maybe there are reasons for this behavior that I don't
> > know.
> >
> > I'm sorry if this was discussed previously; I searched but I didn't
> > find a similar issue.
> >
> > Thanks,
> >
> > --
> >
> >
> >
> > Gabriel Alejandro Ibarra
> >
> > Software Engineer
> >
> > San Lorenzo 47, 3rd Floor, Office 5
> >
> > Córdoba, Argentina
> >
> > Phone: +54 351 4217888
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>



-- 



Gabriel Alejandro Ibarra

Software Engineer

San Lorenzo 47, 3rd Floor, Office 5

Córdoba, Argentina

Phone: +54 351 4217888
