samza-dev mailing list archives

From Yan Fang <yanfang...@gmail.com>
Subject Re: Errors and hung job on broker shutdown
Date Wed, 29 Apr 2015 23:30:41 GMT
Not sure about the Kafka side. On the Samza side, going by your
description ("does not exit nor does it make any progress"), I think the
code is stuck in producer.close
<https://github.com/apache/samza/blob/master/samza-kafka/src/main/scala/org/apache/samza/system/kafka/KafkaSystemProducer.scala#L143>.
Otherwise it would throw a SamzaException and the job would quit. Maybe some
Kafka experts on this list or on the Kafka mailing list can help.
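
If the hang really is a close call that never returns, one generic way to
keep a job from blocking forever is to run the close on a helper thread and
give up after a deadline. The sketch below only illustrates that idea. It is
not Samza or Kafka code: the names BoundedClose and closeWithTimeout are
hypothetical, and it assumes the producer's own close call offers no timeout.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Samza or Kafka.
final class BoundedClose {
    private BoundedClose() {}

    /**
     * Runs closeAction on a daemon thread and waits at most the given timeout.
     * Returns true if the action finished in time, false if it timed out or
     * the calling thread was interrupted while waiting.
     */
    static boolean closeWithTimeout(Runnable closeAction, long timeout, TimeUnit unit) {
        // Daemon thread, so a close that never returns cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-close");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = executor.submit(closeAction);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            // Best effort: interrupt the stuck close and report the timeout.
            future.cancel(true);
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag for the caller.
            Thread.currentThread().interrupt();
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            // The close itself threw; surface the original cause.
            throw new RuntimeException("close failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller could pass something like `() -> producer.close()` and, on a false
return, log the timeout and exit the container rather than wait indefinitely.
Interrupting the helper thread is only best effort, since a close that ignores
interrupts will keep running in the background.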

Fang, Yan
yanfang724@gmail.com

On Tue, Apr 28, 2015 at 5:35 PM, Roger Hoover <roger.hoover@gmail.com>
wrote:

> At error level logging, this was the only entry in the Samza log:
>
> 2015-04-28 14:28:25 KafkaSystemProducer [ERROR] task[Partition 2]
> ssp[kafka,svc.call.w_deploy.c7tH4YaiTQyBEwAAhQzRXw,2] offset[9129395]
> Unable to send message from TaskName-Partition 1 to system kafka
>
> Here is the log from the Kafka broker that was shut down.
>
> http://pastebin.com/afgmLyNF
>
> Thanks,
>
> Roger
>
>
> On Tue, Apr 28, 2015 at 3:49 PM, Yi Pan <nickpan47@gmail.com> wrote:
>
> > Roger, could you paste the full log from Samza container? If you can figure
> > out which Kafka broker the message was sent to, it would be helpful if we
> > get the log from the broker as well.
> >
> > On Tue, Apr 28, 2015 at 3:31 PM, Roger Hoover <roger.hoover@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > I need some help figuring out what's going on.
> > >
> > > I'm running Kafka 0.8.2.1 and Samza 0.9.0 on YARN.  All the topics have
> > > replication factor of 2.
> > >
> > > I'm bouncing the Kafka broker using SIGTERM (with
> > > controlled.shutdown.enable=true).  I see the Samza job log this message and
> > > then hang (does not exit nor does it make any progress).
> > >
> > > 2015-04-28 14:28:25 KafkaSystemProducer [ERROR] task[Partition 2]
> > > ssp[kafka,my-topic,2] offset[9129395] Unable to send message from
> > > TaskName-Partition 1 to system kafka
> > >
> > > The Kafka consumer (Druid Real-Time node) on the other side then barfs on
> > > the message:
> > >
> > > Exception in thread "chief-svc-perf" kafka.message.InvalidMessageException:
> > > Message is corrupt (stored crc = 1792882425, computed crc = 3898271689)
> > > at kafka.message.Message.ensureValid(Message.scala:166)
> > > at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:101)
> > > at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
> > > at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
> > > at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
> > > at io.druid.firehose.kafka.KafkaEightFirehoseFactory$1.hasMore(KafkaEightFirehoseFactory.java:106)
> > > at io.druid.segment.realtime.RealtimeManager$FireChief.run(RealtimeManager.java:234)
> > >
> > > My questions are:
> > > 1) What is the right way to bounce a Kafka broker?
> > > 2) Is this a bug in Samza that the job hangs after producer request fails?
> > > Has anyone seen this?
> > >
> > > Thanks,
> > >
> > > Roger
> > >
> >
>
