flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Robert Metzger <rmetz...@apache.org>
Subject Re: Checkpoint
Date Thu, 10 Mar 2016 10:23:55 GMT
Hi Vijay,

regarding your other questions:

1) On the TaskManagers, the FlinkKafkaConsumers will write the partitions
they are going to read in the log. There is currently no way of seeing the
state of a checkpoint in Flink (which is the offsets).
However, once a checkpoint is completed, the Kafka consumer is committing
the offset to the Kafka broker. (I could not find tool to get the committed
offsets from the broker, but its either stored in ZK or in a special topic
by the broker. In Kafka 0.8 that's easily doable with the
kafka.tools.ConsumerOffsetChecker)

2) Do you see duplicate data written by the rolling file sink? Or do you
see it somewhere else?
HDP 2.4 is using Hadoop 2.7.1 so the truncate() of invalid data should
actually work properly.




On Thu, Mar 10, 2016 at 10:44 AM, Ufuk Celebi <uce@apache.org> wrote:

> How many vertices does the web interface show and what parallelism are
> you running? If the sleeping operator is chained you will not see
> anything.
>
> If your goal is to just see some back pressure warning, you can call
> env.disableOperatorChaining() and re-run the program. Does this work?
>
> – Ufuk
>
>
> On Thu, Mar 10, 2016 at 1:35 AM, Vijay Srinivasaraghavan
> <vijikarthi@yahoo.com> wrote:
> > Hi Ufuk,
> >
> > I have increased the sampling size to 1000 and decreased the refresh
> > interval by half. In my Kafka topic I have pumped million messages which
> is
> > read by KafkaConsumer pipeline and then pass it to a transofmation step
> > where I have introduced sleep (3 sec) for every single message received
> and
> > the final step is HDFS sink using RollingSinc API.
> >
> > jobmanager.web.backpressure.num-samples: 1000
> > jobmanager.web.backpressure.refresh-interval: 30000
> >
> >
> > I was hoping to see the backpressure tab from UI to display some warning
> but
> > I still see "OK" message.
> >
> > This makes me wonder if I am testing the backpressure scenario properly
> or
> > not?
> >
> > Regards
> > Vijay
> >
> > On Monday, March 7, 2016 3:19 PM, Ufuk Celebi <uce@apache.org> wrote:
> >
> >
> > Hey Vijay!
> >
> > On Mon, Mar 7, 2016 at 8:42 PM, Vijay Srinivasaraghavan
> > <vijikarthi@yahoo.com> wrote:
> >> 3) How can I simulate and verify backpressure? I have introduced some
> >> delay
> >> (Thread Sleep) in the job before the sink but the "backpressure" tab
> from
> >> UI
> >> does not show any indication of whether backpressure is working or not.
> >
> > If a task is slow, it is back pressuring upstream tasks, e.g. if your
> > transformations have the sleep, the sources should be back pressured.
> > It can happen that even with the sleep the tasks still produce their
> > data as fast as they can and hence no back pressure is indicated in
> > the web interface. You can increase the sleep to check this.
> >
> > The mechanism used to determine back pressure is based on sampling the
> > stack traces of running tasks. You can increase the number of samples
> > and/or decrease the delay between samples via config parameters shown
> > in [1]. It can happen that the samples miss the back pressure
> > indicators, but usually the defaults work fine.
> >
> >
> > [1]
> >
> https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#jobmanager-web-frontend
> >
> >
> >
>

Mime
View raw message