flink-dev mailing list archives

From Flavio Pompermaier <pomperma...@okkam.it>
Subject Re: [DISCUSS] Release Flink 1.1.5 / Flink 1.2.1
Date Fri, 17 Mar 2017 15:22:55 GMT
I propose fixing https://issues.apache.org/jira/browse/FLINK-6103 before
issuing a release.

On Fri, Mar 17, 2017 at 8:12 AM, Ufuk Celebi <uce@apache.org> wrote:

> Cool! Thanks for taking care of this Gordon :-)
>
> On Fri, Mar 17, 2017 at 7:13 AM, Tzu-Li (Gordon) Tai
> <tzulitai@apache.org> wrote:
> > Update for 1.1.5:
> > The last fixes for 1.1.5 are in! I will create the RC today and start
> the vote.
> >
> > Cheers,
> > Gordon
> >
> >
> > On March 17, 2017 at 1:14:53 AM, Robert Metzger (rmetzger@apache.org)
> wrote:
> >
> > The Cassandra connector is probably not usable in Flink 1.2.0. I would like
> > to include a fix in 1.2.1:
> > https://issues.apache.org/jira/browse/FLINK-6084
> >
> > Please let me know if this fix becomes a blocker for the 1.2.1 release. If
> > so, I can validate the fix myself to speed things up.
> >
> > On Thu, Mar 16, 2017 at 9:41 AM, Jinkui Shi <shijinkui666@163.com>
> wrote:
> >
> >> @Tzu-Li (Gordon) Tai
> >>
> >> FLINK-5650 is fixed by [1]. Chesnay Schepler, please push a PR.
> >>
> >> [1] https://github.com/zentol/flink/tree/5650_python_test_debug
> >>
> >>
> >> > On March 16, 2017, at 3:37 AM, Stephan Ewen <sewen@apache.org> wrote:
> >> >
> >> > Thanks for the update!
> >> >
> >> > Just merged to 1.2.1 also: [FLINK-5962] [checkpoints] Remove scheduled
> >> > cancel-task from timer queue to prevent memory leaks
> >> >
> >> > The remaining issue list looks good, but I would say that (5) is
> >> optional.
> >> > It is not a critical production bug.
> >> >
> >> >
> >> >
> >> > On Wed, Mar 15, 2017 at 5:38 PM, Tzu-Li (Gordon) Tai <
> >> tzulitai@apache.org>
> >> > wrote:
> >> >
> >> >> Thanks a lot for the updates so far everyone!
> >> >>
> >> >> From the discussion so far, below are the still-unfixed pending issues
> >> >> for the 1.1.5 / 1.2.1 releases.
> >> >>
> >> >> Since there’s only one backport for 1.1.5 left, I think having an RC for
> >> >> 1.1.5 near the end of this week / early next week is very promising, as
> >> >> basically everything is already in.
> >> >> I’d be happy to volunteer to help manage the release for 1.1.5, and
> >> >> prepare the RC when it’s ready :)
> >> >>
> >> >> For 1.2.1, we can leave the pending list here for tracking, and come back
> >> >> to update it in the near future.
> >> >>
> >> >> If there’s anything I missed, please let me know!
> >> >>
> >> >>
> >> >> =========== Still pending for Flink 1.1.5 ===========
> >> >>
> >> >> (1) https://issues.apache.org/jira/browse/FLINK-5701
> >> >> Broken at-least-once Kafka producer.
> >> >> Status: backport PR pending - https://github.com/apache/flink/pull/3549.
> >> >> Since it is a relatively self-contained change, I expect this to be a fast
> >> >> fix.
> >> >>
> >> >>
> >> >>
> >> >> =========== Still pending for Flink 1.2.1 ===========
> >> >>
> >> >> (1) https://issues.apache.org/jira/browse/FLINK-5808
> >> >> Fix Missing verification for setParallelism and setMaxParallelism
> >> >> Status: PR - https://github.com/apache/flink/pull/3509, review in progress
> >> >>
> >> >> (2) https://issues.apache.org/jira/browse/FLINK-5713
> >> >> Protect against NPE in WindowOperator window cleanup
> >> >> Status: PR - https://github.com/apache/flink/pull/3535, review pending
> >> >>
> >> >> (3) https://issues.apache.org/jira/browse/FLINK-6044
> >> >> TypeSerializerSerializationProxy.read() doesn't verify the read buffer
> >> >> length
> >> >> Status: Fixed for master, 1.2 backport pending
> >> >>
> >> >> (4) https://issues.apache.org/jira/browse/FLINK-5985
> >> >> Flink treats every task as stateful (making topology changes impossible)
> >> >> Status: PR - https://github.com/apache/flink/pull/3543, review in progress
> >> >>
> >> >> (5) https://issues.apache.org/jira/browse/FLINK-5650
> >> >> Flink-python tests taking up too much time
> >> >> Status: I think Chesnay currently has some progress with this one, we can
> >> >> see if we want to make this a blocker
> >> >>
> >> >>
> >> >> Cheers,
> >> >> Gordon
> >> >>
> >> >> On March 15, 2017 at 7:16:53 PM, Jinkui Shi (shijinkui666@163.com)
> >> wrote:
> >> >>
> >> >> Can we fix this issue in 1.2.1?
> >> >>
> >> >> Flink-python tests cost too long time
> >> >> https://issues.apache.org/jira/browse/FLINK-5650
> >> >>
> >> >>> On March 15, 2017, at 6:29 PM, Vladislav Pernin <vladislav.pernin@gmail.com> wrote:
> >> >>>
> >> >>> I just tested it in my reproducer. It works.
> >> >>>
> >> >>> 2017-03-15 11:22 GMT+01:00 Aljoscha Krettek <aljoscha@apache.org>:
> >> >>>
> >> >>>> I did in fact just open a PR for
> >> >>>>> https://issues.apache.org/jira/browse/FLINK-6001
> >> >>>>> NPE on TumblingEventTimeWindows with ContinuousEventTimeTrigger and
> >> >>>>> allowedLateness
> >> >>>>
> >> >>>>
> >> >>>> On Tue, Mar 14, 2017, at 18:20, Vladislav Pernin wrote:
> >> >>>>> Hi,
> >> >>>>>
> >> >>>>> I would also include the following (not yet resolved) issue in the 1.2.1
> >> >>>>> scope:
> >> >>>>>
> >> >>>>> https://issues.apache.org/jira/browse/FLINK-6001
> >> >>>>> NPE on TumblingEventTimeWindows with ContinuousEventTimeTrigger and
> >> >>>>> allowedLateness
> >> >>>>>
> >> >>>>> 2017-03-14 17:34 GMT+01:00 Ufuk Celebi <uce@apache.org>:
> >> >>>>>
> >> >>>>>> Big +1 Gordon!
> >> >>>>>>
> >> >>>>>> I think (10) is very critical to have in 1.2.1.
> >> >>>>>>
> >> >>>>>> – Ufuk
> >> >>>>>>
> >> >>>>>>
> >> >>>>>> On Tue, Mar 14, 2017 at 3:37 PM, Stefan Richter
> >> >>>>>> <s.richter@data-artisans.com> wrote:
> >> >>>>>>> Hi,
> >> >>>>>>>
> >> >>>>>>> I would suggest also including in 1.2.1:
> >> >>>>>>>
> >> >>>>>>> (9) https://issues.apache.org/jira/browse/FLINK-6044
> >> >>>>>>> Replaces unintentional calls to InputStream#read(…) with the intended
> >> >>>>>>> and correct InputStream#readFully(…)
> >> >>>>>>> Status: PR
> >> >>>>>>>
> >> >>>>>>> (10) https://issues.apache.org/jira/browse/FLINK-5985
> >> >>>>>>> Flink 1.2 was creating state handles for stateless tasks, which caused
> >> >>>>>>> trouble at restore time for users who wanted to make changes to their
> >> >>>>>>> topology that involve only stateless operators.
> >> >>>>>>> Status: PR
> >> >>>>>>>
> >> >>>>>>>
> >> >>>>>>>> On 14.03.2017 at 15:15, Till Rohrmann <trohrmann@apache.org> wrote:
> >> >>>>>>>>
> >> >>>>>>>> Thanks for kicking off the discussion Tzu-Li. I'd like to add the
> >> >>>>>>>> following issues, which have already been merged into the 1.2-release
> >> >>>>>>>> and 1.1-release branches:
> >> >>>>>>>>
> >> >>>>>>>> 1.2.1:
> >> >>>>>>>>
> >> >>>>>>>> (7) https://issues.apache.org/jira/browse/FLINK-5942
> >> >>>>>>>> Hardens the checkpoint recovery in case of corrupted ZooKeeper data.
> >> >>>>>>>> Corrupted checkpoints will now be skipped.
> >> >>>>>>>> Status: Merged
> >> >>>>>>>>
> >> >>>>>>>> (8) https://issues.apache.org/jira/browse/FLINK-5940
> >> >>>>>>>> Hardens the checkpoint recovery in case that we cannot retrieve the
> >> >>>>>>>> completed checkpoint from the meta data state handle retrieved from
> >> >>>>>>>> ZooKeeper. This can, for example, happen if the meta data is deleted.
> >> >>>>>>>> Checkpoints with unretrievable state handles are skipped.
> >> >>>>>>>> Status: Merged
> >> >>>>>>>>
> >> >>>>>>>> 1.1.5:
> >> >>>>>>>>
> >> >>>>>>>>
> >> >>>>>>>> (7) https://issues.apache.org/jira/browse/FLINK-5942
> >> >>>>>>>> Hardens the checkpoint recovery in case of corrupted ZooKeeper data.
> >> >>>>>>>> Corrupted checkpoints will now be skipped.
> >> >>>>>>>> Status: Merged
> >> >>>>>>>>
> >> >>>>>>>> (8) https://issues.apache.org/jira/browse/FLINK-5940
> >> >>>>>>>> Hardens the checkpoint recovery in case that we cannot retrieve the
> >> >>>>>>>> completed checkpoint from the meta data state handle retrieved from
> >> >>>>>>>> ZooKeeper. This can, for example, happen if the meta data is deleted.
> >> >>>>>>>> Checkpoints with unretrievable state handles are skipped.
> >> >>>>>>>> Status: Merged
> >> >>>>>>>>
> >> >>>>>>>> Cheers,
> >> >>>>>>>> Till
> >> >>>>>>>>
> >> >>>>>>>> On Tue, Mar 14, 2017 at 12:02 PM, Tzu-Li (Gordon) Tai <tzulitai@apache.org> wrote:
> >> >>>>>>>>
> >> >>>>>>>>> Hi all!
> >> >>>>>>>>>
> >> >>>>>>>>> I would like to start a discussion for the next bugfix releases for
> >> >>>>>>>>> 1.1.x and 1.2.x.
> >> >>>>>>>>> There have been quite a few critical fixes for bugs in both releases
> >> >>>>>>>>> recently, and I think they deserve a bugfix release soon.
> >> >>>>>>>>> Most of the bugs were reported by users.
> >> >>>>>>>>>
> >> >>>>>>>>> I’m starting the discussion for both bugfix releases because most fixes
> >> >>>>>>>>> span both releases (almost identical).
> >> >>>>>>>>> Of course, the actual RC votes and RC creation process don’t have to be
> >> >>>>>>>>> started together.
> >> >>>>>>>>>
> >> >>>>>>>>> Here’s an overview of what’s been collected so far, for both bugfix
> >> >>>>>>>>> releases -
> >> >>>>>>>>> (it’s a list of what I’m aware of so far, and may be missing stuff;
> >> >>>>>>>>> please append and bring to attention as necessary :-) )
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> For Flink 1.2.1:
> >> >>>>>>>>>
> >> >>>>>>>>> (1) https://issues.apache.org/jira/browse/FLINK-5701:
> >> >>>>>>>>> Async exceptions in the FlinkKafkaProducer are not checked on
> >> >>>>>>>>> checkpoints. This compromises the producer’s at-least-once guarantee.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>> (2) https://issues.apache.org/jira/browse/FLINK-5949:
> >> >>>>>>>>> Do not check Kerberos credentials for non-Kerberos authentications. MapR
> >> >>>>>>>>> users are affected by this, and cannot submit Flink on YARN jobs on a
> >> >>>>>>>>> secured MapR cluster.
> >> >>>>>>>>> Status: PR - https://github.com/apache/flink/pull/3528, one +1 already
> >> >>>>>>>>>
> >> >>>>>>>>> (3) https://issues.apache.org/jira/browse/FLINK-6006:
> >> >>>>>>>>> Kafka Consumer can lose state if queried partition list is incomplete on
> >> >>>>>>>>> restore.
> >> >>>>>>>>> Status: PR - https://github.com/apache/flink/pull/3505, one +1 already
> >> >>>>>>>>>
> >> >>>>>>>>> (4) https://issues.apache.org/jira/browse/FLINK-6025:
> >> >>>>>>>>> KryoSerializer may use the wrong classloader when Kryo’s JavaSerializer
> >> >>>>>>>>> is used.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>> (5) https://issues.apache.org/jira/browse/FLINK-5771:
> >> >>>>>>>>> Fix multi-char delimiters in Batch InputFormats.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>> (6) https://issues.apache.org/jira/browse/FLINK-5934:
> >> >>>>>>>>> Set the Scheduler in the ExecutionGraph via its constructor. This fixes
> >> >>>>>>>>> a bug that causes HA recovery to fail.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> For Flink 1.1.5:
> >> >>>>>>>>>
> >> >>>>>>>>> (1) https://issues.apache.org/jira/browse/FLINK-5701:
> >> >>>>>>>>> Async exceptions in the FlinkKafkaProducer are not checked on
> >> >>>>>>>>> checkpoints. This compromises the producer’s at-least-once guarantee.
> >> >>>>>>>>> Status: This is already merged for 1.2.1. I would personally like to
> >> >>>>>>>>> backport the fix for this to 1.1.5 also.
> >> >>>>>>>>>
> >> >>>>>>>>> (2) https://issues.apache.org/jira/browse/FLINK-6006:
> >> >>>>>>>>> Kafka Consumer can lose state if queried partition list is incomplete on
> >> >>>>>>>>> restore.
> >> >>>>>>>>> Status: PR - https://github.com/apache/flink/pull/3507, one +1 already
> >> >>>>>>>>>
> >> >>>>>>>>> (3) https://issues.apache.org/jira/browse/FLINK-6025:
> >> >>>>>>>>> KryoSerializer may use the wrong classloader when Kryo’s JavaSerializer
> >> >>>>>>>>> is used.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>> (4) https://issues.apache.org/jira/browse/FLINK-5771:
> >> >>>>>>>>> Fix multi-char delimiters in Batch InputFormats.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>> (5) https://issues.apache.org/jira/browse/FLINK-5934:
> >> >>>>>>>>> Set the Scheduler in the ExecutionGraph via its constructor. This fixes
> >> >>>>>>>>> a bug that causes HA recovery to fail.
> >> >>>>>>>>> Status: merged
> >> >>>>>>>>>
> >> >>>>>>>>> (6) https://issues.apache.org/jira/browse/FLINK-5048:
> >> >>>>>>>>> Kafka Consumer (0.9/0.10) threading model leads problematic cancellation
> >> >>>>>>>>> behavior.
> >> >>>>>>>>> Status: This fix was already released in 1.2.0, but never made it into
> >> >>>>>>>>> the 1.1.x bugfixes. Do we want to backport this also for 1.1.5?
> >> >>>>>>>>>
> >> >>>>>>>>>
> >> >>>>>>>>> What do you think? From the list so far, we pretty much already have
> >> >>>>>>>>> everything in, so I think it would be nice to aim for RCs by the end of
> >> >>>>>>>>> this week.
> >> >>>>>>>>> Since both bugfix releases cover almost the same list of issues, I think
> >> >>>>>>>>> it shouldn’t be too hard for us to kick off both bugfix releases around
> >> >>>>>>>>> the same time.
> >> >>>>>>>>>
> >> >>>>>>>>> Also FYI, here are the lists of JIRA tickets tagged with “1.2.1” /
> >> >>>>>>>>> “1.1.5” as the Fix Version that are still open.
> >> >>>>>>>>> We should probably check if there’s anything on there that we should
> >> >>>>>>>>> block on for the releases:
> >> >>>>>>>>>
> >> >>>>>>>>> For 1.2.1:
> >> >>>>>>>>> https://issues.apache.org/jira/browse/FLINK-5711?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.2.1
> >> >>>>>>>>>
> >> >>>>>>>>> For 1.1.5:
> >> >>>>>>>>> https://issues.apache.org/jira/browse/FLINK-6006?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.1.5
> >> >>>>>>>
> >> >>>>>>
> >> >>>>
> >> >>>
> >> >>
> >> >>
> >>
> >>
>
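
Regarding item (9) in the quoted thread (FLINK-6044, read(…) vs. readFully(…)): the following is a minimal sketch of the general difference, not the actual Flink patch. A single InputStream#read(byte[]) call may return fewer bytes than the buffer holds, while readFully (defined on java.io.DataInputStream / DataInput) keeps reading until the buffer is full or throws EOFException. TrickleStream is a hypothetical helper, invented here only to simulate a source that delivers a few bytes per call.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ReadFullyDemo {

    /** Hypothetical stream that hands out at most 3 bytes per bulk read call. */
    static final class TrickleStream extends InputStream {
        private final InputStream in;

        TrickleStream(byte[] data) {
            this.in = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return in.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            // Deliberately return short reads, as a socket or file stream may do.
            return in.read(b, off, Math.min(len, 3));
        }
    }

    /** One read(...) call: returns how many bytes were actually read. */
    static int plainRead(byte[] data, byte[] target) throws IOException {
        try (InputStream in = new TrickleStream(data)) {
            return in.read(target);
        }
    }

    /** readFully(...): fills the whole buffer or throws EOFException. */
    static void readFully(byte[] data, byte[] target) throws IOException {
        try (DataInputStream in = new DataInputStream(new TrickleStream(data))) {
            in.readFully(target);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {1, 2, 3, 4, 5, 6, 7, 8};
        byte[] partial = new byte[8];
        byte[] full = new byte[8];

        System.out.println("read returned " + plainRead(data, partial) + " of 8 bytes");
        readFully(data, full);
        System.out.println("readFully filled buffer: " + Arrays.equals(data, full));
    }
}
```

Code that ignores the return value of read(…) treats a partially filled buffer as complete, which is the kind of bug the issue title describes.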
