flink-dev mailing list archives

From Ufuk Celebi <...@apache.org>
Subject Re: [DISCUSS] Releasing Flink 1.1.4
Date Fri, 28 Oct 2016 12:27:26 GMT
Thanks for all your feedback.

If there are no objections, I would like to stick to the issues
mentioned in this thread and create RC1 as soon as they are all
addressed. That will probably not happen this week, but it looks
good for next week.

DONE
=====
- FLINK-4619: Answer client if savepoint restore fails
- FLINK-4715: Safety net for stuck task cancellation
- FLINK-4510: Always create CheckpointCoordinator
- FLINK-4894: Don't block on buffer request after broadcast event
- FLINK-4298: Add proper repository for Closure dependencies
- FLINK-4218: Do not fail checkpoints when state size cannot be determined
- FLINK-3347: TaskManager (or its ActorSystem) need to restart in case
they notice quarantine
- FLINK-4875: Use correct operator name
- FLINK-4913: Include user jars in system class loader

PENDING REVIEW
===============
- FLINK-4445: Add option to ignore unmatched state when restoring from
savepoint => https://github.com/apache/flink/pull/2713
- FLINK-4932: Don't let ExecutionGraph fail when in state Restarting
=> https://github.com/apache/flink/pull/2711
- FLINK-4933: ExecutionGraph.scheduleOrUpdateConsumers can fail the
ExecutionGraph => https://github.com/apache/flink/pull/2701

OPEN
=====
- FLINK-4904: Add a limit for how much data may be spilled in
checkpoint alignments => fix pending
- FLINK-4910: Introduce safety net for closing file system streams =>
@Stephan, Stefan: What's the conclusion of your discussion on whether
to backport this?


On Wed, Oct 26, 2016 at 9:57 PM, dan bress <danbress@gmail.com> wrote:
> +1 for this release,
> also +1 to Chesnay's suggestion to include this: [FLINK-4875] [metrics]
> Use correct operator name
>
> Dan
>
> On Wed, Oct 26, 2016 at 5:06 AM Till Rohrmann <trohrmann@apache.org> wrote:
>
>> I'll work on FLINK-3347. Additionally I would like to get in
>>
>> - https://issues.apache.org/jira/browse/FLINK-4932: Don't let
>> ExecutionGraph fail when in state Restarting
>> - https://issues.apache.org/jira/browse/FLINK-4933:
>> ExecutionGraph.scheduleOrUpdateConsumers
>> can fail the ExecutionGraph
>>
>> Cheers,
>> Till
>>
>> On Wed, Oct 26, 2016 at 1:02 PM, Stephan Ewen <sewen@apache.org> wrote:
>>
>> > Concerning backporting the "I/O streams safety net" - we need to make sure
>> > that this does not change any behavior that users may implicitly expect.
>> >
>> >
>> > On Wed, Oct 26, 2016 at 11:21 AM, Maximilian Michels <mxm@apache.org>
>> > wrote:
>> >
>> > > +1 for a 1.1.4 release
>> > >
>> > > We could backport putting user jars into the system class loader for
>> > > per-job Yarn clusters: https://github.com/apache/flink/pull/2692
>> > > Arguably, this is somewhat of a new feature, but it gets rid of duplicate
>> > > class loading issues users experienced in practice.
>> > >
>> > > We already have the following commits on the release-1.1 branch:
>> > >
>> > > 05a5f46 [FLINK-4862] fix Timer register in ContinuousEventTimeTrigger
>> > > 5731672 [FLINK-4581] [table] Fix Table API throwing "No suitable driver
>> > > found for jdbc:calcite"
>> > > 9c87f92 [FLINK-4586] [core] Broken AverageAccumulator
>> > > 210230c [FLINK-4829] snapshot accumulators on a best-effort basis
>> > > c1d6b24 [FLINK-4829] protect user accumulators against concurrent updates
>> > > fe464b4 [FLINK-4709] [core] Fix resource leak in InputStreamFSInputWrapper
>> > > 9f72698 [FLINK-4108] [scala] Respect ResultTypeQueryable for InputFormats.
>> > > 9591d50 [FLINK-4506] [DataSet] Fix documentation of CsvOutputFormat about
>> > > incorrect default of allowNullValues
>> > > c9433bf [FLINK-3706] Fix YARN test instability
>> > > 2203f74 [FLINK-4778] [docs] Fix WordCount parameters in CLI examples.
>> > >
>> > > -Max
>> > >
>> > >
>> > > On Wed, Oct 26, 2016 at 7:05 AM, Jean-Baptiste Onofré <jb@nanthrax.net>
>> > > wrote:
>> > > > +1
>> > > >
>> > > > Looking forward to this release!
>> > > >
>> > > > Regards
>> > > > JB
>> > > >
>> > > >
>> > > > On Oct 25, 2016, at 14:43, Robert Metzger <rmetzger@apache.org>
>> > > > wrote:
>> > > >>+1 for a bugfix release soon.
>> > > >>
>> > > >>On Tue, Oct 25, 2016 at 10:53 AM, Stephan Ewen <sewen@apache.org>
>> > > >>wrote:
>> > > >>
>> > > >>> Thanks for starting this, Ufuk.
>> > > >>>
>> > > >>> I would like to add the following issues to 1.1.4:
>> > > >>>
>> > > >>> Build errors due to Storm dependencies *(fix pending)*
>> > > >>>     - [FLINK-4298] [storm compatibility] Add proper repository for
>> > > >>>       Closure dependencies.
>> > > >>>
>> > > >>> Stability on S3 considering eventual consistency *(fix pending)*
>> > > >>>     - [FLINK-4218] [checkpoints] Do not fail checkpoints when state
>> > > >>>       size cannot be determined
>> > > >>>
>> > > >>> Avoiding Zombie TaskManagers *(still needs to be done)*
>> > > >>>     - [FLINK-3347] [akka] TaskManager (or its ActorSystem) need to
>> > > >>>       restart in case they notice quarantine
>> > > >>>
>> > > >>> Adding a limit to the amount of data spilled during checkpoint
>> > > >>> alignments *(fix is work in progress)*
>> > > >>>     - [FLINK-4904] [checkpoints] Add a limit for how much data may be
>> > > >>>       spilled in checkpoint alignments
>> > > >>>
>> > > >>>
>> > > >>> I can push the first two fixes to the 1.1.4 branch in a bit, the
>> > > >>> fourth one later today.
>> > > >>> The third one (akka) is still pending.
>> > > >>>
>> > > >>> Best,
>> > > >>> Stephan
>> > > >>>
>> > > >>>
>> > > >>>
>> > > >>> On Mon, Oct 24, 2016 at 3:32 PM, Ufuk Celebi <uce@apache.org>
>> > > >>> wrote:
>> > > >>>
>> > > >>> > Hey all,
>> > > >>> >
>> > > >>> > I would like to start the discussion for kicking off the next bug
>> > > >>> > fix release, Flink 1.1.4. What do you think about aiming for an RC
>> > > >>> > by the end of this week?
>> > > >>> >
>> > > >>> > Users reported some instabilities/inconveniences that would be good
>> > > >>> > to fix.
>> > > >>> >
>> > > >>> > Personally, I would like to backport the following fixes:
>> > > >>> >
>> > > >>> > (1) https://issues.apache.org/jira/browse/FLINK-4619: Answer client
>> > > >>> > if savepoint restore fails (Already merged for master, needs minimal
>> > > >>> > adjustment for 1.1)
>> > > >>> > (2) https://issues.apache.org/jira/browse/FLINK-4715: Safety net for
>> > > >>> > stuck task cancellation (Already reviewed for master, waiting for
>> > > >>> > tests of the backport to finish)
>> > > >>> > (3) https://issues.apache.org/jira/browse/FLINK-4510: Always create
>> > > >>> > CheckpointCoordinator (Already merged for master, needs minimal
>> > > >>> > adjustments for 1.1)
>> > > >>> >
>> > > >>> > Furthermore, I would like to address the following:
>> > > >>> >
>> > > >>> > (4) https://issues.apache.org/jira/browse/FLINK-4445: Add option to
>> > > >>> > ignore unmatched state when restoring from savepoint
>> > > >>> > (5) https://issues.apache.org/jira/browse/FLINK-4894: Don't block on
>> > > >>> > buffer request after broadcast event
>> > > >>> >
>> > > >>> > Strictly speaking, (4) is not a bug fix. But given that it would
>> > > >>> > only add an optional flag to savepoint restoring and should have
>> > > >>> > been addressed for 1.1.0 already, I would like to get it in.
>> > > >>> >
>> > > >>>
>> > >
>> >
>>
