flink-user mailing list archives

From Yu Li <car...@gmail.com>
Subject Re: [VOTE] Release 1.8.0, release candidate #3
Date Wed, 20 Mar 2019 23:22:47 GMT
-1. I observed a consistent failure of the streaming bucketing end-to-end
test case in two different environments (Linux/macOS), both with the shaded
hadoop-2.8.3 jar file
<https://repository.apache.org/content/repositories/orgapacheflink-1213/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber-2.8.3-1.8.0.jar>
and with the hadoop-2.8.5 dist
<http://archive.apache.org/dist/hadoop/core/hadoop-2.8.5/>, while both
environments pass with hadoop 2.6.5. For more details, please refer to this comment
<https://issues.apache.org/jira/browse/FLINK-11972?focusedCommentId=16797614&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16797614>
in FLINK-11972.

Best Regards,
Yu


On Thu, 21 Mar 2019 at 04:25, jincheng sun <sunjincheng121@gmail.com> wrote:

> Thanks for the quick fix, Aljoscha! The fix for FLINK-11971
> <https://issues.apache.org/jira/browse/FLINK-11971> has been merged.
>
> Cheers,
> Jincheng
>
> Piotr Nowojski <piotr@ververica.com> wrote on Thu, 21 Mar 2019 at 12:29 AM:
>
>> -1 from my side due to a performance regression present in the master
>> branch since Jan 29th.
>>
>> In 10% of JVM forks it was causing a huge performance drop in some of the
>> benchmarks (throughput reduced by up to 30-50%), which could mean that one
>> out of 10 task managers could be affected by it. Today we merged a fix for
>> it [1]. The first benchmark run was promising [2], but we have to wait until
>> tomorrow to make sure that the problem is definitely resolved. If that’s
>> the case, I would recommend including it in 1.8.0, because we really do not
>> know how big a performance regression this issue could be in real-world
>> scenarios.
>>
>> Regarding the second regression from mid February: we have found the
>> responsible commit, and this one is probably just a false positive. Because
>> of the nature of some of the benchmarks, they run with a low number of
>> records (300k). The apparent performance regression was caused by a higher
>> initialisation time. When I temporarily increased the number of records to
>> 2M, the regression was gone. Together with Till and Stefan Richter we
>> discussed the potential impact of this longer initialisation time (in the
>> case of said benchmarks, initialisation time increased from 70ms to 120ms),
>> and we think that it’s not a critical issue and doesn’t have to block the
>> release. Nevertheless, there might be some follow-up work for this.
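
The amortisation argument above can be made concrete with a rough sketch. It assumes a benchmark whose total time is a fixed initialisation cost plus a per-record cost at a constant steady-state rate. The rate of 1,000,000 records per second is a hypothetical figure chosen for illustration and is not taken from the thread; only the 70ms/120ms initialisation times and the 300k/2M record counts come from the message.

```python
def measured_throughput(records: int, init_seconds: float, steady_rate: float) -> float:
    """Records per second as a benchmark would report it, start-up included."""
    return records / (init_seconds + records / steady_rate)


def apparent_drop(records: int, init_before: float, init_after: float,
                  steady_rate: float) -> float:
    """Fractional throughput loss caused only by the longer initialisation."""
    before = measured_throughput(records, init_before, steady_rate)
    after = measured_throughput(records, init_after, steady_rate)
    return 1.0 - after / before


# Hypothetical steady-state rate; the thread does not state the real one.
STEADY_RATE = 1_000_000.0  # records per second

if __name__ == "__main__":
    for records in (300_000, 2_000_000):
        drop = apparent_drop(records, 0.070, 0.120, STEADY_RATE)
        print(f"{records:>9} records: apparent drop {drop:.1%}")
```

Under that assumed rate, the same 50ms of extra initialisation shows up as a drop of roughly 12% with 300k records and roughly 2% with 2M records, which is the shape of the effect described above. The real benchmarks may behave differently.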
>>
>> [1] https://github.com/apache/flink/pull/8020
>> [2] http://codespeed.dak8s.net:8000/timeline/?ben=tumblingWindow&env=2
>>
>> Piotr Nowojski
>>
>> On 20 Mar 2019, at 10:09, Aljoscha Krettek <aljoscha@apache.org> wrote:
>>
>> Thanks Jincheng! It would be very good to fix those but as you said, I
>> would say they are not blockers.
>>
>> On 20. Mar 2019, at 09:47, Kurt Young <ykt836@gmail.com> wrote:
>>
>> +1 (non-binding)
>>
>> Checked items:
>> - checked checksums and GPG files
>> - verified that the source archives do not contain any binaries
>> - checked that all POM files point to the same version
>> - built from source successfully
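
For the checksum item, a minimal sketch of one way to do the check. It assumes the checksum file follows the `sha512sum` layout (hex digest first, then the file name); the function names are illustrative and are not part of any Flink tooling. Signature verification with GPG is a separate step and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large release archives are not loaded at once."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare the artifact's digest with the first token of the checksum file.

    Assumes the layout "<hex digest>  <file name>"; other layouts need
    different parsing.
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(artifact) == expected
```

A call would look like `checksum_matches(Path("release.tgz"), Path("release.tgz.sha512"))`, where both file names are placeholders.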
>>
>> Best,
>> Kurt
>>
>>
>> On Wed, Mar 20, 2019 at 2:12 PM jincheng sun <sunjincheng121@gmail.com>
>> wrote:
>>
>>> Hi Aljoscha & all,
>>>
>>> When I did the `end-to-end` test for RC3 under Mac OS, I found the
>>> following two problems:
>>>
>>> 1. The check on the output of `minikube status` is not robust enough.
>>> The strings returned differ between minikube versions and platforms,
>>> which causes the following misjudgment:
>>> when the `Command: start_kubernetes_if_not_ruunning failed` error
>>> occurs, minikube has actually started successfully. The root cause is
>>> a bug in the `test_kubernetes_embedded_job.sh` script. See
>>> FLINK-11971 <https://issues.apache.org/jira/browse/FLINK-11971> for
>>> details.
>>>
>>> 2. Unlike 1.7.x, 1.8.0 no longer bundles the `hadoop-shaded` JAR in the
>>> dist. As a result, the end-to-end tests fail when Hadoop-related classes
>>> cannot be found, for example: `java.lang.NoClassDefFoundError:
>>> Lorg/apache/hadoop/fs/FileSystem`. So we need to either improve the
>>> end-to-end test script or state explicitly in the README that the
>>> end-to-end tests need `flink-shaded-hadoop2-uber-XXXX.jar` on the
>>> classpath. See
>>> FLINK-11972 <https://issues.apache.org/jira/browse/FLINK-11972> for
>>> details.
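
One way to check up front whether a distribution can resolve the class named in that error is to scan the JARs in a directory for the corresponding class entry. The sketch below is only an illustration: the helper name is made up, and which directory to scan (for example the `lib` directory of an unpacked dist) is an assumption about the local setup.

```python
import zipfile
from pathlib import Path
from typing import List


def jars_containing(class_name: str, directory: Path) -> List[Path]:
    """Return the JARs directly under `directory` that contain `class_name`.

    `class_name` is a dotted Java name such as org.apache.hadoop.fs.FileSystem.
    """
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(directory.glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if entry in archive.namelist():
                    hits.append(jar)
        except zipfile.BadZipFile:
            # Skip files that carry a .jar suffix but are not valid archives.
            continue
    return hits
```

An empty result for `jars_containing("org.apache.hadoop.fs.FileSystem", Path("path/to/flink/lib"))` (placeholder path) would be consistent with the `NoClassDefFoundError` above, although classes can also reach the classpath by other means.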
>>>
>>> I think these are not blockers for release-1.8.0, but it would be
>>> better to include those commits in release-1.8 if we still have
>>> performance-related bugs that should be fixed first.
>>>
>>> What do you think?
>>>
>>> Best,
>>> Jincheng
>>>
>>>
>>> Aljoscha Krettek <aljoscha@apache.org> wrote on Tue, 19 Mar 2019 at 7:58 PM:
>>>
>>>> Hi All,
>>>>
>>>> The release process for Flink 1.8.0 is currently ongoing. Please have a
>>>> look at the thread in case you’re interested in checking your applications
>>>> against this next release of Apache Flink and participating in the process.
>>>>
>>>> Best,
>>>> Aljoscha
>>>>
>>>> Begin forwarded message:
>>>>
>>>> *From: *Aljoscha Krettek <aljoscha@apache.org>
>>>> *Subject: **[VOTE] Release 1.8.0, release candidate #3*
>>>> *Date: *19. March 2019 at 12:52:50 CET
>>>> *To: *dev@flink.apache.org
>>>> *Reply-To: *dev@flink.apache.org
>>>>
>>>> Hi everyone,
>>>> Please review and vote on the release candidate 3 for Flink 1.8.0, as
>>>> follows:
>>>> [ ] +1, Approve the release
>>>> [ ] -1, Do not approve the release (please provide specific comments)
>>>>
>>>>
>>>> The complete staging area is available for your review, which includes:
>>>> * JIRA release notes [1],
>>>> * the official Apache source release and binary convenience releases to
>>>> be deployed to dist.apache.org <http://dist.apache.org/> [2], which
>>>> are signed with the key with fingerprint
>>>> F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
>>>> * all artifacts to be deployed to the Maven Central Repository [4],
>>>> * source code tag "release-1.8.0-rc3" [5],
>>>> * website pull request listing the new release [6],
>>>> * website pull request adding announcement blog post [7].
>>>>
>>>> The vote will be open for at least 72 hours. It is adopted by majority
>>>> approval, with at least 3 PMC affirmative votes.
>>>>
>>>> Thanks,
>>>> Aljoscha
>>>>
>>>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
>>>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc3/
>>>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
>>>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1214
>>>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=b505c0822edd2aed7fa22ed75eca40dca1a9de42
>>>> [6] https://github.com/apache/flink-web/pull/180
>>>> [7] https://github.com/apache/flink-web/pull/179
>>>>
>>>> P.S. The difference to the previous RCs 1 and 2 is very small; you can
>>>> fetch the tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc3" to
>>>> see the difference in commits. It consists of fixes for the issues that led
>>>> to the cancellation of the previous RCs plus smaller fixes. Most
>>>> verification/testing that was carried out should apply as is to this RC.
>>>> Any functional verification that you did on previous RCs should therefore
>>>> easily carry over to this one.