beam-dev mailing list archives

From Kenneth Knowles <k...@apache.org>
Subject Re: [VOTE] Release 2.8.0, release candidate #1
Date Mon, 29 Oct 2018 15:55:42 GMT
I think you should definitely open a cherry-pick PR against a 2.8.x branch.
We must not corrupt Maven Central, so if 2.8.0 has already been published to
users, the fix has to ship as 2.8.1. Ahmet - have we reached that point?

Kenn

On Mon, Oct 29, 2018 at 8:40 AM Ismaël Mejía <iemejia@gmail.com> wrote:

> First, thanks Etienne and Kenn for noting the performance issue. I
> reviewed the discussed PR. It introduced a new ‘@Experimental’ option
> to the Spark runner to change the default source partitioning and
> enable users to control it via a predefined size (a prerequisite for
> Spark’s dynamicAllocation).
>
> This, however, must not be the default behavior. After looking at the
> PR, it seems that things are not as expected and the new behavior is
> now the default. I will provide a PR to fix this quickly. However, the
> question is: should I cherry-pick it so that we do a new RC (since the
> release has already 'passed')?
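
The partitioning choice Ismaël describes can be sketched in a few lines. This is an illustration only, not the Spark runner's code from PR #6181: the function names `target_split_bytes` and `num_splits`, their parameters, and the convention that a size of 0 means "option not set" are all assumptions made for this example. It shows why an opt-in split size should leave the default-parallelism split untouched when the user does not set it.

```python
import math


def target_split_bytes(total_bytes: int, default_parallelism: int,
                       bundle_size: int = 0) -> int:
    """Return the desired size in bytes of each split of a bounded source.

    Hypothetical helper: bundle_size == 0 is assumed to mean "not set".
    """
    if bundle_size > 0:
        # Opt-in behavior: the user fixes the size of every split.
        return bundle_size
    # Default behavior: divide the source evenly across the available
    # parallelism, never going below one byte per split.
    return max(1, math.ceil(total_bytes / default_parallelism))


def num_splits(total_bytes: int, default_parallelism: int,
               bundle_size: int = 0) -> int:
    """Return how many splits the chosen split size produces."""
    size = target_split_bytes(total_bytes, default_parallelism, bundle_size)
    return math.ceil(total_bytes / size)
```

With these assumptions, a 1000-byte source on 4 cores yields 4 splits by default, and 10 splits only when a 100-byte size is requested explicitly. A change that made the fixed size apply unconditionally would alter the split count for every pipeline, which is the kind of default change the thread is concerned about.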
> On Mon, Oct 29, 2018 at 2:51 PM Kenneth Knowles <kenn@apache.org> wrote:
> >
> > I didn't isolate it to a cause and commit, so that is extremely useful to know. To bring some details onto the thread:
> >
> > query 4: a single aggregation in sliding windows
> > query 8: a single join with no other interesting logic
> > query 9 (prefix of query 6*): find the winning bid for each auction
> > query 6: query 9 followed by a single aggregation
> >
> > Kenn
> >
> > * they seem out of order because the original queries were 1-8 and we added 9 later to benchmark the baseline without the aggregation
> >
> > On Mon, Oct 29, 2018 at 3:28 AM Etienne Chauchot <echauchot@apache.org> wrote:
> >>
> >> Oops, I just saw that Kenn already mentioned the perf degradation on the Spark runner around 10/05. Sorry for the repetition.
> >> Nevertheless, IMHO it is still worth checking PR #6181.
> >>
> >> Etienne
> >>
> >> On Monday, 29 October 2018 at 10:42 +0100, Etienne Chauchot wrote:
> >>
> >> Hey,
> >> I would vote -0. Here is the explanation:
> >>
> >> I took a look at the Nexmark dashboards for output size and performance for all the runners in all the modes around the date of the release cut to search for regressions.
> >>
> >> I noted a performance regression on the Spark runner. Running times for Query4, Query6, Query8 and Query9 were multiplied by 2 to 3 around 10/05/18. See https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >> So I searched the commit history of the Spark runner module for what happened around 10/05/18, and I found this commit:
> >>
> >> e4a1ccbaa10808d88c6ad2a687fe9f6d52392d90: Merge pull request #6181: [BEAM-4783] Add bundleSize for splitting BoundedSources
> >>
> >> I don't know if it should be considered a blocker, but we should definitely take another look at pull request #6181, which seems to change the way we split on the Spark runner.
> >>
> >> Best
> >> Etienne
> >>
> >>
> >> On Friday, 26 October 2018 at 18:20 +0200, Maximilian Michels wrote:
> >>
> >> +1 (binding)
> >>
> >> On 26.10.18 17:45, Kenneth Knowles wrote:
> >>
> >> Nice. Thanks.
> >>
> >> +1
> >>
> >> On Fri, Oct 26, 2018 at 8:44 AM Robert Bradshaw <robertwb@google.com> wrote:
> >>
> >>     Thanks Tim!
> >>
> >>     This was my only hesitation, and sounds like we're in the clear here.
> >>
> >>     +1 (binding)
> >>
> >>     On Fri, Oct 26, 2018 at 5:05 PM Tim Robertson <timrobertson100@gmail.com> wrote:
> >>      >
> >>      > A colleague and I tested on 2.7.0 and 2.8.0RC1:
> >>      >
> >>      > 1. Quickstart on Spark/YARN/HDFS (CDH 5.12.0) (commented in spreadsheet)
> >>      > 2. Our Avro to Avro pipelines on Spark/YARN/HDFS (note we backport the un-merged BEAM-5036 fix in our code)
> >>      > 3. Our Avro to Elasticsearch pipelines on Spark/YARN/HDFS
> >>      >
> >>      > Everything worked, and performance was similar on both.
> >>      > We built using maven pointing at https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >>      >
> >>      > Based on this limited testing: +1
> >>      >
> >>      > Thank you to the release managers,
> >>      > Tim
> >>      >
> >>      > On Thu, Oct 25, 2018 at 7:21 PM Tim <timrobertson100@gmail.com> wrote:
> >>      >>
> >>      >> I can do some tests on Spark / YARN tomorrow (CEST timezone). Sorry I’ve just been too busy to assist.
> >>      >>
> >>      >> Tim
> >>      >>
> >>      >> On 25 Oct 2018, at 18:59, Kenneth Knowles <kenn@apache.org> wrote:
> >>      >>
> >>      >> I tried to do a more thorough job on this.
> >>      >>
> >>      >>  - I could not reproduce the slowdown in Query 9. I believe the variance was simply high given the parameters and environment
> >>      >>  - I saw the same slowdown in Query 8 when running as part of the suite, but it vanished when I ran repeatedly on its own, so again it is not good methodology probably
> >>      >>
> >>      >> We do have the dashboard at https://apache-beam-testing.appspot.com/dashboard-admin though no anomaly detection set up AFAIK.
> >>      >>
> >>      >>  - There is no issue easily visible in DirectRunner: https://apache-beam-testing.appspot.com/explore?dashboard=5084698770407424
> >>      >>  - There is a notable degradation in Spark runner on 10/5 for many queries. https://apache-beam-testing.appspot.com/explore?dashboard=5138380291571712
> >>      >>  - Something minor happened for Dataflow around 10/1: https://apache-beam-testing.appspot.com/explore?dashboard=5670405876482048
> >>      >>  - Flink runner seems to have had some fantastic improvements :-) https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
> >>      >>
> >>      >> So if there is a blocker it would really be the Spark runner perf changes. Of course, all these except Dataflow are using local instances so may not be representative of larger scale AFAIK.
> >>      >>
> >>      >> Kenn
> >>      >>
> >>      >> On Wed, Oct 24, 2018 at 9:48 AM Maximilian Michels <mxm@apache.org> wrote:
> >>      >>>
> >>      >>> I've run WordCount using Quickstart with the FlinkRunner (locally and
> >>      >>> against a Flink cluster).
> >>      >>>
> >>      >>> Would give a +1 but waiting what Kenn finds.
> >>      >>>
> >>      >>> -Max
> >>      >>>
> >>      >>> On 23.10.18 07:11, Ahmet Altay wrote:
> >>
> >>      >>> >
> >>      >>> > On Mon, Oct 22, 2018 at 10:06 PM, Kenneth Knowles <kenn@apache.org> wrote:
> >>      >>> >
> >>      >>> >     You two did so much verification I had a hard time finding something
> >>      >>> >     where my help was meaningful! :-)
> >>      >>> >
> >>      >>> >     I did run the Nexmark suite on the DirectRunner against 2.7.0 and
> >>      >>> >     2.8.0 following
> >>      >>> >     <https://beam.apache.org/documentation/sdks/java/nexmark/#running-smoke-suite-on-the-directrunner-local>.
> >>      >>> >
> >>      >>> >     It is admittedly a very silly test - the instructions leave
> >>      >>> >     immutability enforcement on, etc. But it does appear that there is a
> >>      >>> >     30% degradation in query 8 and 15% in query 9. These are the pure
> >>      >>> >     Java tests, not the SQL variants. The rest of the queries are close
> >>      >>> >     enough that differences are not meaningful.
> >>      >>> >
> >>      >>> > (It would be a good improvement for us to have alerts on daily
> >>      >>> > benchmarks if we do not have such a concept already.)
> >>      >>> >
> >>      >>> >     I would ask a little more time to see what is going on here - is it
> >>      >>> >     a real performance issue or an artifact of how the tests are
> >>      >>> >     invoked, or ...?
> >>      >>> >
> >>      >>> > Thank you! Much appreciated. Please let us know when you are done with
> >>      >>> > your investigation.
> >>      >>> >
> >>      >>> >     Kenn
> >>      >>> >
> >>      >>> >     On Mon, Oct 22, 2018 at 6:20 PM Ahmet Altay <altay@google.com> wrote:
> >>      >>> >
> >>      >>> >         Hi all,
> >>      >>> >
> >>      >>> >         Did you have a chance to review this RC? Between me and Robert
> >>      >>> >         we ran a significant chunk of the validations. Let me know if
> >>      >>> >         you have any questions.
> >>      >>> >
> >>      >>> >         Ahmet
> >>      >>> >
> >>      >>> >         On Thu, Oct 18, 2018 at 5:26 PM, Ahmet Altay <altay@google.com> wrote:
> >>      >>> >
> >>      >>> >             Hi everyone,
> >>      >>> >
> >>      >>> >             Please review and vote on the release candidate #1 for the
> >>      >>> >             version 2.8.0, as follows:
> >>      >>> >             [ ] +1, Approve the release
> >>      >>> >             [ ] -1, Do not approve the release (please provide specific
> >>      >>> >             comments)
> >>      >>> >
> >>      >>> >             The complete staging area is available for your review,
> >>      >>> >             which includes:
> >>      >>> >             * JIRA release notes [1],
> >>      >>> >             * the official Apache source release to be deployed to
> >>      >>> >             dist.apache.org [2], which is
> >>      >>> >             signed with the key with fingerprint 6096FA00 [3],
> >>      >>> >             * all artifacts to be deployed to the Maven Central
> >>      >>> >             Repository [4],
> >>      >>> >             * source code tag "v2.8.0-RC1" [5],
> >>      >>> >             * website pull request listing the release and publishing
> >>      >>> >             the API reference manual [6].
> >>      >>> >             * Python artifacts are deployed along with the source
> >>      >>> >             release to the dist.apache.org [2].
> >>      >>> >             * Validation sheet with a tab for 2.8.0 release to help with
> >>      >>> >             validation [7].
> >>      >>> >
> >>      >>> >             The vote will be open for at least 72 hours. It is adopted
> >>      >>> >             by majority approval, with at least 3 PMC affirmative votes.
> >>      >>> >
> >>      >>> >             Thanks,
> >>      >>> >             Ahmet
> >>      >>> >
> >>      >>> >             [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12319527&version=12343985
> >>      >>> >             [2] https://dist.apache.org/repos/dist/dev/beam/2.8.0
> >>      >>> >             [3] https://dist.apache.org/repos/dist/dev/beam/KEYS
> >>      >>> >             [4] https://repository.apache.org/content/repositories/orgapachebeam-1049/
> >>      >>> >             [5] https://github.com/apache/beam/tree/v2.8.0-RC1
> >>      >>> >             [6] https://github.com/apache/beam-site/pull/583 and https://github.com/apache/beam/pull/6745
> >>      >>> >             [7] https://docs.google.com/spreadsheets/d/1qk-N5vjXvbcEk68GjbkSZTR8AGqyNUM-oLFo_ZXBpJw/edit#gid=1854712816
