From: Michael Armbrust <michael@databricks.com>
Date: Wed, 3 May 2017 16:45:23 -0700
Subject: Re: [VOTE] Apache Spark 2.2.0 (RC1)
To: Nick Pentreath <nick.pentreath@gmail.com>
Cc: dev@spark.apache.org

I'm going to -1 this given the number of small bug fixes that have gone
into the release branch. I'll follow with another RC shortly.

On Tue, May 2, 2017 at 7:35 AM, Nick Pentreath wrote:

I won't +1, given that it seems certain there will be another RC and
that the ML QA blocker issues are still outstanding.

But a clean build and test run for the JVM and Python tests LGTM on
CentOS Linux 7.2.1511 with OpenJDK 1.8.0_111.

On Mon, 1 May 2017 at 22:42, Frank Austin Nothaft wrote:

Hi Ryan,

IMO, the problem is that the Spark Avro version conflicts with the
Parquet Avro version. As discussed upthread, I don't think there's a way
to *reliably* make sure that Avro 1.8 is on the classpath first while
using spark-submit. Relocating Avro in our project wouldn't solve the
problem, because the NoSuchMethodError is thrown from the internals of
ParquetAvroOutputFormat, not from code in our project.

Regards,

Frank Austin Nothaft
fnothaft@berkeley.edu
fnothaft@eecs.berkeley.edu
202-340-0466

On May 1, 2017, at 12:33 PM, Ryan Blue <rblue@netflix.com> wrote:

Michael, I think that the problem is with your classpath.

Spark has a dependency on Avro 1.7.7, which can't be changed. Your
project is what pulls in parquet-avro and, transitively, Avro 1.8. Spark
has no runtime dependency on Avro 1.8. It is understandably annoying
that using the same version of Parquet for your parquet-avro dependency
is what causes your project to depend on Avro 1.8, but Spark's
dependencies aren't a problem because its Parquet dependency doesn't
bring in Avro.

There are a few ways around this:
1. Make sure Avro 1.8 is found on the classpath first.
2. Shade Avro 1.8 in your project (assuming Avro classes aren't shared).
3. Use parquet-avro 1.8.1 in your project, which I think should work
   with Parquet 1.8.2 and avoid the Avro change.

The work-around in Spark is for tests, which do use parquet-avro. We can
look at a Parquet 1.8.3 that avoids this issue, but I think this is
reasonable for the 2.2.0 release.

rb

On Mon, May 1, 2017 at 12:08 PM, Michael Heuer <heuermh@gmail.com> wrote:

Please excuse me if I'm misunderstanding -- the problem is not with our
library or our classpath.

There is a conflict within Spark itself, in that Parquet 1.8.2 expects
to find Avro 1.8.0 on the runtime classpath and sees 1.7.7 instead.
Spark already has to work around this for its unit tests to pass.

On Mon, May 1, 2017 at 2:00 PM, Ryan Blue wrote:

Thanks for the extra context, Frank. I agree that it sounds like your
problem comes from the conflict between your jars and what comes with
Spark. It's the same concern that makes everyone shudder when anything
has a public dependency on Jackson. :)

What we usually do to get around situations like this is to relocate the
problem library inside the shaded jar. That way, Spark uses its version
of Avro and your classes use a different version of Avro. This works if
you don't need to share classes between the two. Would that work for
your situation?

rb

On Mon, May 1, 2017 at 11:55 AM, Koert Kuipers <koert@tresata.com> wrote:

Sounds like you are running into the fact that you cannot really put
your classes before Spark's on the classpath? Spark's switches to
support this never really worked for me either.

Inability to control the classpath + inconsistent jars => trouble?
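[For readers following along: Ryan's option 2 upthread (shading) can be
sketched with the Maven Shade plugin. This is only an illustrative
fragment, not a configuration taken from anyone's actual build; in
particular, the `shadedPattern` prefix is an assumed placeholder.]

```xml
<!-- build/plugins section of the downstream project's pom.xml -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Rewrite our Avro 1.8 classes (and our references to
                 them) under a private prefix so they cannot collide
                 with the Avro 1.7.7 that ships with Spark. -->
            <pattern>org.apache.avro</pattern>
            <shadedPattern>org.example.shaded.org.apache.avro</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

[Note the limitation Frank raises upthread: relocation only helps when
the Avro calls originate from your own (rewritten) classes. It cannot
fix a NoSuchMethodError thrown from inside Spark-provided jars, which
still resolve against the unrelocated Avro 1.7.7 on Spark's classpath.]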
On Mon, May 1, 2017 at 2:36 PM, Frank Austin Nothaft wrote:

Hi Ryan,

We do set Avro to 1.8 in our downstream project. We also set Spark as a
provided dependency and build an überjar. We run via spark-submit, which
builds the classpath with our überjar and all of the Spark dependencies.
This leads to Avro 1.7.7 getting picked off of the classpath at runtime,
which causes the NoSuchMethodError to occur.

Regards,

Frank

On May 1, 2017, at 11:31 AM, Ryan Blue wrote:

Frank,

The issue you're running into is caused by using parquet-avro with Avro
1.7. Can't your downstream project set the Avro dependency to 1.8?
Spark can't update Avro because it is a breaking change that would force
users to rebuild specific Avro classes in some cases. But you should be
free to use Avro 1.8 to avoid the problem.

On Mon, May 1, 2017 at 11:08 AM, Frank Austin Nothaft wrote:

Hi Ryan et al,

The issue we've seen using a build of the Spark 2.2.0 branch from a
downstream project is that parquet-avro uses one of the new Avro 1.8.0
methods, and you get a NoSuchMethodError since Spark pins Avro 1.7.7 as
a dependency. My colleague Michael (who posted earlier on this thread)
documented this in SPARK-19697. I know that Spark has unit tests that
check this compatibility issue, but it looks like there was a recent
change that sets a test-scope dependency on Avro 1.8.0, which masks this
issue in the unit tests. With this error, you can't use
ParquetAvroOutputFormat from an application running on Spark 2.2.0.

Regards,

Frank

On May 1, 2017, at 10:02 AM, Ryan Blue wrote:

I agree with Sean. Spark only pulls in parquet-avro for tests. For
execution, it implements the record materialization APIs in Parquet to
go directly to Spark SQL rows. This doesn't actually leak an Avro 1.8
dependency into Spark as far as I can tell.

rb

On Mon, May 1, 2017 at 8:34 AM, Sean Owen <sowen@cloudera.com> wrote:

See discussion at https://github.com/apache/spark/pull/17163 -- I think
the issue is that fixing this trades one problem for a slightly bigger
one.

On Mon, May 1, 2017 at 4:13 PM, Michael Heuer wrote:

Version 2.2.0 bumps the dependency version for Parquet to 1.8.2 but does
not bump the dependency version for Avro (currently at 1.7.7). Though
perhaps not clear from the issue I reported [0], this means that Spark
is internally inconsistent, in that a call through Parquet (which
depends on Avro 1.8.0 [1]) may throw errors at runtime when it hits Avro
1.7.7 on the classpath. Avro 1.8.0 is not binary compatible with 1.7.7.

[0] - https://issues.apache.org/jira/browse/SPARK-19697
[1] - https://github.com/apache/parquet-mr/blob/apache-parquet-1.8.2/pom.xml#L96

On Sun, Apr 30, 2017 at 3:28 AM, Sean Owen wrote:

I have one more issue that, if it needs to be fixed, needs to be fixed
for 2.2.0.
I'm fixing build warnings for the release and noticed that checkstyle
actually complains that there are some Java methods named in TitleCase,
like `ProcessingTimeTimeout`:

https://github.com/apache/spark/pull/17803/files#r113934080

Easy enough to fix, and it's right; that's not conventional. However, I
wonder if it was done on purpose to match a class name?

I think this is one for @tdas.

On Thu, Apr 27, 2017 at 7:31 PM, Michael Armbrust wrote:

Please vote on releasing the following candidate as Apache Spark
version 2.2.0. The vote is open until Tues, May 2nd, 2017 at 12:00 PST
and passes if a majority of at least 3 +1 PMC votes are cast.

[ ] +1 Release this package as Apache Spark 2.2.0
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.2.0-rc1
(8ccb4a57c82146c1a8f8966c7e64010cf5632cb6).

The list of JIRA tickets resolved can be found with this filter.

The release files, including signatures, digests, etc. can be found at:
http://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc1-bin/

Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1235/

The documentation corresponding to this release can be found at:
http://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc1-docs/

*FAQ*

*How can I help test this release?*

If you are a Spark user, you can help us test this release by taking an
existing Spark workload, running it on this release candidate, and
reporting any regressions.

*What should happen to JIRA tickets still targeting 2.2.0?*

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should be
worked on immediately. Everything else, please retarget to 2.3.0 or
2.2.1.

*But my bug isn't fixed!?*

In order to make timely releases, we will typically not hold the release
unless the bug in question is a regression from 2.1.1.
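[For anyone testing a downstream workload against this RC who hits the
parquet-avro/Avro conflict discussed upthread, Ryan's option 3 (staying
on parquet-avro 1.8.1) might look like the following Maven fragment.
The version numbers come from the thread; the Spark artifact coordinates
are illustrative assumptions, and whether parquet-avro 1.8.1 fully
interoperates with Parquet 1.8.2 is, per Ryan, only "I think should
work".]

```xml
<!-- Downstream project pom.xml: keep parquet-avro at 1.8.1, which does
     not call the new Avro 1.8.0 methods, so the Avro 1.7.7 that Spark
     puts on the runtime classpath is sufficient. -->
<dependencies>
  <dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <version>1.8.1</version>
  </dependency>
  <dependency>
    <!-- Spark is provided by spark-submit at runtime, matching the
         überjar setup Frank describes upthread. -->
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.2.0</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```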
--
Ryan Blue
Software Engineer
Netflix