spark-dev mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Working Formula for Hive 0.13?
Date Wed, 30 Jul 2014 23:40:41 GMT
I found SPARK-2706.

Let me attach a tentative patch there - I still face compilation errors.

Cheers


On Mon, Jul 28, 2014 at 5:59 PM, Ted Yu <yuzhihong@gmail.com> wrote:

> bq. Either way it's unclear if there is any reason to use reflection to
> support multiple versions, instead of just upgrading to Hive 0.13.0
>
> In which Spark release would this Hive upgrade take place?
> I agree it is cleaner to upgrade the Hive dependency vs. introducing
> reflection.
>
> Cheers
>
>
> On Mon, Jul 28, 2014 at 5:22 PM, Michael Armbrust <michael@databricks.com>
> wrote:
>
>> A few things:
>>  - When we upgrade to Hive 0.13.0, Patrick will likely republish the
>> hive-exec jar just as we did for 0.12.0
>>  - Since we have to tie into some pretty low-level APIs, it is unsurprising
>> that the code doesn't just compile out of the box against 0.13.0
>>  - ScalaReflection is for determining a Schema from Scala classes, not
>> reflection-based bridge code.  Either way, it's unclear if there is any
>> reason to use reflection to support multiple versions, instead of just
>> upgrading to Hive 0.13.0
>>
>> One question I have: what is the goal of upgrading to Hive 0.13.0?  Is
>> it purely because you are having problems connecting to newer metastores?
>> Are there some features you are hoping for?  This will help me prioritize
>> this effort.
>>
>> Michael
>>
>>
>> On Mon, Jul 28, 2014 at 4:05 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>
>> > I was looking for a class where reflection-related code should reside.
>> >
>> > I found this, but I don't think it is the proper class for bridging
>> > differences between Hive 0.12 and 0.13.1:
>> >
>> >
>> sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala
>> >
>> > Cheers
>> >
>> >
>> > On Mon, Jul 28, 2014 at 3:41 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >
>> > > After manually copying hive 0.13.1 jars to local maven repo, I got the
>> > > following errors when building spark-hive_2.10 module :
>> > >
>> > > [ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala:182: type mismatch;
>> > >  found   : String
>> > >  required: Array[String]
>> > > [ERROR]       val proc: CommandProcessor = CommandProcessorFactory.get(tokens(0), hiveconf)
>> > > [ERROR]    ^
>> > > [ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:60: value getAllPartitionsForPruner is not a member of org.apache.hadoop.hive.ql.metadata.Hive
>> > > [ERROR]         client.getAllPartitionsForPruner(table).toSeq
>> > > [ERROR]                ^
>> > > [ERROR] /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala:267: overloaded method constructor TableDesc with alternatives:
>> > >   (x$1: Class[_ <: org.apache.hadoop.mapred.InputFormat[_, _]],x$2: Class[_],x$3: java.util.Properties)org.apache.hadoop.hive.ql.plan.TableDesc
>> > >   <and>
>> > >   ()org.apache.hadoop.hive.ql.plan.TableDesc
>> > >  cannot be applied to (Class[org.apache.hadoop.hive.serde2.Deserializer], Class[(some other)?0(in value tableDesc)(in value tableDesc)], Class[?0(in value tableDesc)(in value tableDesc)], java.util.Properties)
>> > > [ERROR]   val tableDesc = new TableDesc(
>> > > [ERROR]                   ^
>> > > [WARNING] Class org.antlr.runtime.tree.CommonTree not found - continuing with a stub.
>> > > [WARNING] Class org.antlr.runtime.Token not found - continuing with a stub.
>> > > [WARNING] Class org.antlr.runtime.tree.Tree not found - continuing with a stub.
>> > > [ERROR]
>> > >      while compiling: /homes/xx/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala
>> > >         during phase: typer
>> > >      library version: version 2.10.4
>> > >     compiler version: version 2.10.4
>> > >
>> > > The above shows incompatible changes between 0.12 and 0.13.1.
>> > > For example, the first error corresponds to the following method
>> > > in CommandProcessorFactory:
>> > >   public static CommandProcessor get(String[] cmd, HiveConf conf)
>> > >
>> > > Cheers
>> > >
>> > >
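[Editor's note] The `CommandProcessorFactory.get` signature change described above (`get(String, HiveConf)` in Hive 0.12 vs. `get(String[], HiveConf)` in 0.13) is the kind of gap the reflection approach discussed in this thread could paper over. Below is a minimal, self-contained sketch of that dispatch pattern; `FakeFactory` is a hypothetical stand-in for Hive's factory so the snippet runs without any Hive jars, and only the reflective overload probing is the point:

```scala
// Sketch of a reflection bridge for an API whose overload changed between
// versions, as CommandProcessorFactory.get did between Hive 0.12 and 0.13.
// FakeFactory is a hypothetical stand-in so this runs on its own.
object FakeFactory {
  // Pretend this is the new-style (0.13-like) overload taking an array.
  def get(cmd: Array[String], conf: String): String =
    s"processor for '${cmd.mkString(" ")}' with conf $conf"
}

object ProcessorBridge {
  def get(token: String, conf: String): String = {
    val clazz = FakeFactory.getClass
    // Probe for the new-style overload; fall back to the old one if absent.
    val newStyle =
      try Some(clazz.getMethod("get", classOf[Array[String]], classOf[String]))
      catch { case _: NoSuchMethodException => None }
    newStyle match {
      case Some(m) =>
        // New-style overload present: wrap the single token in an array.
        m.invoke(FakeFactory, Array(token), conf).asInstanceOf[String]
      case None =>
        // Old-style single-String overload.
        val m = clazz.getMethod("get", classOf[String], classOf[String])
        m.invoke(FakeFactory, token, conf).asInstanceOf[String]
    }
  }
}

println(ProcessorBridge.get("set", "hiveconf"))
```

Callers go through the bridge instead of the factory, so the same binary works against either version; this trades compile-time checking for link-time flexibility, which is exactly the trade-off debated later in the thread.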
>> > > On Mon, Jul 28, 2014 at 1:32 PM, Steve Nunez <snunez@hortonworks.com>
>> > > wrote:
>> > >
>> > >> So, do we have a short-term fix until Hive 0.14 comes out? Perhaps
>> > >> adding the hive-exec jar to the spark-project repo? It doesn't look
>> > >> like there's a release date schedule for 0.14.
>> > >>
>> > >>
>> > >>
>> > >> On 7/28/14, 10:50, "Cheng Lian" <lian.cs.zju@gmail.com> wrote:
>> > >>
>> > >> >Exactly, forgot to mention Hulu team also made changes to cope with
>> > >> >those incompatibility issues, but they said that's relatively easy
>> > >> >once the re-packaging work is done.
>> > >> >
>> > >> >
>> > >> >On Tue, Jul 29, 2014 at 1:20 AM, Patrick Wendell <pwendell@gmail.com> wrote:
>> > >> >
>> > >> >> I've heard from Cloudera that there were Hive internal changes
>> > >> >> between 0.12 and 0.13 that required code re-writing. Over time it
>> > >> >> might be possible for us to integrate with Hive using APIs that are
>> > >> >> more stable (this is the domain of Michael/Cheng/Yin more than me!).
>> > >> >> It would be interesting to see what the Hulu folks did.
>> > >> >>
>> > >> >> - Patrick
>> > >> >>
>> > >> >> On Mon, Jul 28, 2014 at 10:16 AM, Cheng Lian <lian.cs.zju@gmail.com> wrote:
>> > >> >> > AFAIK, according to a recent talk, the Hulu team in China has
>> > >> >> > built Spark SQL against Hive 0.13 (or 0.13.1?) successfully.
>> > >> >> > Basically they also re-packaged Hive 0.13 as the Spark team did.
>> > >> >> > The slides of the talk haven't been released yet though.
>> > >> >> >
>> > >> >> >
>> > >> >> > On Tue, Jul 29, 2014 at 1:01 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > >> >> >
>> > >> >> >> Owen helped me find this:
>> > >> >> >> https://issues.apache.org/jira/browse/HIVE-7423
>> > >> >> >>
>> > >> >> >> I guess this means that for Hive 0.14, Spark should be able to
>> > >> >> >> directly pull in hive-exec-core.jar
>> > >> >> >>
>> > >> >> >> Cheers
>> > >> >> >>
>> > >> >> >>
>> > >> >> >> On Mon, Jul 28, 2014 at 9:55 AM, Patrick Wendell <pwendell@gmail.com> wrote:
>> > >> >> >>
>> > >> >> >> > It would be great if the Hive team can fix that issue. If not,
>> > >> >> >> > we'll have to continue forking our own version of Hive to
>> > >> >> >> > change the way it publishes artifacts.
>> > >> >> >> >
>> > >> >> >> > - Patrick
>> > >> >> >> >
>> > >> >> >> > On Mon, Jul 28, 2014 at 9:34 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > >> >> >> > > Talked with Owen offline. He confirmed that as of 0.13,
>> > >> >> >> > > hive-exec is still an uber jar.
>> > >> >> >> > >
>> > >> >> >> > > Right now I am facing the following error building against
>> > >> >> >> > > Hive 0.13.1:
>> > >> >> >> > >
>> > >> >> >> > > [ERROR] Failed to execute goal on project spark-hive_2.10:
>> > >> >> >> > > Could not resolve dependencies for project
>> > >> >> >> > > org.apache.spark:spark-hive_2.10:jar:1.1.0-SNAPSHOT: The
>> > >> >> >> > > following artifacts could not be resolved:
>> > >> >> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1,
>> > >> >> >> > > org.spark-project.hive:hive-exec:jar:0.13.1,
>> > >> >> >> > > org.spark-project.hive:hive-serde:jar:0.13.1: Failure to find
>> > >> >> >> > > org.spark-project.hive:hive-metastore:jar:0.13.1 in
>> > >> >> >> > > http://repo.maven.apache.org/maven2 was cached in the local
>> > >> >> >> > > repository, resolution will not be reattempted until the
>> > >> >> >> > > update interval of maven-repo has elapsed or updates are
>> > >> >> >> > > forced -> [Help 1]
>> > >> >> >> > >
>> > >> >> >> > > Some hint would be appreciated.
>> > >> >> >> > >
>> > >> >> >> > > Cheers
>> > >> >> >> > >
>> > >> >> >> > >
>> > >> >> >> > > On Mon, Jul 28, 2014 at 9:15 AM, Sean Owen <sowen@cloudera.com> wrote:
>> > >> >> >> > >
>> > >> >> >> > >> Yes, it is published. As of previous versions, at least,
>> > >> >> >> > >> hive-exec included all of its dependencies *in its artifact*,
>> > >> >> >> > >> making it unusable as-is because it contained copies of
>> > >> >> >> > >> dependencies that clash with versions present in other
>> > >> >> >> > >> artifacts, and can't be managed with Maven mechanisms.
>> > >> >> >> > >>
>> > >> >> >> > >> I am not sure why hive-exec was not published normally, with
>> > >> >> >> > >> just its own classes. That's why it was copied, into an
>> > >> >> >> > >> artifact with just hive-exec code.
>> > >> >> >> > >>
>> > >> >> >> > >> You could do the same thing for hive-exec 0.13.1.
>> > >> >> >> > >> Or maybe someone knows that it's published more 'normally' now.
>> > >> >> >> > >> I don't think hive-metastore is related to this question?
>> > >> >> >> > >>
>> > >> >> >> > >> I am no expert on the Hive artifacts, just remembering what
>> > >> >> >> > >> the issue was initially in case it helps you get to a
>> > >> >> >> > >> similar solution.
>> > >> >> >> > >>
>> > >> >> >> > >> On Mon, Jul 28, 2014 at 4:47 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > >> >> >> > >> > hive-exec (as of 0.13.1) is published here:
>> > >> >> >> > >> > http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-exec%7C0.13.1%7Cjar
>> > >> >> >> > >> >
>> > >> >> >> > >> > Should a JIRA be opened so that the dependency on
>> > >> >> >> > >> > hive-metastore can be replaced by a dependency on hive-exec?
>> > >> >> >> > >> >
>> > >> >> >> > >> > Cheers
>> > >> >> >> > >> >
>> > >> >> >> > >> >
>> > >> >> >> > >> > On Mon, Jul 28, 2014 at 8:26 AM, Sean Owen <sowen@cloudera.com> wrote:
>> > >> >> >> > >> >
>> > >> >> >> > >> >> The reason for org.spark-project.hive is that Spark relies
>> > >> >> >> > >> >> on hive-exec, but the Hive project does not publish this
>> > >> >> >> > >> >> artifact by itself, only with all its dependencies as an
>> > >> >> >> > >> >> uber jar. Maybe that's been improved. If so, you need to
>> > >> >> >> > >> >> point at the new hive-exec and perhaps sort out its
>> > >> >> >> > >> >> dependencies manually in your build.
>> > >> >> >> > >> >>
>> > >> >> >> > >> >> On Mon, Jul 28, 2014 at 4:01 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>> > >> >> >> > >> >> > I found 0.13.1 artifacts in Maven:
>> > >> >> >> > >> >> > http://search.maven.org/#artifactdetails%7Corg.apache.hive%7Chive-metastore%7C0.13.1%7Cjar
>> > >> >> >> > >> >> >
>> > >> >> >> > >> >> > However, Spark uses groupId of org.spark-project.hive,
>> > >> >> >> > >> >> > not org.apache.hive.
>> > >> >> >> > >> >> >
>> > >> >> >> > >> >> > Can someone tell me how it is supposed to work?
>> > >> >> >> > >> >> >
>> > >> >> >> > >> >> > Cheers
>> > >> >> >> > >> >> >
>> > >> >> >> > >> >> >
>> > >> >> >> > >> >> > On Mon, Jul 28, 2014 at 7:44 AM, Steve Nunez <snunez@hortonworks.com> wrote:
>> > >> >> >> > >> >> >
>> > >> >> >> > >> >> >> I saw a note earlier, perhaps on the user list, that at
>> > >> >> >> > >> >> >> least one person is using Hive 0.13. Anyone got a
>> > >> >> >> > >> >> >> working build configuration for this version of Hive?
>> > >> >> >> > >> >> >>
>> > >> >> >> > >> >> >> Regards,
>> > >> >> >> > >> >> >> - Steve
>> > >> >> >> > >> >> >>
>> > >> >> >> > >> >> >>
>> > >> >> >> > >> >> >>
>> > >> >> >> > >> >> >> --
>> > >> >> >> > >> >> >> CONFIDENTIALITY NOTICE
>> > >> >> >> > >> >> >> NOTICE: This message is intended for the use of the
>> > >> >> >> > >> >> >> individual or entity to which it is addressed and may
>> > >> >> >> > >> >> >> contain information that is confidential, privileged
>> > >> >> >> > >> >> >> and exempt from disclosure under applicable law. If the
>> > >> >> >> > >> >> >> reader of this message is not the intended recipient,
>> > >> >> >> > >> >> >> you are hereby notified that any printing, copying,
>> > >> >> >> > >> >> >> dissemination, distribution, disclosure or forwarding
>> > >> >> >> > >> >> >> of this communication is strictly prohibited. If you
>> > >> >> >> > >> >> >> have received this communication in error, please
>> > >> >> >> > >> >> >> contact the sender immediately and delete it from your
>> > >> >> >> > >> >> >> system. Thank You.
>> > >> >> >> > >> >> >>
>> > >> >> >> > >> >>
>> > >> >> >> > >>
>> > >> >> >> >
>> > >> >> >>
>> > >> >>
>> > >>
>> > >>
>> > >>
>> > >
>> > >
>> >
>>
>
>
