hive-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Creating a Hive table through Spark and potential locking issue (a bug)
Date Thu, 09 Jun 2016 00:25:22 GMT
Hi,

Just to clarify, I use Hive with the Spark execution engine (default), i.e.
Hive on Spark, as we discussed and observed.

Now, with regard to Spark (as an application, NOT as an execution engine)
creating the table in Hive and populating it, I don't think Spark itself
does any transactional enforcement. This means that Spark assumes no
concurrency on Hive tables. That is probably the same reason why
updates/deletes to Hive ORC transactional tables through Spark fail.

HTH

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 9 June 2016 at 00:46, Eugene Koifman <ekoifman@hortonworks.com> wrote:

> Locks in Hive are acquired by the query compiler and should be independent
> of the execution engine.
> Having said that, I’ve not tried this on Spark, so my answer is only
> accurate with Hive.
>
> Eugene
>
>
> From: Michael Segel <msegel_hadoop@hotmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Wednesday, June 8, 2016 at 3:42 PM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Cc: David Newberger <david.newberger@wandcorp.com>, "user @spark" <
> user@spark.apache.org>
> Subject: Re: Creating a Hive table through Spark and potential locking
> issue (a bug)
>
>
> On Jun 8, 2016, at 3:35 PM, Eugene Koifman <ekoifman@hortonworks.com>
> wrote:
>
> If you split “create table test.dummy as select * from oraclehadoop.dummy;”
> into a create table statement followed by insert into test.dummy select…,
> you should see the behavior you expect with Hive.
> The drop statement will block while the insert is running.
>
> Eugene
>
>
> OK, assuming true…
>
> Then the DDL statement is blocked because Hive sees the table in use.
>
> If you can confirm this to be the case in Hive, and you then find that with
> Spark you can drop the table while Spark is still running, then you would
> have a bug: Spark in the Hive context either doesn’t set any locks or sets
> them improperly.
>
> I would have to ask: which version of Hive did you build Spark against?
> That could be another factor.
>
> HTH
>
> -Mike
>
>
>
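
For anyone who wants to try the split suggested above, here is a minimal sketch of it in Python. It only builds the statement strings and does not talk to Hive or Spark. The helper name split_ctas is hypothetical, and using CREATE TABLE ... LIKE for the first statement is an assumption on my part: the thread only says "create table statement", and LIKE copies the source table's definition, which is not necessarily the table a CTAS would produce. The sketch handles only the simple "select * from one table" form quoted in the thread.

```python
import re
from typing import List

# Matches only "CREATE TABLE <target> AS SELECT * FROM <source>".
# Anything more complex (column lists, WHERE clauses, joins) is rejected,
# because CREATE TABLE ... LIKE could not reproduce its schema.
_SIMPLE_CTAS = re.compile(
    r"^\s*create\s+table\s+(?P<target>[\w.]+)\s+as\s+"
    r"select\s+\*\s+from\s+(?P<source>[\w.]+)\s*;?\s*$",
    re.IGNORECASE,
)


def split_ctas(statement: str) -> List[str]:
    """Split a simple CTAS into a CREATE TABLE and an INSERT ... SELECT.

    Hypothetical helper: the use of CREATE TABLE ... LIKE is an assumption,
    not something stated in the thread.
    """
    match = _SIMPLE_CTAS.match(statement)
    if match is None:
        raise ValueError("only 'create table T as select * from S' is supported")
    target = match.group("target")
    source = match.group("source")
    return [
        f"CREATE TABLE {target} LIKE {source}",
        f"INSERT INTO TABLE {target} SELECT * FROM {source}",
    ]


if __name__ == "__main__":
    ctas = "create table test.dummy as select * from oraclehadoop.dummy;"
    for stmt in split_ctas(ctas):
        print(stmt + ";")
```

Running the two statements separately in Hive, and issuing the drop from a second session while the insert is still running, is the experiment described above. Whether the drop blocks when the same statements are issued from Spark is exactly the open question in this thread.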
