hive-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Spark support for update/delete operations on Hive ORC transactional tables
Date Wed, 22 Jun 2016 16:50:38 GMT
Hi Ajay,

I am afraid that, for now, transaction heartbeats do not work through Spark,
so I have no other solution.

This is an interesting point: with Hive running on the Spark engine there is
no issue, because Hive itself handles the transactions.

I gather that, in its simplest form, Hive has to deal with its own metadata
for the transaction logic, and Spark somehow cannot do that.

In short that is it. You need to do that through Hive.
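
As a rough illustration of "doing it through Hive" from application code, here is a minimal sketch that sends the DELETE to HiveServer2 over a DB-API style connection instead of running it in Spark. The function names (build_delete, delete_through_hive) and the table and predicate are made up for illustration, and it assumes the server is already configured for ACID tables, so Hive handles the locks and heartbeats.

```python
import re

# Accepts "table" or "db.table"; anything else is rejected so that the
# table name cannot smuggle extra SQL into the statement.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def build_delete(table, predicate):
    """Build a HiveQL DELETE statement for a transactional table."""
    if not _IDENTIFIER.match(table):
        raise ValueError("unexpected table name: %r" % (table,))
    if not predicate.strip():
        # An empty predicate would be a syntax error at best; refuse early.
        raise ValueError("a WHERE predicate is required")
    return "DELETE FROM {} WHERE {}".format(table, predicate)


def delete_through_hive(conn, table, predicate, params=None):
    """Run the DELETE on a DB-API connection to HiveServer2.

    `conn` is assumed to be any DB-API 2.0 connection whose cursor
    supports execute(sql, params). Hive, not the caller, manages the
    transaction for the statement.
    """
    sql = build_delete(table, predicate)
    cursor = conn.cursor()
    try:
        cursor.execute(sql, params)
    finally:
        cursor.close()
    return sql
```

With the PyHive package, for example, the connection could come from pyhive.hive.connect(host="...", port=10000), after which delete_through_hive(conn, "mydb.mytable", "id = 42") would submit the statement. The host, port, table and predicate here are placeholders, not values from this thread.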

Cheers,



Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 22 June 2016 at 16:08, Ajay Chander <itschevva@gmail.com> wrote:

> Hi Mich,
>
> Right now I have a similar use case where I have to delete some rows from a
> Hive table. My Hive table is stored as ORC, is bucketed, and has the
> transactional property set. I can delete from the Hive shell but not from my
> spark-shell or Spark app. Were you able to find any workaround? Thank
> you.
>
> Regards,
> Ajay
>
>
> On Thursday, June 2, 2016, Mich Talebzadeh <mich.talebzadeh@gmail.com>
> wrote:
>
>> Thanks for that.
>>
>> I will have a look.
>>
>> Dr Mich Talebzadeh
>>
>>
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>>
>>
>> http://talebzadehmich.wordpress.com
>>
>>
>>
>> On 2 June 2016 at 10:46, Elliot West <teabot@gmail.com> wrote:
>>
>>> Related to this, there is an API in Hive that simplifies the
>>> integration of other frameworks with Hive's ACID feature:
>>>
>>> See:
>>> https://cwiki.apache.org/confluence/display/Hive/HCatalog+Streaming+Mutation+API
>>>
>>> It contains code for maintaining heartbeats, handling locks and
>>> transactions, and submitting mutations in a distributed environment.
>>>
>>> We have used it to write to transactional tables from Cascading-based
>>> processes.
>>>
>>> Elliot.
>>>
>>>
>>> On 2 June 2016 at 09:54, Mich Talebzadeh <mich.talebzadeh@gmail.com>
>>> wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> Spark does not support transactions because, as I understand it, there is
>>>> a piece on the execution side that needs to send heartbeats to the Hive
>>>> metastore saying "a transaction is still alive". That has not been
>>>> implemented in Spark yet, to my knowledge.
>>>>
>>>> Any idea of the timeline for supporting transactions on Hive ORC tables
>>>> in Spark? This would be really useful.
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>>
>>>>
>>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>
>>>>
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>>
>>>>
>>>
>>>
>>
