hive-user mailing list archives

From "Dr Mich Talebzadeh" <m...@peridale.co.uk>
Subject Re: Hive Concurrency support
Date Sun, 23 Aug 2015 11:32:40 GMT

Correction to the below:

>2) You will have to coordinate concurrency via zookeeper for distributed
>transactions. Without zookeeper or equivalent product it will not work
>and you will end up with deadlocks in your metastore.

should read:

.. it will not work and you will end up with serialisation issues in your
metastore.

On 23/8/2015, "Dr Mich Talebzadeh" <mich@peridale.co.uk> wrote:

>
>Well, I have come across this in practice with real-time data movements (DML
>inserts) using replication server to deliver data from RDBMS to Hive. In
>general, if you have not met the conditions below you will end up with
>deadlocks.
>
>To make this work you will need:
>
>1) Your Hive metastore must allow concurrency. As far as I have found
>out, a Hive metastore on Oracle provides the best concurrency support. For
>that you will need to run the supplied concurrency script against your
>metastore.
>2) You will have to coordinate concurrency via zookeeper for distributed
>transactions. Without zookeeper or equivalent product it will not work
>and you will end up with deadlocks in your metastore.
>
>HTH,
>
>Mich
>
>
>
>On 23/8/2015, "Noam Hasson" <noam.hasson@kenshoo.com> wrote:
>
>>If you are looking to support concurrency check this param:
>>https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.support.concurrency
>>
>>I believe it will allow you to run several different inserts to the same
>>partitions, but I don't know what kind of corruption/collision scenarios
>>are possible.
>>
>>On Fri, Aug 21, 2015 at 9:02 PM, Suyog Parlikar <suyogparlikar@gmail.com>
>>wrote:
>>
>>> Thanks Elliot,
>>>
>>> For the immediate reply.
>>>
>>> But as per the Hive locking mechanism,
>>> while inserting data into a partition Hive acquires an exclusive lock on that
>>> partition and a shared lock on the entire table.
>>>
>>> How is it possible to insert data into a different partition of the same
>>> table while holding a shared lock on the table, which does not allow write
>>> operations?
>>>
>>> Please correct me if my understanding about the same is wrong.
>>> (I am using hql inserts only for these operations)
>>>
>>> Thanks,
>>> Suyog
>>> On Aug 21, 2015 7:28 PM, "Elliot West" <teabot@gmail.com> wrote:
>>>
>>>> I presume you mean "into different partitions of a table at the same
>>>> time"? This should be possible. It is certainly supported by the streaming
>>>> API, which is probably where you want to look if you need to insert large
>>>> volumes of data to multiple partitions concurrently. I can't see why it
>>>> would not also be possible with HQL INSERTs.
>>>>
>>>> On Friday, 21 August 2015, Suyog Parlikar <suyogparlikar@gmail.com>
>>>> wrote:
>>>>
>>>>> Can we insert data into different partitions of a table at the same time?
>>>>>
>>>>> Waiting for inputs.
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> - suyog
>>>>>
>>>>
>>
>
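
On the locking question raised in the thread (shared lock on the table, exclusive lock on the partition), here is a minimal Python sketch of why two inserts into different partitions need not block each other. It is a toy model under stated assumptions, not Hive's actual lock manager: the names LockTable and insert_locks, the table name "sales" and the partition names are all hypothetical. The only rule assumed is the usual one that a shared lock is compatible with another shared lock and nothing else.

```python
from collections import defaultdict

SHARED, EXCLUSIVE = "S", "X"


def compatible(held, requested):
    # Assumed rule: only shared/shared can coexist on the same resource.
    return held == SHARED and requested == SHARED


class LockTable:
    """Toy in-memory lock table (hypothetical, not Hive's implementation)."""

    def __init__(self):
        # resource name -> list of (owner, mode) currently granted
        self._held = defaultdict(list)

    def try_acquire(self, owner, requests):
        """Grant every requested lock, or none if any one conflicts."""
        for resource, mode in requests:
            for other, held_mode in self._held[resource]:
                if other != owner and not compatible(held_mode, mode):
                    return False
        for resource, mode in requests:
            self._held[resource].append((owner, mode))
        return True


def insert_locks(table, partition):
    # Locks described in the thread for an insert into one partition:
    # shared on the table, exclusive on the target partition.
    return [(table, SHARED), (f"{table}/{partition}", EXCLUSIVE)]


locks = LockTable()
a = locks.try_acquire("q1", insert_locks("sales", "dt=2015-08-21"))
b = locks.try_acquire("q2", insert_locks("sales", "dt=2015-08-22"))
c = locks.try_acquire("q3", insert_locks("sales", "dt=2015-08-21"))
d = locks.try_acquire("q4", [("sales", EXCLUSIVE)])
print(a, b, c, d)
```

In this model the shared table lock only blocks requests that need the whole table exclusively (q4); the two inserts conflict only when they target the same partition (q1 and q3).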
