hive-user mailing list archives

From Gautam <gautamkows...@gmail.com>
Subject Re: Hive Metastore Bottleneck
Date Wed, 30 Mar 2016 22:20:52 GMT
The metastore service is a Java process that acts as a Thrift server, so you
can run multiple such Hive metastore instances, each with
"javax.jdo.option.ConnectionURL" pointing to the same MySQL db.

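As a rough sketch of what that could look like in hive-site.xml (the host
names and the database name below are placeholders I made up, and 9083 is
only the usual default metastore port, so adjust for your setup):

```xml
<!-- On each metastore host: every instance uses the same backing MySQL db.
     "mysql-host" and "hive_metastore" are hypothetical names. -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://mysql-host:3306/hive_metastore</value>
</property>

<!-- On the clients (HiveServer2, CLI, etc.): list the metastore instances.
     "metastore1" and "metastore2" are hypothetical host names. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore1:9083,thrift://metastore2:9083</value>
</property>
```

If you go the load balancer route instead, hive.metastore.uris would
presumably carry just the single balancer address.
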
On Wed, Mar 30, 2016 at 3:11 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com>
wrote:

>
>
> Can you clarify this please
>
> "Have you tried putting multiple metastores behind a load balancer"
>
> Are you implying that the metastore and the backend DB are different entities here?
>
> As far as I know, $HIVE_HOME/bin/hive --service metastore & starts Hive
> threads to the backend database/metastore, and HiveServer2 acts as a gateway
> for remote access to the Hive metastore through beeline or other clients.
>
> There is only one metastore here namely MySQL/Oracle or others.
>
> Thanks
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 30 March 2016 at 22:53, Gautam <gautamkowshik@gmail.com> wrote:
>
>> Can you elaborate on where you see the bottleneck? A general overview
>> of your access path would be useful. For instance, whether you're accessing
>> the Hive metastore via HiveServer2, from WebHCat using the embedded CLI, or
>> something else.
>>
>> Have you tried putting multiple metastores behind a load balancer? It's
>> just a Thrift service over MySQL, so you can have multiple instances
>> pointing to the same backend db.
>>
>> On Wed, Mar 30, 2016 at 2:28 PM, Udit Mehta <umehta@groupon.com> wrote:
>>
>>> Hi all,
>>>
>>> We are currently running Hive in production and staging with the
>>> metastore connecting to a MySQL database in the backend. Production sends
>>> more traffic to the metastore than staging does, which is expected.
>>> We have had a sudden increase in traffic, which has led to metastore
>>> operations taking a lot longer than before. The same query on staging takes
>>> a lot less time because of the lower traffic on the staging cluster.
>>>
>>> We tried increasing the heap space for the metastore process and also
>>> bumped up the memory for the MySQL database. Neither change seemed to
>>> help much, and we still see delays. Is there any other config we can
>>> increase to counter this increased traffic? I am looking at the config for
>>> max threads as well, but I'm not sure if this is the right path ahead.
>>>
>>> I'm wondering if the metastore is a bottleneck here or if I'm missing
>>> something.
>>>
>>> Looking forward to your reply,
>>> Udit
>>>
>>
>>
>>
>> --
>> "If you really want something in this life, you have to work for it. Now,
>> quiet! They're about to announce the lottery numbers..."
>>
>
>


-- 
"If you really want something in this life, you have to work for it. Now,
quiet! They're about to announce the lottery numbers..."
