hive-user mailing list archives

From Christian Kurz <crk...@gmx.de>
Subject Re: Understanding distributed Hive server and Hive Metastore setup
Date Fri, 26 Aug 2011 09:13:10 GMT
Ashutosh,

thank you for the explanation. I have changed the setup from embedded
Derby to stand-alone Derby. Neither the hiveserver nor the metastore
service opens any Derby files any longer, and things are working fine.
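
For reference, a stand-alone (network server) Derby is typically configured in hive-site.xml along the lines below. This is only a sketch, not the exact configuration used here: the host name, the port (1527 is Derby's default network port), and the database name are assumptions.

```xml
<!-- hive-site.xml: sketch only; localhost, 1527 and metastore_db are assumed values -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <!-- network client driver, replacing the embedded org.apache.derby.jdbc.EmbeddedDriver -->
  <value>org.apache.derby.jdbc.ClientDriver</value>
</property>
```

With settings like these, the Derby network server has to be running before the Hive services start, and the Derby client jar (derbyclient.jar) needs to be on Hive's classpath.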

Thanks again for your help,
Christian

On 26.08.2011 01:00, Ashutosh Chauhan wrote:
> Christian,
>
> Looks like it's not possible to do the setup that you are looking for.
> The problem arises because HiveServer extends HMSHandler directly instead
> of accessing the Metastore through HiveMetaStoreClient, so the
> metastore Thrift interface is bypassed entirely. HiveServer will contact
> MySQL directly and won't go through the external metastore service as you
> have in your diagram. If you consider this a blocker, please open
> a JIRA for more discussion.
>
> Hope it helps,
> Ashutosh
>
> On Wed, Aug 24, 2011 at 23:21, Christian Kurz <crkurz@gmx.de> wrote:
>
>     Thanks, Edward and Ashutosh
>
>     Ashutosh,
>     yes, I do not understand why the service "hiveserver" still uses a
>     Derby instance even though it should be talking to the service
>     "metastore". Btw, if I run the hiveserver without having started
>     the metastore service, the hiveserver complains when I try to let
>     it execute a HiveQL command through JDBC:
>
>     ...
>     org.apache.hadoop.hive.ql.metadata.HiveException:
>     MetaException(message:Could not connect to meta store using any of
>     the URIs provided)
>             at
>     org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:919)
>     ...
>     (full stacktrace at the end of this post)
>
>     which is exactly what I expect and which makes me somewhat
>     confident that I have configured things correctly.
>
>     The entire issue came up because the hiveserver service did not
>     work when started from the same directory from which the
>     metastore service had been started. It turned out that this was
>     because both services were trying to set up a Derby instance in the
>     current directory and therefore ran into a file-locking conflict. I
>     have worked around this by starting the two services from
>     different directories, but I am worried that I am missing an
>     important point in my setup.
>
>     When I run "pfiles <pid of hiveserver>" it lists these files for
>     the hiveserver service (which should not need a Derby instance, as
>     far as I understood):
>           ...tons of jars...
>           /home/hadoop/hive_admin/derby.log
>           /home/hadoop/hive_admin/metastore_db/log/log1.dat
>           /home/hadoop/hive_admin/metastore_db/dbex.lck
>           /home/hadoop/hive_admin/metastore_db/seg0/c191.dat
>           /home/hadoop/hive_admin/metastore_db/seg0/c1a1.dat
>           ...
>           /home/hadoop/hive_admin/metastore_db/seg0/c431.dat
>           /home/hadoop/hive_admin/metastore_db/seg0/c451.dat
>
>     Any pointers appreciated. If anybody thinks this is a bug, I can
>     file one.
>
>     Thanks,
>     Christian
>
>
>     full stacktrace:
>
>     Hive history
>     file=/tmp/hadoop/hive_job_log_hadoop_201108242305_155100916.txt
>     FAILED: Error in semantic analysis: Table not found weblog
>     org.apache.hadoop.hive.ql.metadata.HiveException:
>     MetaException(message:Could not connect to meta store using any of
>     the URIs provided)
>             at
>     org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:919)
>             at
>     org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:904)
>             at
>     org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:7074)
>             at
>     org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6573)
>             at
>     org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
>             at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
>             at org.apache.hadoop.hive.ql.Driver.run(Driver.java:736)
>             at
>     org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:116)
>             at
>     org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:699)
>             at
>     org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:677)
>             at
>     org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>             at
>     java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>             at
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>             at java.lang.Thread.run(Thread.java:619)
>     Caused by: MetaException(message:Could not connect to meta store
>     using any of the URIs provided)
>             at
>     org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:183)
>             at
>     org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:151)
>             at
>     org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:1855)
>             at
>     org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:1865)
>             at
>     org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:917)
>             ... 13 more
>     FAILED: Error in metadata: MetaException(message:Could not connect
>     to meta store using any of the URIs provided)
>     FAILED: Execution Error, return code 1 from
>     org.apache.hadoop.hive.ql.exec.DDLTask
>
>
>
>
>     On 25.08.2011 01:29, Ashutosh Chauhan wrote:
>>     Edward,
>>
>>     Apart from recommended best practices, what Christian is asking
>>     is why HiveServer is still trying to interact with a local db
>>     instance even after setting the config variables. AFAIK it should
>>     not. Christian, you found that out by looking at the files opened by
>>     the HiveServer JVM. Can you provide more info there, like how you
>>     found that out and which files these are?
>>
>>     Ashutosh
>>
>>     On Wed, Aug 24, 2011 at 14:20, Edward Capriolo
>>     <edlinuxguru@gmail.com> wrote:
>>
>>
>>
>>         On Wed, Aug 24, 2011 at 3:02 PM, Christian Kurz
>>         <crkurz@gmx.de> wrote:
>>
>>
>>             Thanks for the quick reply, Edward
>>
>>             I am not sure I got you: My HiveService has been started
>>             with hive.metastore.local=false. So shouldn't it use
>>             thrift instead of its own local Derby instance?
>>
>>             Thanks,
>>             Christian
>>
>>             On 24.08.2011 at 19:33, Edward Capriolo
>>             <edlinuxguru@gmail.com> wrote:
>>
>>>
>>>
>>>             On Wed, Aug 24, 2011 at 10:53 AM, Christian Kurz
>>>             <crkurz@gmx.de> wrote:
>>>
>>>                 Greetings,
>>>
>>>                 could somebody confirm/correct my understanding of a
>>>                 fully distributed Hive setup, please?
>>>
>>>                 My setup is as follows
>>>
>>>                     * Java application using Hive JDBC driver
>>>                       connects to
>>>                     * hive --service hiveserver, which connects to
>>>                     * hive --service metastore, which uses an
>>>                       embedded Derby database for metadata storage
>>>
>>>                 Please find more details in the image attached.
>>>
>>>                 The thing I find confusing is that JVM2 (Hive
>>>                 Server) starts up a Derby database instance. I can
>>>                 see that from the files the JVM has opened.
>>>
>>>                 Does anybody know why the Hive Server needs a Derby
>>>                 instance even though hive-site.xml says
>>>                 hive.metastore.local=false?
>>>
>>>                 Any hints are much appreciated.
>>>
>>>                 Thanks,
>>>                 Christian
>>>
>>>                 btw,
>>>                 I have not been able to access the picture on the
>>>                 wiki
>>>                 <https://cwiki.apache.org/Hive/adminmanual-metastoreadmin.html#AdminManualMetastoreAdmin-MetastoreDeploymentOptionsinPictures>.
>>>                 ("Not permitted"; even though I have registered on
>>>                 the wiki)
>>>
>>>
>>>
>>>             hive.metastore.local is really misnamed.
>>>
>>>             local=true means communicate using DataNucleus/JPOX,
>>>             talking directly to the metastore database.
>>>
>>>             local=false means use Thrift, which is essentially a
>>>             level of indirection.
>>
>>         Talking about HiveService can confuse things because
>>         HiveService is a different thrift interface.
>>
>>         You could be set up like this:
>>         HiveServiceClient->HiveService->metastore.local=true->derby
>>         or
>>         HiveServiceClient->HiveService->metastore.local=false->thrift->hive_metastore
>>
>>         Most people are set up like this:
>>
>>         HiveServiceClient->HiveService->metastore.local=true->mysql
>>         cli->metastore.local=true->mysql
>>
>>
>>
>
