hive-dev mailing list archives

From "Gilad Wolff (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6893) out of sequence error in HiveMetastore server
Date Tue, 24 Jun 2014 00:14:25 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041510#comment-14041510
] 

Gilad Wolff commented on HIVE-6893:
-----------------------------------

I encountered the same issue: we get a socket read timeout and then an out-of-sequence error.
In one case we got an OOM in our client, and I suspect it has the same underlying cause. Here
is the metastore's sequence of events. Our client tried to drop a table starting at 14:02:25.
Note that we use a 20-second timeout for our client:
{code}
2014-06-23 14:02:25,181 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: 11: source:/10.20.93.47
drop_table : db=cloudera_manager_metastore_canary_test_db tbl=CM_TEST_TABLE
2014-06-23 14:02:25,181 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: ugi=hue
ip=/10.20.93.47	cmd=source:/10.20.93.47 drop_table : db=cloudera_manager_metastore_canary_test_db
tbl=CM_TEST_TABLE	
2014-06-23 14:02:25,182 INFO org.apache.hadoop.hive.metastore.HiveMetaStore: 11: source:/10.20.93.47
get_table : db=cloudera_manager_metastore_canary_test_db tbl=CM_TEST_TABLE
2014-06-23 14:02:25,182 INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit: ugi=hue
ip=/10.20.93.47	cmd=source:/10.20.93.47 get_table : db=cloudera_manager_metastore_canary_test_db
tbl=CM_TEST_TABLE	
2014-06-23 14:02:46,596 INFO hive.metastore.hivemetastoressimpl: deleting  hdfs://jenkins-debian60-17.ent.cloudera.com:8020/user/hue/.cloudera_manager_hive_metastore_canary/HIVE_1_HIVEMETASTORE_627a77825bb851bf2db30317a698dded/2014_06_23_14_02_11/cm_test_table
2014-06-23 14:02:46,694 INFO hive.metastore.hivemetastoressimpl: Moved to trash: hdfs://jenkins-debian60-17.ent.cloudera.com:8020/user/hue/.cloudera_manager_hive_metastore_canary/HIVE_1_HIVEMETASTORE_627a77825bb851bf2db30317a698dded/2014_06_23_14_02_11/cm_test_table
{code}

On our client we get a socket timeout for the drop table call at 14:02:45:
{code}
2:02:45.209 PM 	WARN 	com.cloudera.cmon.firehose.polling.hive.HiveMetastoreCanary Metastore
HIVE-1-HIVEMETASTORE-627a77825bb851bf2db30317a698dded: Failed to drop table 
com.cloudera.cmf.cdhclient.common.hive.MetaException: java.net.SocketTimeoutException: Read
timed out
{code}
We then try to drop the database immediately afterwards, and the next message in our logs is:
{code}
2:02:46.697 PM 	WARN 	com.cloudera.cmf.cdh4client.hive.MetastoreClientImpl 	Could not drop
hive database: cloudera_manager_metastore_canary_test_db
com.cloudera.cdh4client.hive.shaded.org.apache.thrift.TApplicationException: get_database
failed: out of sequence response
	at com.cloudera.cdh4client.hive.shaded.org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
	at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.dropDatabase(HiveMetaStoreClient.java:479)
	at com.cloudera.cmf.cdh4client.hive.MetastoreClientImpl.dropDatabase(MetastoreClientImpl.java:160)
{code}

Note that the moved-to-trash message in the Hive metastore is from 14:02:46,694 and the out-of-sequence
exception is from 2:02:46.697 PM. I know that order in time does not imply causation, but is it
possible that we are receiving the late drop_table response instead of the get_database
response?
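
To make the hypothesis concrete, here is a minimal sketch in plain Java of how a late reply could produce this error when the same connection is reused after a read timeout. It does not use Thrift or Hive; the classes `Reply`, `Connection`, and `Client` are hypothetical stand-ins. The only behavior borrowed from Thrift is my understanding that the generated client numbers each request and rejects a reply whose sequence id does not match the most recent request. Whether this is what actually happens in the metastore client is an assumption, not something the logs above prove.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StaleResponseDemo {
    /** A reply as the client sees it: the method it answers and that request's sequence id. */
    static final class Reply {
        final String method;
        final int seqid;
        Reply(String method, int seqid) { this.method = method; this.seqid = seqid; }
    }

    /** Hypothetical stand-in for one socket: replies are read in the order the server wrote them. */
    static final class Connection {
        final Deque<Reply> inbound = new ArrayDeque<>();
    }

    /** Toy client that numbers its requests and checks the number on every reply. */
    static final class Client {
        private final Connection conn;
        private int seqid = 0;
        Client(Connection conn) { this.conn = conn; }

        /** "Sends" a request and returns the sequence id assigned to it. */
        int send(String method) { return ++seqid; }

        /** Reads the next reply on the connection and validates its sequence id. */
        String receive(String method) {
            Reply r = conn.inbound.poll();
            if (r == null) {
                // Nothing arrived in time: models the socket read timeout.
                throw new IllegalStateException(method + " failed: read timed out");
            }
            if (r.seqid != seqid) {
                // The reply answers an older request than the one we are waiting on.
                throw new IllegalStateException(method + " failed: out of sequence response");
            }
            return r.method;
        }
    }

    /** Replays the suspected sequence and returns the outcome of the second call. */
    static String run() {
        Connection conn = new Connection();
        Client client = new Client(conn);

        int dropSeq = client.send("drop_table");
        try {
            client.receive("drop_table");          // server is still busy, so this times out
        } catch (IllegalStateException timeout) {
            // The client gives up on drop_table but keeps the same connection open.
        }

        client.send("get_database");                // next call reuses the connection
        conn.inbound.add(new Reply("drop_table", dropSeq)); // late drop_table reply arrives
        try {
            return client.receive("get_database");
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "get_database failed: out of sequence response"
    }
}
```

If this is the mechanism, then discarding the connection after a read timeout, rather than reusing it, would keep the stale reply from being matched against the next request.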

> out of sequence error in HiveMetastore server
> ---------------------------------------------
>
>                 Key: HIVE-6893
>                 URL: https://issues.apache.org/jira/browse/HIVE-6893
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.12.0
>            Reporter: Romain Rigaux
>            Assignee: Naveen Gangam
>             Fix For: 0.14.0
>
>         Attachments: HIVE-6893.1.patch
>
>
> Calls listing databases or tables fail. It seems to be a concurrency problem.
> {code}
> 014-03-06 05:34:00,785 ERROR hive.log: org.apache.thrift.TApplicationException: get_databases
failed: out of sequence response
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:76)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:472)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:459)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:648)
>         at org.apache.hive.service.cli.operation.GetSchemasOperation.run(GetSchemasOperation.java:66)
>         at org.apache.hive.service.cli.session.HiveSessionImpl.getSchemas(HiveSessionImpl.java:278)
>         at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:62)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:582)
>         at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:57)
>         at com.sun.proxy.$Proxy9.getSchemas(Unknown Source)
>         at org.apache.hive.service.cli.CLIService.getSchemas(CLIService.java:192)
>         at org.apache.hive.service.cli.thrift.ThriftCLIService.GetSchemas(ThriftCLIService.java:263)
>         at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1433)
>         at org.apache.hive.service.cli.thrift.TCLIService$Processor$GetSchemas.getResult(TCLIService.java:1418)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>         at org.apache.hive.service.cli.thrift.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:38)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:244)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
