hive-dev mailing list archives

From Lefty Leverenz <leftylever...@gmail.com>
Subject Fwd: [jira] [Commented] (HCATALOG-541) The meta store client throws TimeOut exception if ~1000 clients are trying to call listPartition on the server
Date Thu, 22 Jan 2015 00:20:46 GMT
Now that HCatalog is part of the Hive project, messages about HCATALOG-###
issues should go to dev@hive.apache.org.

-- Lefty

---------- Forwarded message ----------
From: Manish Malhotra (JIRA) <jira@apache.org>
Date: Wed, Jan 21, 2015 at 9:27 AM
Subject: [jira] [Commented] (HCATALOG-541) The meta store client throws
TimeOut exception if ~1000 clients are trying to call listPartition on the
server
To: hcatalog-dev@incubator.apache.org



    [ https://issues.apache.org/jira/browse/HCATALOG-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285924#comment-14285924 ]

Manish Malhotra commented on HCATALOG-541:
------------------------------------------

Hi Travis and Arup,

I'm also facing a similar problem while using the Hive Thrift server, but without
HCatalog. However, I didn't see an OOM error in the Thrift server logs.

The pattern is that when the load on the Hive Thrift server is high (mostly
when most of the Hive ETL jobs are running), it sometimes gets into a mode
where it doesn't respond in time and throws a socket timeout.

And this happens for different operations, not only for listPartitions.
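
In case it helps anyone hitting the same symptom, below is a rough sketch of
the client-side tuning I plan to try. The property names are the standard
hive-site.xml keys; the values are illustrative only, not recommendations:

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class MetastoreTimeoutSketch {
    public static void main(String[] args) throws Exception {
        // Sketch only: a longer socket timeout plus a few connect retries,
        // so slow responses under load become retries rather than immediate
        // SocketTimeoutExceptions. Values below are illustrative.
        HiveConf conf = new HiveConf();
        conf.set("hive.metastore.client.socket.timeout", "600"); // seconds
        conf.set("hive.metastore.connect.retries", "5");
        HiveMetaStoreClient client = new HiveMetaStoreClient(conf);
        try {
            client.getDatabase("default"); // the call that timed out in my trace
        } finally {
            client.close();
        }
    }
}
{code}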

Please post here if there is any progress on this ticket; it might help my
situation as well.
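
In the meantime, a workaround I'm evaluating for the listPartition case
specifically is to fetch partitions in batches instead of in one huge call,
roughly as below. This is only a sketch: the table name and batch size are
placeholders, and getPartitionsByNames may not be available on older clients:

{code}
import java.util.List;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Partition;

public class BatchedListPartitionsSketch {
    public static void main(String[] args) throws Exception {
        HiveMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
        // Partition names are cheap to list; the full Partition objects are
        // what make a single listPartitions() call (and the server-side
        // query behind it) heavy for tables with ~2000 partitions.
        List<String> names = client.listPartitionNames("default", "t", (short) -1);
        int batch = 100; // illustrative batch size
        for (int i = 0; i < names.size(); i += batch) {
            List<Partition> parts = client.getPartitionsByNames(
                    "default", "t", names.subList(i, Math.min(i + batch, names.size())));
            System.out.println("fetched " + parts.size() + " partitions");
        }
        client.close();
    }
}
{code}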

Regards,
Manish

Stack Trace:

    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_database(ThriftHiveMetastore.java:412)
    at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_database(ThriftHiveMetastore.java:399)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabase(HiveMetaStoreClient.java:736)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
    at $Proxy7.getDatabase(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1110)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:706)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:150)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    ... 34 more
2015-01-20 22:44:12,978 ERROR exec.Task (SessionState.java:printError(401)) - FAILED: Error in metadata: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.hadoop.hive.ql.metadata.Hive.getDatabase(Hive.java:1114)
    at org.apache.hadoop.hive.ql.metadata.Hive.databaseExists(Hive.java:1099)
    at org.apache.hadoop.hive.ql.exec.DDLTask.showTables(DDLTask.java:2206)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:334)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
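
For completeness, this is the kind of minimal load generator I've been using
to reproduce the timeout against a test metastore. It is a hypothetical
sketch, not the hcatmix harness from this ticket; the thread count, database,
and table names are placeholders:

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;

public class MetastoreLoadSketch {
    public static void main(String[] args) throws Exception {
        final int clients = 1000; // roughly the concurrency from this ticket
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        for (int i = 0; i < clients; i++) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        // One client per thread, mimicking many independent
                        // sessions hitting the same metastore at once.
                        HiveMetaStoreClient c = new HiveMetaStoreClient(new HiveConf());
                        c.getDatabase("default");                     // the call in my trace
                        c.listPartitions("default", "t", (short) -1); // the call in the ticket
                        c.close();
                    } catch (Exception e) {
                        e.printStackTrace(); // read timeouts surface here
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.MINUTES);
    }
}
{code}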


> The meta store client throws TimeOut exception if ~1000 clients are trying to call listPartition on the server
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HCATALOG-541
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-541
>             Project: HCatalog
>          Issue Type: Improvement
>         Environment: Hadoop 0.23.4
> Hcatalog 0.4
> Oracle
>            Reporter: Arup Malakar
>
> Error on the client:
> {code}
> 2012-10-24 21:44:03,942 INFO [pool-12-thread-2] org.apache.hcatalog.hcatmix.load.tasks.Task: Error listing partitions
> org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:345)
>         at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:422)
>         at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:404)
>         at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
>         at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>         at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
>         at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>         at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>         at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>         at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions(ThriftHiveMetastore.java:1208)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions(ThriftHiveMetastore.java:1193)
>         at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitions(HiveMetaStoreClient.java:631)
>         at org.apache.hcatalog.hcatmix.load.tasks.HCatListPartitionTask.doTask(HCatListPartitionTask.java:45)
>         at org.apache.hcatalog.hcatmix.load.TaskExecutor.call(TaskExecutor.java:79)
>         at org.apache.hcatalog.hcatmix.load.TaskExecutor.call(TaskExecutor.java:39)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.read(SocketInputStream.java:129)
>         at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> {code}
> Error on the server:
> {code}
> Exception in thread "pool-1-thread-3206" java.lang.OutOfMemoryError: unable to create new native thread
>         at java.lang.Thread.start0(Native Method)
>         at java.lang.Thread.start(Thread.java:597)
>         at org.datanucleus.store.query.Query.performExecuteTask(Query.java:1891)
>         at org.datanucleus.store.rdbms.query.JDOQLQuery.performExecute(JDOQLQuery.java:613)
>         at org.datanucleus.store.query.Query.executeQuery(Query.java:1692)
>         at org.datanucleus.store.query.Query.executeWithArray(Query.java:1527)
>         at org.datanucleus.jdo.JDOQuery.execute(JDOQuery.java:266)
>         at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitions(ObjectStore.java:1521)
>         at org.apache.hadoop.hive.metastore.ObjectStore.getPartitions(ObjectStore.java:1268)
>         at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
>         at $Proxy7.getPartitions(Unknown Source)
>         at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions(HiveMetaStore.java:1468)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:5318)
>         at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions.getResult(ThriftHiveMetastore.java:5306)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:555)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge20S.java:552)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
>         at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge20S$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge20S.java:552)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody0(TThreadPoolServer.java:176)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run_aroundBody1$advice(TThreadPoolServer.java:101)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:1)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> {code}
> The graph for concurrent usage of list partition can be seen here:
> https://cwiki.apache.org/confluence/download/attachments/30740331/hcatmix_list_partition_loadtest_25min.html
> The table has 2000 partitions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
