hive-user mailing list archives

From Bryan Jeffrey <bryan.jeff...@gmail.com>
Subject Re: Hive 0.13.0 Memory Leak
Date Mon, 19 May 2014 14:56:21 GMT
Shengjun,

I do not believe it is the same issue.  I turned off conversion to ORC on
Friday and came in Monday to a box that was still out of memory.  At this
point we're simply making calls (load data infile, alter table add
partition) through BeeLine on a periodic but infrequent basis (once every
10 seconds).  The issue described in that ticket seems unlikely to be the
cause: ipc.client.connection.maxidletime is at its default (10s), and if we
were hitting that bug the OOM error would occur much more quickly.  Instead
we're seeing memory use increase gradually over periods of several days.
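To put the growth rate in perspective, here is a back-of-the-envelope estimate using the figures quoted in this thread (roughly 4 GB of growth over about 5 days, against a 2048 MB -Xmx).  Note the 4 GB figure exceeds the heap cap, so the observed growth presumably spans heap plus permgen/native memory; the arithmetic below is illustrative only, and the 512 MB "live data" figure is an assumed placeholder.

```python
# Rough leak-rate estimate for the numbers in this thread (illustrative
# only; the 512 MB live-set figure below is a hypothetical assumption).

def leak_rate_mb_per_hour(leaked_mb: float, days: float) -> float:
    """Average growth rate in MB per hour."""
    return leaked_mb / (days * 24)

def hours_until_oom(headroom_mb: float, rate_mb_per_hour: float) -> float:
    """Hours until the remaining headroom is consumed at a steady rate."""
    return headroom_mb / rate_mb_per_hour

rate = leak_rate_mb_per_hour(4096, 5)
print(round(rate, 1))                          # -> 34.1 (MB/hour)

# With a 2048 MB -Xmx and, say, 512 MB of genuinely live data, the heap
# headroom is ~1536 MB:
print(round(hours_until_oom(1536, rate), 1))   # -> 45.0 (about 2 days)
```

That is consistent with a server that survives a weekend but is dead by Monday, matching what we observed.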

Here is the stack dump from the last OOM we saw:
2014-05-17 11:34:09,381 ERROR thrift.ProcessFunction (?:?(?)) - Internal error processing GetOperationStatus
2014-05-17 11:39:09,010 WARN  hdfs.DFSClient (DFSOutputStream.java:run(613)) - DataStreamer Exception
java.lang.OutOfMemoryError: Java heap space
2014-05-17 11:39:09,009 WARN  ipc.Client (Client.java:run(897)) - Unexpected error reading responses on connection Thread[IPC Client (443054816) connection to server/192.168.1.100:8020 from root,5,main]
java.lang.OutOfMemoryError: Java heap space
  at com.google.protobuf.CodedInputStream.<init>(CodedInputStream.java:573)
  at com.google.protobuf.CodedInputStream.newInstance(CodedInputStream.java:55)
  at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:199)
  at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:241)
  at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:253)
  at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:259)
  at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
  at org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcResponseHeaderProto.parseDelimitedFrom(RpcHeaderProtos.java:2364)
  at org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:996)
  at org.apache.hadoop.ipc.Client$Connection.run(Client.java:891)
2014-05-17 11:43:38,541 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - java.lang.OutOfMemoryError: Java heap space

2014-05-17 11:44:13,986 WARN  thrift.ThriftCLIService (ThriftCLIService.java:ExecuteStatement(350)) - Error executing statement:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
  at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:84)
  at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
  at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
  at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:500)
  at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
  at com.sun.proxy.$Proxy15.executeStatementAsync(Unknown Source)
  at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
  at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
  at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
  at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
  at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
  at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError: Java heap space
2014-05-17 11:53:46,306 WARN  thrift.ThriftCLIService (ThriftCLIService.java:ExecuteStatement(350)) - Error executing statement:
java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
  at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:84)
  at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
  at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:396)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
  at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:500)
  at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
  at com.sun.proxy.$Proxy15.executeStatementAsync(Unknown Source)
  at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
  at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
  at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
  at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
  at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
  at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.OutOfMemoryError: Java heap space
2014-05-17 11:55:14,678 ERROR thrift.ProcessFunction (ProcessFunction.java:process(41)) - Internal error processing OpenSession
java.lang.OutOfMemoryError: Java heap space
2014-05-17 11:56:08,804 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - java.lang.OutOfMemoryError: Java heap space
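The log above shows OOMs surfacing in several different components (ipc.Client, thrift.ThriftCLIService, metastore.RetryingHMSHandler).  A quick way to see where pressure shows up first is to tally OOM log entries by the component that reported them.  This is a hypothetical triage helper, not anything shipped with Hive; the log-line pattern below is inferred from the entries pasted above.

```python
# Hypothetical triage helper: tally which components in a HiveServer2 log
# report java.lang.OutOfMemoryError.  The line format is inferred from the
# log excerpt above: "DATE TIME LEVEL component.Class (file:method(line)) - msg".
import re

OOM = "java.lang.OutOfMemoryError"
LINE = re.compile(r"^\S+ \S+ (ERROR|WARN)\s+(\S+)\s+\(")

def oom_components(lines):
    counts = {}
    current = None
    for line in lines:
        m = LINE.match(line)
        if m:
            current = m.group(2)   # remember who logged the current entry
        if OOM in line and current:
            counts[current] = counts.get(current, 0) + 1
            current = None         # count each log entry at most once
    return counts

sample = [
    "2014-05-17 11:39:09,010 WARN  hdfs.DFSClient (DFSOutputStream.java:run(613)) - DataStreamer Exception",
    "java.lang.OutOfMemoryError: Java heap space",
    "2014-05-17 11:43:38,541 ERROR metastore.RetryingHMSHandler (RetryingHMSHandler.java:invoke(143)) - java.lang.OutOfMemoryError: Java heap space",
]
print(oom_components(sample))
# -> {'hdfs.DFSClient': 1, 'metastore.RetryingHMSHandler': 1}
```

Since every component OOMs here, the counts mainly confirm the whole JVM is starved rather than pointing at one subsystem; a heap histogram (e.g. jmap -histo) would be the next step to see what is actually accumulating.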

On Sun, May 18, 2014 at 9:53 PM, Shengjun Xin <sxin@gopivotal.com> wrote:

> We found a hive jdbc connection leak
> https://issues.apache.org/jira/browse/HIVE-6866, is it similar to your
> issue?
>
>
> On Thu, May 15, 2014 at 9:29 PM, Bryan Jeffrey <bryan.jeffrey@gmail.com> wrote:
>
>> All,
>>
>> We are running Hadoop 2.2.0 and Hive 0.13.0.  One typical application is
>> to load data (as text), and then convert that data to ORC to decrease query
>> time.  When running these processes we are seeing significant memory leaks
>> (leaking 4 GB in about 5 days).
>>
>> We're running HiveServer2 with the following options:
>>
>> root     28136     1 51 May14 ?        09:51:09 /usr/java/latest/bin/java
>> -Xmx2048m -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40
>> -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit -XX:MaxPermSize=1024m
>> -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15
>> -XX:-UseGCOverheadLimit -XX:MaxPermSize=1024m
>> -Dhadoop.log.dir=/opt/hadoop/latest-hadoop/logs
>> -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop/latest-hadoop
>> -Dhadoop.id.str= -Dhadoop.root.logger=INFO,console
>> -Djava.library.path=/opt/hadoop/latest-hadoop/lib/native
>> -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true
>> -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.util.RunJar
>> /opt/hadoop/latest-hive/lib/hive-service-0.13.0.jar
>> org.apache.hive.service.server.HiveServer2
>>
>> Typical ORC conversion query looks like the following:
>>
>> HIVE -u jdbc:hive2://hive_server:10002/db -n root --hiveconf
>> hive.exec.dynamic.partition.mode=nonstrict --hiveconf
>> hive.enforce.sorting=true --hiveconf $SET_QUEUE -e "insert into table
>> orc_table partition (range) select * from loading_text_table; "
>>
>> I saw a couple of tickets for memory leaks, but they seemed to deal with
>> failed queries.  In our case memory usage increases linearly, and all jobs
>> succeed until the memory limit is exceeded.
>>
>> Is there an open bug for memory leaks associated with successful jobs in
>> HS2?  Is there a fix for this issue?
>>
>> Regards,
>>
>> Bryan Jeffrey
>>
>
>
>
> --
> Regards
> Shengjun
>
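One incidental observation about the ps output quoted above: several JVM options (-XX:NewRatio=12, -Xms10m, -XX:MaxHeapFreeRatio=40, -XX:MinHeapFreeRatio=15, -XX:MaxPermSize=1024m) appear twice on the command line, which often means an options variable is being appended in two places in the launch scripts.  A small sketch to spot that (a hypothetical helper; the shortened command line below is an illustration, not the full ps output):

```python
# Hypothetical helper: flag any JVM option that appears more than once on a
# command line, as several do in the HiveServer2 ps output quoted above.
from collections import Counter

def duplicated_jvm_flags(cmdline: str):
    # JVM options start with -X (covers -Xmx/-Xms and -XX:... forms).
    flags = [tok for tok in cmdline.split() if tok.startswith("-X")]
    return sorted(f for f, n in Counter(flags).items() if n > 1)

# Abbreviated version of the quoted command line:
cmd = ("java -Xmx2048m -XX:NewRatio=12 -Xms10m -XX:MaxPermSize=1024m "
       "-XX:NewRatio=12 -Xms10m -XX:MaxPermSize=1024m "
       "org.apache.hadoop.util.RunJar")
print(duplicated_jvm_flags(cmd))
# -> ['-XX:MaxPermSize=1024m', '-XX:NewRatio=12', '-Xms10m']
```

The duplicates are probably harmless (the JVM takes the last value), but they are worth cleaning up so that any heap-size change made while chasing this leak is actually the value that takes effect.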
