hbase-user mailing list archives

From Geovanie Marquez <geovanie.marq...@gmail.com>
Subject Re: RPC Client OutOfMemoryError Java Heap Space
Date Fri, 09 May 2014 11:50:36 GMT
Is this an expectation problem or a legitimate concern? I have been
studying the memory configurations in Cloudera Manager and I don't see
where I can improve my situation.




On Thu, May 8, 2014 at 5:35 PM, Geovanie Marquez <geovanie.marquez@gmail.com> wrote:

> Sorry, I didn't include the version.
>
> CDH5 version - CDH-5.0.0-1.cdh5.0.0.p0.47
>
>
> On Thu, May 8, 2014 at 5:32 PM, Geovanie Marquez <geovanie.marquez@gmail.com> wrote:
>
>> Hey group,
>>
>> There is one job that scans HBase contents and is really resource
>> intensive, using all the resources available to YARN (under the Resource
>> Manager); in my case, that is 8GB. My expectation is that a properly
>> configured cluster would kill the application or degrade its performance,
>> but never take a region server down. This is intended to be a multi-tenant
>> environment where developers may submit jobs at will, and I want a
>> configuration where the cluster services never die this way because of
>> memory pressure.
>>
>> The simple solution here is to change the way the job consumes resources
>> so that it is not so greedy when it runs. But I want to understand how I
>> can mitigate this situation in general.
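>>
>> A rough sketch of what I mean by changing how the job consumes resources,
>> assuming the job is a plain TableMapper wired up through TableMapReduceUtil
>> (the table name and mapper below are placeholders, not our real job):
>>
>> import org.apache.hadoop.conf.Configuration;
>> import org.apache.hadoop.hbase.HBaseConfiguration;
>> import org.apache.hadoop.hbase.client.Result;
>> import org.apache.hadoop.hbase.client.Scan;
>> import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
>> import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
>> import org.apache.hadoop.hbase.mapreduce.TableMapper;
>> import org.apache.hadoop.mapreduce.Job;
>>
>> public class ScanJobDriver {
>>
>>   // Placeholder mapper; the real scan logic would go in map().
>>   static class MyMapper extends TableMapper<ImmutableBytesWritable, Result> {
>>   }
>>
>>   public static void main(String[] args) throws Exception {
>>     Configuration conf = HBaseConfiguration.create();
>>     Job job = Job.getInstance(conf, "scan-job");
>>
>>     // Ask for fewer rows per scanner.next() RPC so a single response does
>>     // not have to be buffered whole in the client heap (the trace further
>>     // down shows number_of_rows: 10000 per call).
>>     Scan scan = new Scan();
>>     scan.setCaching(500);
>>     scan.setCacheBlocks(false);  // full scans shouldn't churn the RS block cache
>>
>>     TableMapReduceUtil.initTableMapperJob(
>>         "my_table",              // placeholder table name
>>         scan,
>>         MyMapper.class,
>>         ImmutableBytesWritable.class,
>>         Result.class,
>>         job);
>>
>>     System.exit(job.waitForCompletion(true) ? 0 : 1);
>>   }
>> }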
>>
>> **It FAILS with the following config:**
>> The RPC client has 30 handlers
>> write buffer of 2MiB
>> The RegionServer heap is 4GiB
>> Max Size of all memstores is 0.40 of total heap
>> HFile Block Cache Size is 0.40
>> Low watermark for memstore flush is 0.38
>> HBase Memstore size is 128MiB
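>>
>> For context, by my arithmetic those fractions leave very little of the
>> 4 GiB region server heap for anything else:
>>
>>   4 GiB * 0.40 (memstores)    = 1.6 GiB
>>   4 GiB * 0.40 (block cache)  = 1.6 GiB
>>   left for scans/RPC/overhead = 0.8 GiB
>>
>> (If I remember right, HBase refuses to start when the memstore and block
>> cache fractions together exceed 0.8 of the heap, so this is right at the
>> limit.)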
>>
>> **Job still FAILS with the following config:**
>> Everything else the same except
>> The RPC client has 10 handlers
>>
>> **Job still FAILS with the following config:**
>> Everything else the same except
>> HFile Block Cache Size is 0.10
>>
>>
>> When this runs I get the following error stack trace:
>> #
>> # How do I avoid this via configuration?
>> #
>>
>> java.lang.OutOfMemoryError: Java heap space
>> 	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
>> 	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
>> 2014-05-08 16:23:54,705 WARN [IPC Client (1242056950) connection to c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase] org.apache.hadoop.ipc.RpcClient: IPC Client (1242056950) connection to c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase: unexpected exception receiving call responses
>> #
>>
>> ### Yes, there was an RPC timeout; this is what is killing the server, because the timeout is eventually reached (1 minute later).
>>
>> #
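>>
>> If the one-minute limit is the immediate killer, a stopgap (my own sketch,
>> not something I've verified against this exact CDH build) would be to widen
>> the client-side timeouts in the job's Configuration, though that only hides
>> the real problem of oversized scan responses:
>>
>> // in the same driver as above, before submitting the job
>> conf.setInt("hbase.rpc.timeout", 120000);               // 2 minutes per RPC
>> conf.setInt("hbase.regionserver.lease.period", 120000); // scanner lease name on 0.96
>> // newer releases read hbase.client.scanner.timeout.period instead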
>>
>> java.lang.OutOfMemoryError: Java heap space
>> 	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
>> 	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
>> 2014-05-08 16:23:55,319 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
>> 	at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:384)
>> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:194)
>> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
>> 	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
>> 	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
>> 	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
>> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>> 	at java.security.AccessController.doPrivileged(Native Method)
>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> #
>>
>> ## Probably caused by the OOME above
>>
>> #
>>
>> Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 5612205039322936440 number_of_rows: 10000 close_scanner: false next_call_seq: 0
>> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3018)
>> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26929)
>> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
>> 	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
>>
>>
>
