hbase-user mailing list archives

From Geovanie Marquez <geovanie.marq...@gmail.com>
Subject RPC Client OutOfMemoryError Java Heap Space
Date Thu, 08 May 2014 21:32:55 GMT
Hey group,

One job that scans HBase contents is very resource intensive: it uses all
of the resources available to YARN (under the Resource Manager), which in
my case is 8GB. My expectation is that a properly configured cluster would
kill the application or degrade its performance, but never take a region
server down. This is intended to be a multi-tenant environment where
developers may submit jobs at will, and I want a configuration in which
cluster services do not exit like this because of memory.

The simple solution here is to change the way the job consumes resources
so that it is less greedy when it runs. However, I want to understand how I
can mitigate this situation in general.

**It FAILS with the following config:**
The RPC client has 30 handlers
write buffer of 2MiB
The RegionServer heap is 4GiB
Max Size of all memstores is 0.40 of total heap
HFile Block Cache Size is 0.40
Low watermark for memstore flush is 0.38
HBase Memstore size is 128MiB

**Job still FAILS with the following config:**
Everything else the same except
The RPC client has 10 handlers

**Job still FAILS with the following config:**
Everything else the same except
HFile Block Cache Size is 0.10
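
For reference, the heap split implied by the settings above can be worked out with simple arithmetic. The following is only a rough sketch: it uses the 4GiB heap and the 0.40 / 0.10 fractions quoted in this message, and the class and method names (`HeapBudget`, `remainingMiB`) are hypothetical, not part of HBase. It estimates how much RegionServer heap is left for everything else (RPC buffers, scanner results, and so on) once the memstore and block cache upper limits are set aside.

```java
public class HeapBudget {

    /**
     * Heap (in MiB) left over after reserving the memstore and block cache
     * upper limits, both given as whole percentages of the total heap.
     * Integer division truncates, so the result is a slight overestimate.
     */
    static long remainingMiB(long heapMiB, int memstorePercent, int blockCachePercent) {
        long reservedMiB = heapMiB * (memstorePercent + blockCachePercent) / 100;
        return heapMiB - reservedMiB;
    }

    public static void main(String[] args) {
        long heapMiB = 4096; // 4GiB RegionServer heap, as stated in the post

        // First two configs: memstore 0.40 + block cache 0.40
        System.out.println("0.40/0.40 leaves " + remainingMiB(heapMiB, 40, 40) + " MiB");

        // Third config: memstore 0.40 + block cache 0.10
        System.out.println("0.40/0.10 leaves " + remainingMiB(heapMiB, 40, 10) + " MiB");
    }
}
```

Under these assumptions the first two configurations leave roughly 820 MiB of the 4GiB heap unreserved, and the third leaves 2048 MiB. Both fractions are upper limits rather than memory that is always in use, so this is a worst-case view.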


When this runs I get the following error stacktrace:
#
#How do I avoid this via configuration?
#

java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
2014-05-08 16:23:54,705 WARN [IPC Client (1242056950) connection to
c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase]
org.apache.hadoop.ipc.RpcClient: IPC Client (1242056950) connection to
c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase: unexpected
exception receiving call responses
#

###Yes, there was an RPC timeout. This is what is killing the server,
because the timeout is eventually reached (1 minute later).

#

java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
	at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
2014-05-08 16:23:55,319 INFO [main]
org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered
from org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry
of OutOfOrderScannerNextException: was there a rpc timeout?
	at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:384)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:194)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

#

## Probably caused by the OOME above

#

Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException:
Expected nextCallSeq: 1 But the nextCallSeq got from client: 0;
request=scanner_id: 5612205039322936440 number_of_rows: 10000
close_scanner: false next_call_seq: 0
	at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3018)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26929)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
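
The request in the trace above asks for `number_of_rows: 10000` per scanner call, and the OutOfMemoryError is raised in `RpcClient$Connection.readResponse`, the code that reads a call response. The row count per call normally comes from the scanner caching setting (for example `Scan.setCaching(int)` or `hbase.client.scanner.caching`). The sketch below is a back-of-the-envelope estimate only: the average row size and the memory budget are made-up illustrative values, since neither appears in this message, and the class and method names are hypothetical.

```java
public class ScanBatchEstimate {

    /** Approximate payload of one scanner call: rows per call times average row size. */
    static long responseBytes(long rowsPerCall, long avgRowBytes) {
        return Math.multiplyExact(rowsPerCall, avgRowBytes);
    }

    /** Largest rows-per-call value whose estimated payload fits in the given budget. */
    static long maxRowsPerCall(long budgetBytes, long avgRowBytes) {
        if (avgRowBytes <= 0) {
            throw new IllegalArgumentException("avgRowBytes must be positive");
        }
        return budgetBytes / avgRowBytes;
    }

    public static void main(String[] args) {
        long rowsPerCall = 10_000;          // number_of_rows from the scan request in the trace
        long avgRowBytes = 100L * 1024;     // ASSUMPTION: 100 KiB per row, not from the post
        long budgetBytes = 64L * 1024 * 1024; // ASSUMPTION: 64 MiB allowed per response

        long estimate = responseBytes(rowsPerCall, avgRowBytes);
        System.out.println("Estimated bytes per scanner call: " + estimate);
        System.out.println("Rows per call that fit the budget: "
                + maxRowsPerCall(budgetBytes, avgRowBytes));
    }
}
```

With real measurements substituted for the assumed row size, the same arithmetic gives a rough idea of whether a single scanner response could account for a large share of the available heap.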
