hbase-user mailing list archives

From Geovanie Marquez <geovanie.marq...@gmail.com>
Subject Re: RPC Client OutOfMemoryError Java Heap Space
Date Tue, 13 May 2014 14:07:39 GMT
The following property does exactly what I wanted our environment to do. With
a 4GiB heap I ran the job and no jobs failed. Then I dropped our cluster heap
to 1GiB and reran the same resource-intensive task.

This property must be added to the "HBase Service Advanced Configuration
Snippet (Safety Valve) for hbase-site.xml"

<property>
<name>hbase.client.scanner.max.result.size</name>
<value>67108864</value>
</property>

We noted that 64MiB would be enough, but we also experimented with 128MiB. I
may do a write-up and elaborate some more on this.
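
As a rough illustration of why a size cap on scanner results can relieve the
client heap, here is a back-of-the-envelope sketch in plain Java. It takes the
64MiB value from the property above and the number_of_rows: 10000 from the
scan request in the quoted trace. The class name, the helper methods, and the
100KiB average row size are all assumptions made up for the example, not
measurements from this cluster and not HBase API. The model is also
simplified: as far as I understand it, the server stops adding rows once the
accumulated size passes the limit, so a real response can overshoot the cap by
roughly one row.

```java
/** Back-of-the-envelope sizing for one scanner next() response (illustrative only). */
final class ScanResponseSizing {
    /** Value of hbase.client.scanner.max.result.size from the post: 64 MiB in bytes. */
    static final long MAX_RESULT_SIZE_BYTES = 64L * 1024 * 1024; // 67108864

    /** Rows requested per next() call, from "number_of_rows: 10000" in the trace. */
    static final long ROWS_PER_CALL = 10_000L;

    /** Bytes one response could hold when only the row count limits it. */
    static long uncappedBytes(long rowsPerCall, long avgRowBytes) {
        return rowsPerCall * avgRowBytes;
    }

    /** Bytes one response holds when a size cap applies as well (simplified model). */
    static long cappedBytes(long rowsPerCall, long avgRowBytes, long maxResultSizeBytes) {
        return Math.min(uncappedBytes(rowsPerCall, avgRowBytes), maxResultSizeBytes);
    }

    public static void main(String[] args) {
        // ASSUMPTION: 100 KiB per row, chosen only to make the arithmetic visible.
        long avgRowBytes = 100L * 1024;
        System.out.println("uncapped: " + uncappedBytes(ROWS_PER_CALL, avgRowBytes) + " bytes");
        System.out.println("capped:   "
                + cappedBytes(ROWS_PER_CALL, avgRowBytes, MAX_RESULT_SIZE_BYTES) + " bytes");
    }
}
```

With the assumed row size, the uncapped figure is 1024000000 bytes (about
977MiB), close to an entire 1GiB heap for a single response, while the capped
figure is 67108864 bytes. Smaller rows shrink the gap, so the benefit depends
on the actual data.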


On Mon, May 12, 2014 at 1:38 PM, Vladimir Rodionov
<vrodionov@carrieriq.com> wrote:

> All your OOME are on the client side (map task). Your map tasks need more
> heap.
> Reduce # of map tasks and increase max heap size per map task.
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Geovanie Marquez [geovanie.marquez@gmail.com]
> Sent: Thursday, May 08, 2014 2:35 PM
> To: user@hbase.apache.org
> Subject: Re: RPC Client OutOfMemoryError Java Heap Space
>
> sorry didn't include version
>
> CDH5 version - CDH-5.0.0-1.cdh5.0.0.p0.47
>
>
> On Thu, May 8, 2014 at 5:32 PM, Geovanie Marquez
> <geovanie.marquez@gmail.com> wrote:
>
> > Hey group,
> >
> > There is one job that scans HBase contents and is really resource
> > intensive using all resources available to yarn (under Resource Manager).
> > In my case, that is 8GB. My expectation here is that a properly configured
> > cluster would kill the application or degrade the application performance
> > but never ever take a region server down. This is intended to be a
> > multi-tenant environment where developers may submit jobs at will and I
> > would want a configuration where the cluster services are not exited in
> > this way because of memory.
> >
> > The simple solution here is to change the way the job consumes resources
> > so that it is not so resource-greedy when run. I want to understand how I
> > can mitigate this situation in general.
> >
> > **It FAILS with the following config:**
> > The RPC client has 30 handlers
> > write buffer of 2MiB
> > The RegionServer heap is 4GiB
> > Max Size of all memstores is 0.40 of total heap
> > HFile Block Cache Size is 0.40
> > Low watermark for memstore flush is 0.38
> > HBase Memstore size is 128MiB
> >
> > **Job still FAILS with the following config:**
> > Everything else the same except
> > The RPC client has 10 handlers
> >
> > **Job still FAILS with the following config:**
> > Everything else the same except
> > HFile Block Cache Size is 0.10
> >
> >
> > When this runs I get the following error stacktrace:
> > #
> > # How do I avoid this via configuration?
> > #
> >
> > java.lang.OutOfMemoryError: Java heap space
> >       at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
> >       at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
> > 2014-05-08 16:23:54,705 WARN [IPC Client (1242056950) connection to c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase] org.apache.hadoop.ipc.RpcClient: IPC Client (1242056950) connection to c1d001.in.wellcentive.com/10.2.4.21:60020 from hbase: unexpected exception receiving call responses
> > #
> >
> > ### Yes, there was an RPC timeout. This is what is killing the server,
> > ### because the timeout is eventually (1 minute later) reached.
> >
> > #
> >
> > java.lang.OutOfMemoryError: Java heap space
> >       at org.apache.hadoop.hbase.ipc.RpcClient$Connection.readResponse(RpcClient.java:1100)
> >       at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:721)
> > 2014-05-08 16:23:55,319 INFO [main] org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl: recovered from org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
> >       at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:384)
> >       at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:194)
> >       at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
> >       at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
> >       at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
> >       at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
> >       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
> >       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
> >       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> >       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> >       at java.security.AccessController.doPrivileged(Native Method)
> >       at javax.security.auth.Subject.doAs(Subject.java:415)
> >       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> >       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> >
> > #
> >
> > ## Probably caused by the OOME above
> >
> > #
> >
> > Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 5612205039322936440 number_of_rows: 10000 close_scanner: false next_call_seq: 0
> >       at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3018)
> >       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26929)
> >       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
> >       at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)
> >
> >
>
