hbase-user mailing list archives

From "Jean-Daniel Cryans" <jdcry...@gmail.com>
Subject Re: Hbase regionserver heap space problem
Date Thu, 10 Jul 2008 10:53:24 GMT
Hi Marcus,

I don't know if it's related to your problem, but your machine setup seems
to imply that you have one RegionServer and three DataNodes on four
different machines. If that's really the case, I recommend that you instead
dedicate one machine to the NameNode and Master and run the three other
machines as combined DataNodes and RegionServers.
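
Concretely, here is a sketch assuming hypothetical hostnames master1 and
node1..node3, and assuming the usual conf/slaves and conf/regionservers
host lists apply to your versions:

    # hadoop/conf/slaves -- hosts running DataNodes
    node1
    node2
    node3

    # hbase/conf/regionservers -- hosts running RegionServers
    node1
    node2
    node3

Then start the NameNode and the HBase Master on master1. Substitute your
real hostnames, of course.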

J-D

On Thu, Jul 10, 2008 at 5:32 AM, Marcus Schlüter <marcus.schlueter@mac.com>
wrote:

> Hi everyone,
>
> We would like to use HBase and Hadoop, but when we tried to load real
> data into our test setup, we saw a lot of crashes and were unable to
> insert the amount of data we need into an HBase table.
> Our goal is to have about 100 million rows in one table, with each row
> holding about 100 bytes of raw data.
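> (That is roughly 100,000,000 rows x ~100 bytes = ~10GB of raw data in
> total.)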
> Our test setup consists of the following servers:
>
> 3 x HP DL385 with 4GB RAM, 2x 2.8GHz Opterons, and a Smart Array RAID5
> with a capacity of 400GB (all used as DataNodes, and one of them also as
> the NameNode).
> 1 x HP DL380 with 3GB RAM, 2x 3.4GHz dual-core Xeons, and a Smart Array
> RAID5 with a capacity of 320GB for HBase (Master and RegionServer).
>
> We used Hadoop 0.16.4 with a replication level of 2, and HBase 0.1.3.
> HBase is configured to use 2GB of heap space.
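>
> For reference, a 2GB heap would be set in conf/hbase-env.sh roughly like
> this (a sketch, assuming HBASE_HEAPSIZE is interpreted in megabytes):
>
>   export HBASE_HEAPSIZE=2000
>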
> The table was created with the following query:
>
> create table logdata (
>   logtype        MAX_VERSIONS=1 COMPRESSION=BLOCK,
>   banner_id      MAX_VERSIONS=1,
>   contentunit_id MAX_VERSIONS=1,
>   campaign_id    MAX_VERSIONS=1,
>   network        MAX_VERSIONS=1,
>   geodata        MAX_VERSIONS=1 COMPRESSION=BLOCK,
>   client_data    MAX_VERSIONS=1 COMPRESSION=BLOCK,
>   profile_data   MAX_VERSIONS=1 COMPRESSION=BLOCK,
>   keyword        MAX_VERSIONS=1 COMPRESSION=BLOCK,
>   tstamp         MAX_VERSIONS=1,
>   time           MAX_VERSIONS=1
> );
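>
> For illustration, a minimal sketch of inserting such rows with this
> version's HTable client API (startUpdate/put/commit); the row key format
> and cell values below are hypothetical placeholders:
>
>   import java.io.IOException;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.HTable;
>   import org.apache.hadoop.io.Text;
>
>   public class LogDataLoader {
>     public static void main(String[] args) throws IOException {
>       // connect to the "logdata" table created above
>       HTable table = new HTable(new HBaseConfiguration(), new Text("logdata"));
>       for (long i = 0; i < 100000000L; i++) {
>         // hypothetical row key; real keys would come from the log source
>         long lockid = table.startUpdate(new Text("row-" + i));
>         table.put(lockid, new Text("logtype:"), "click".getBytes());
>         table.put(lockid, new Text("tstamp:"), Long.toString(i).getBytes());
>         // ... puts for the remaining column families ...
>         table.commit(lockid);
>       }
>     }
>   }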
>
>
> The problem is that the RegionServer runs out of heap space and throws
> the following exception after inserting a few million rows (not always
> the same number of rows, ranging from 3 to about 10 million):
>
> Exception in thread "org.apache.hadoop.dfs.DFSClient$LeaseChecker@69e328e0" java.lang.OutOfMemoryError: Java heap space
>        at java.io.DataInputStream.<init>(DataInputStream.java:42)
>        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:186)
>        at org.apache.hadoop.ipc.Client.getConnection(Client.java:578)
>        at org.apache.hadoop.ipc.Client.call(Client.java:501)
>        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:198)
>        at org.apache.hadoop.dfs.$Proxy1.renewLease(Unknown Source)
>        at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>        at org.apache.hadoop.dfs.$Proxy1.renewLease(Unknown Source)
>        at org.apache.hadoop.dfs.DFSClient$LeaseChecker.run(DFSClient.java:596)
>        at java.lang.Thread.run(Thread.java:619)
> Exception in thread "ResponseProcessor for block blk_7988192980299756280" java.lang.OutOfMemoryError: Java heap space
> Exception in thread "IPC Server Responder" Exception in thread
> "org.apache.hadoop.io.ObjectWritable Connection Culler" Exception in thread
> "IPC Client connection to /192.168.1.117:54310"
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
> java.lang.OutOfMemoryError: Java heap space
>
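> If it helps with diagnosis, a heap dump on OOM could be captured via the
> standard HotSpot flag -- a sketch, assuming the start scripts honor
> HBASE_OPTS in conf/hbase-env.sh:
>
>   export HBASE_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"
>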
> Any ideas why we always see these crashes, and whether HBase should be
> able to handle this amount of data with the setup we use?
>
> On a side note, we also observe that HBase seems to have a large storage
> overhead. When we insert about 1GB of raw data into HBase, it uses about
> 8GB of HDFS space (taking the replication into account).
> Is this large overhead expected?
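> To put rough numbers on it:
>
>   8GB HDFS usage / replication 2 = ~4GB on disk
>   4GB on disk / 1GB raw data     = ~4x expansion
>
> Our understanding (please correct us) is that HBase stores the full row
> key, column name, and timestamp alongside every cell value, so with
> eleven column families per row and only ~100 bytes of values per row,
> that per-cell overhead could plausibly account for much of the 4x.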
>
> /Marcus
>
>
