hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Put errors via thrift
Date Tue, 15 Feb 2011 17:56:05 GMT
Compactions are done in the background; they won't block writes.

Regarding the split time, it could be that the client had to retry enough
times that the write timed out, but I can't say for sure without the logs.

Have you considered using the bulk loader? I personally would never
try to insert a few billion rows via Thrift in a streaming job, sounds
like a recipe for trouble ;)
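
To give you an idea, here's roughly what that path looks like against the
0.90 client API (untested sketch; the table name "mytable", family "d" and
the TSV parsing are just placeholders) - a MapReduce job writes HFiles with
HFileOutputFormat, then LoadIncrementalHFiles moves them into the regions:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class BulkLoadDriver {

  // Turns "rowkey<TAB>value" lines into Puts (placeholder family "d",
  // qualifier "v"); adapt to whatever your streaming job emits today.
  public static class TsvToPutMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    protected void map(LongWritable offset, Text line, Context ctx)
        throws IOException, InterruptedException {
      String[] fields = line.toString().split("\t", 2);
      byte[] row = Bytes.toBytes(fields[0]);
      Put put = new Put(row);
      put.add(Bytes.toBytes("d"), Bytes.toBytes("v"), Bytes.toBytes(fields[1]));
      ctx.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "prepare-hfiles");
    job.setJarByClass(BulkLoadDriver.class);
    job.setMapperClass(TsvToPutMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);

    // The table must already exist; this wires up the TotalOrderPartitioner
    // and one reducer per region so the HFiles line up with region boundaries.
    HTable table = new HTable(conf, "mytable");
    HFileOutputFormat.configureIncrementalLoad(job, table);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    HFileOutputFormat.setOutputPath(job, new Path(args[1]));

    if (job.waitForCompletion(true)) {
      // Moves the generated HFiles into the serving regions.
      new LoadIncrementalHFiles(conf).doBulkLoad(new Path(args[1]), table);
    }
  }
}

If your input is already TSV you may not even need your own job; the
importtsv and completebulkload tools that ship in the HBase jar cover the
common case.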

At the very least, you should consider pre-splitting your table so that
you don't have to wait on splits; splitting on the fly only makes sense
when the data is growing slowly, not during an import. See this API call:
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HBaseAdmin.html#createTable(org.apache.hadoop.hbase.HTableDescriptor, byte[][])
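
For example (made-up table name, family and split keys; pick boundaries
that match your own key distribution):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("mytable");
    desc.addFamily(new HColumnDescriptor("d"));

    // 9 split keys -> 10 regions, assuming row keys start with a digit
    // that is spread evenly over 0-9.
    byte[][] splits = new byte[][] {
        Bytes.toBytes("1"), Bytes.toBytes("2"), Bytes.toBytes("3"),
        Bytes.toBytes("4"), Bytes.toBytes("5"), Bytes.toBytes("6"),
        Bytes.toBytes("7"), Bytes.toBytes("8"), Bytes.toBytes("9")
    };

    admin.createTable(desc, splits);
  }
}

There's also a createTable(desc, startKey, endKey, numRegions) overload on
HBaseAdmin if your keys are spread evenly over a byte range and you just
want N regions.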

J-D

On Tue, Feb 15, 2011 at 9:01 AM, Chris Tarnas <cft@email.com> wrote:
> I have a long-running Hadoop streaming job that also puts about a billion sub-1KB rows
> into HBase via Thrift, and last night I got quite a few errors like this one:
>
> Still had 34 puts left after retrying 10 times.
>
> Could that be caused by one or more long-running compactions and a split? I'm using GZ
> (license problems preclude LZO for the time being), and compactions and a split were pretty
> much all I saw in the logs. I'm sure the long-running compactions were a result of raising
> hbase.hstore.blockingStoreFiles to 20 and hbase.hregion.memstore.block.multiplier to 24 -
> that worked well to circumvent HBASE-3483 and other pauses for the smaller ~50M-row inserts
> we had been doing.
>
> This is on 10 datanodes, each with 12 processors, 48GB RAM and 12 2TB drives. 3 other
> nodes host the masters and the ZooKeeper quorum.
>
> thanks,
> -chris
