hbase-user mailing list archives

From Bryan Keller <brya...@gmail.com>
Subject Long client pauses with compression
Date Sun, 13 Mar 2011 08:14:55 GMT
I am using the Java client API to write 10,000 rows with about 6,000 columns each, via 8 threads
making multiple calls to the HTable.put(List<Put>) method. I start with an empty table
with one column family and no regions pre-created.
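
For readers who want to reproduce the measurement, the write pattern above can be approximated with the following sketch. It is not the original test code. The class PutPauseProbe and the BatchSink interface are hypothetical names, the batch size of 100 is an assumption, and the sink is a stand-in for the point where a real client would call HTable.put(List<Put>). It uses only the JDK (Java 8 or later) and records the longest single batch call, which is where a region split or other server-side stall would show up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class PutPauseProbe {
    /** Hypothetical stand-in for HTable.put(List<Put>); rows are plain strings here. */
    interface BatchSink {
        void putBatch(List<String> rowKeys) throws Exception;
    }

    /**
     * Splits totalRows across `threads` writers, sends them in batches, and
     * returns { rows written, longest single batch call in milliseconds }.
     */
    static long[] run(int threads, int totalRows, int batchSize, BatchSink sink)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong rows = new AtomicLong();
        AtomicLong maxNanos = new AtomicLong();
        for (int t = 0; t < threads; t++) {
            final int offset = t;
            pool.submit(() -> {
                List<String> batch = new ArrayList<>();
                // Thread `offset` handles rows offset, offset + threads, ...
                for (int row = offset; row < totalRows; row += threads) {
                    batch.add("row-" + row);
                    boolean last = row + threads >= totalRows;
                    if (batch.size() == batchSize || last) {
                        long start = System.nanoTime();
                        try {
                            sink.putBatch(batch); // a real client would call HTable.put here
                        } catch (Exception e) {
                            // Ends this writer early; its remaining rows are not counted.
                            throw new RuntimeException(e);
                        }
                        maxNanos.accumulateAndGet(System.nanoTime() - start, Math::max);
                        rows.addAndGet(batch.size());
                        batch = new ArrayList<>();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return new long[] { rows.get(), TimeUnit.NANOSECONDS.toMillis(maxNanos.get()) };
    }

    public static void main(String[] args) throws Exception {
        // No-op sink: replace with a real write to measure actual pauses.
        long[] result = run(8, 10_000, 100, batch -> { });
        System.out.println("rows written: " + result[0]);
        System.out.println("longest batch call (ms): " + result[1]);
    }
}
```

Logging a timestamp whenever a batch call exceeds some threshold, rather than only tracking the maximum, would make it easier to line the pauses up with region server logs.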

With compression turned off, I am seeing very stable performance. At the start there are a
couple of 10-20 second pauses where all insert threads are blocked during a region split. Subsequent
splits do not cause all of the threads to block, presumably because there are more regions,
so no single region split blocks all inserts. GC in HBase during the insert is not a major
problem (6k/55sec).

When using either LZO or gzip compression, however, I am seeing frequent and long pauses,
sometimes around 20 sec but often over 80 seconds in my test. During these pauses all 8 of
the threads writing to HBase are blocked. The pauses happen throughout the insert process.
GC activity in HBase is higher when using compression (60k, 4min), but it doesn't seem enough to
explain these pauses. Overall performance obviously suffers dramatically as a result (about
2x slower).

I have tested this in different configurations (single node, 4 nodes) with the same result.
I'm using HBase 0.90.1 (CDH3B4), Sun/Oracle Java 1.6.0_24, CentOS 5.5, Hadoop LZO 0.4.10 from
Cloudera. Machines have 12 cores and 24 GB of RAM. Settings are pretty much default, nothing
out of the ordinary. I tried playing around with region handler count and memstore settings,
but these had no effect.
