hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: very slow scan performance on just one region
Date Wed, 21 Dec 2011 17:33:05 GMT
On Wed, Dec 21, 2011 at 8:56 AM, Daniel Iancu <daniel.iancu@1and1.ro> wrote:
> Hi there
> I'm investigating a problem we have with a MR job and I discovered that the
> tasks that fail (scan lease expired while fetching next row) were processing
> one particular region.
> I've written a small app that scans that region and counts its rows, and ran
> it on the same machine where the region is hosted. The result is very poor:
> scan speed is on average 7 rows/sec, and sometimes when scan caching is
> increased it gets a lease expired exception. By contrast, scanning the other
> regions from the same table on the same machine with the same caching value
> gets ~3800 rows/sec. Any idea what can cause such disastrous scan performance
> on a particular region?

If you move the region to another host, do you see the same perf? (Perhaps
some hardware issue.)
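
A rough back-of-the-envelope check on why raising scan caching could tip the
slow region into lease expiry, using the numbers from the report (7 rows/sec,
~3800 rows/sec, 4-minute lease). This is only a sketch: it assumes one next()
call has to return a full batch of `caching` rows within a single lease
period, which simplifies how leases actually work, and the function names are
made up for illustration.

```python
# Sketch only: assumes a full batch of `caching` rows must come back
# within one lease period. Function names are hypothetical.

LEASE_SECONDS = 4 * 60        # lease timeout from the report
SLOW_ROWS_PER_SEC = 7         # the bad region
FAST_ROWS_PER_SEC = 3800      # the other regions


def batch_seconds(caching, rows_per_sec):
    """Time to fill one batch of `caching` rows at the given scan rate."""
    return caching / rows_per_sec


def max_safe_caching(rows_per_sec, lease_seconds):
    """Largest batch that still fits inside one lease period."""
    return int(rows_per_sec * lease_seconds)


if __name__ == "__main__":
    for caching in (100, 1000, 5000):
        slow = batch_seconds(caching, SLOW_ROWS_PER_SEC)
        fast = batch_seconds(caching, FAST_ROWS_PER_SEC)
        verdict = "exceeds lease" if slow > LEASE_SECONDS else "fits in lease"
        print("caching=%d slow=%.1fs fast=%.2fs -> slow region %s"
              % (caching, slow, fast, verdict))
    print("max caching for slow region:",
          max_safe_caching(SLOW_ROWS_PER_SEC, LEASE_SECONDS))
```

Under that assumption the slow region runs out of lease time somewhere below
2000 rows of caching, while the healthy regions have plenty of headroom.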

Otherwise, if you look at the data under that region, what do you see?
First do a listing of the hdfs content.  Next try looking at the
actual key values with the hfile main tool: Poke down in here

> Some extra info
> hbase is 0.90.4
> lease timeout is 4 minutes
> table has 1 family, cell values are empty, row keys and qualifiers are small
> strings, biggest row has 146 columns
> row sizes are almost identical since table was created by a load tool and
> each row has almost the same number of columns with same kind of values...
> all regions have 1 store file of ~655MB
> cluster has no activity except the test app
> GC activity looks normal
> regions might have many deleted KV (we were testing data cleanup with MR
> jobs)

Looksee first w/ hfile tool.

If a major compaction 'fixes' it, then the scan could be having to pass
over lots of delete markers before it gets to live rows.
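
To illustrate the delete-marker theory, here is a toy model, not HBase's
actual scanner or file format. It assumes a store is just a sorted list of
entries, that each deleted cell leaves behind both its put and a delete
marker, and that a major compaction drops both. All names (KV, build_store,
scan, major_compact) are invented for the sketch; real delete handling also
involves timestamps, versions and family deletes.

```python
# Toy model only: not HBase code. Shows how a scan can examine many
# entries per returned row when a store is full of delete markers.
from collections import namedtuple

KV = namedtuple("KV", "row qualifier kind")  # kind is "put" or "delete"


def build_store(live_rows, deleted_rows, columns):
    """Build a sorted toy store; deleted cells keep a marker plus the put."""
    store = []
    for r in range(deleted_rows):
        for c in range(columns):
            store.append(KV("d%06d" % r, "c%03d" % c, "delete"))
            store.append(KV("d%06d" % r, "c%03d" % c, "put"))
    for r in range(live_rows):
        for c in range(columns):
            store.append(KV("r%06d" % r, "c%03d" % c, "put"))
    return store


def scan(store):
    """Walk every entry; return (rows_returned, entries_examined)."""
    masked = set()
    rows = set()
    examined = 0
    for kv in store:
        examined += 1
        if kv.kind == "delete":
            masked.add((kv.row, kv.qualifier))
        elif (kv.row, kv.qualifier) not in masked:
            rows.add(kv.row)
    return len(rows), examined


def major_compact(store):
    """Drop delete markers and the puts they mask."""
    masked = {(kv.row, kv.qualifier) for kv in store if kv.kind == "delete"}
    return [kv for kv in store
            if kv.kind == "put" and (kv.row, kv.qualifier) not in masked]


if __name__ == "__main__":
    store = build_store(live_rows=10, deleted_rows=500, columns=146)
    print("before compaction (rows, examined):", scan(store))
    print("after compaction  (rows, examined):", scan(major_compact(store)))
```

In this model both scans return the same rows, but the uncompacted one
examines far more entries to get there, which is the kind of gap a major
compaction would close if deletes are the cause.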

