hbase-user mailing list archives

From Lars George <lars.geo...@gmail.com>
Subject Re: map task performance degradation - any idea why?
Date Fri, 19 Nov 2010 14:21:12 GMT
Hi Henning,

Could you look at the Master UI while doing the import? The issue with
a cold bulk import is that you initially hit a single region server,
and while it is filling up its in-memory structures all is nice and
dandy. Then you start to tax the server as it has to flush data out,
and it becomes slower at responding to the mappers still hammering it.
Only after a while do the regions become large enough to split, at
which point load starts to spread across 2 machines, then 3. Eventually
you have enough regions to handle your data, and you will see the
average performance you could expect from a loaded cluster. For that
reason we have added a bulk loading feature that helps build the region
files externally and then swap them in.

When you check the UI you can actually see this behavior, as the
operations per second (ops) are bound to one server initially. Well,
it could be two, as one of them also has to serve META. If that is the
same machine, then you are penalized twice.

In addition, you start to run into minor compaction load while HBase
tries to do its housekeeping during your load.

With 0.89 you could pre-split the table into the regions you would see
eventually once your job is complete. Please use the UI to check, and
let us know how many regions you end up with in total (out of interest,
mainly). If you pre-split before the import, the load is spread right
from the start.
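To illustrate the pre-splitting idea, here is a minimal sketch, not
from the original thread: it assumes row keys are zero-padded decimal
strings covering a 0 .. 5*10^8-1 key space (as in Henning's job), and
computes evenly spaced split keys. The class and method names are
hypothetical; the resulting keys could then be handed to the client
API's createTable variant that accepts split keys in recent
0.89/0.90 builds.

```java
// Sketch (assumption: zero-padded decimal row keys). Computes evenly
// spaced split keys so the table starts with numRegions regions
// instead of one.
public class SplitKeys {
    static String[] splitKeys(long maxKey, int numRegions, int padWidth) {
        String[] splits = new String[numRegions - 1];
        long step = maxKey / numRegions;
        for (int i = 1; i < numRegions; i++) {
            // Boundary key between region i-1 and region i.
            splits[i - 1] = String.format("%0" + padWidth + "d", step * i);
        }
        return splits;
    }

    public static void main(String[] args) {
        // 10 regions over 5*10^8 keys, 9-digit zero padding.
        for (String key : splitKeys(500_000_000L, 10, 9)) {
            System.out.println(key);
        }
    }
}
```

With the split keys spread evenly over the key space, every mapper
writes to a different region from the first row onward, instead of all
of them queueing up on a single region server.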

In general, 0.89 is much better performance-wise when it comes to bulk
loads, so you may want to try it out as well. The 0.90 RC is up, so a
release is imminent, and it saves you from having to upgrade again
soon. Also, 0.90 is the first release with Hadoop's append fix, so you
do not lose any data from wonky server behavior.

And to wrap this up, 3 data nodes is not a great setup. If you ask
anyone with a serious production-type setup, you will see 10 machines
and more; I'd say 20-30 machines and up. Some would say "use MySQL for
this little data", but that is not fair given that we do not know what
your targets are. Bottom line is, you will see issues (like slowness)
with 3 nodes that 8 or 10 nodes will never show.


On Fri, Nov 19, 2010 at 2:09 PM, Henning Blohm <henning.blohm@zfabrik.de> wrote:
> We have a Hadoop 0.20.2 + HBase 0.20.6 setup with three data nodes
> (12 GB, 1.5 TB each) and one master node (24 GB, 1.5 TB). We store a
> relatively simple table in HBase (1 column family, 5 columns, row key
> about 100 chars).
> In order to better understand the load behavior, I wanted to put 5*10^8
> rows into that table. I wrote an M/R job that uses a split input format
> to split the 5*10^8 logical row keys (essentially just counting from 0
> to 5*10^8-1) into 1000 chunks of 500,000 keys, and then let the map do
> the actual job of writing the corresponding rows (with some random
> column values) into HBase.
> So there are 1000 map tasks and no reducer. Each task writes 500,000
> rows into HBase. We have 6 mapper slots, i.e. 24 mappers running in
> parallel.
> The whole job runs for approx. 48 hours. Initially the map tasks need
> around 30 min. each. After a while things take longer and longer,
> eventually reaching > 2 h. It tops out around the 850th task, after
> which things speed up again, improving to about 48 min. at the end,
> until completed.
> These are all dedicated machines and there is nothing else running.
> The map tasks have 200 MB of heap, and when checking with vmstat in
> between I cannot observe any swapping.
> Also, on the master it seems that heap utilization is not at the
> limit, and there is no swapping either. All Hadoop and HBase
> processes have 1 GB of heap.
> Any idea what would cause the strong variation (or degradation) of write
> performance?
> Is there a way of finding out where time gets lost?
> Thanks,
>  Henning
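For reference, the input-split scheme Henning describes (5*10^8
logical keys divided into 1000 chunks of 500,000 keys) can be sketched
as follows; this is an editorial illustration with hypothetical names,
not code from the thread.

```java
// Sketch of the chunking arithmetic: totalKeys divided into numChunks
// half-open ranges [start, end), one per map task.
public class KeyChunks {
    static long[][] chunks(long totalKeys, int numChunks) {
        long chunkSize = totalKeys / numChunks;  // 500,000 in this job
        long[][] ranges = new long[numChunks][2];
        for (int i = 0; i < numChunks; i++) {
            ranges[i][0] = i * chunkSize;        // first key of the chunk
            ranges[i][1] = (i + 1) * chunkSize;  // one past the last key
        }
        return ranges;
    }
}
```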
