hbase-user mailing list archives

From Erik Holstad <erikhols...@gmail.com>
Subject Re: Map Reduce performance
Date Wed, 24 Jun 2009 16:06:00 GMT
Hi Ramesh!
I have to agree with Tim about the size of your cluster. I'm honestly a little
bit surprised that you are actually seeing
that using MR on a single node is faster, since you only get the downsides
from it (setup and so on), but none of
the good stuff.
I looked at the code and it looks good. It's not really doing too much in the Job,
but it doesn't look like you are doing
anything wrong. I do have some things you can think about, though, when you
get a bigger cluster up and running.
1. You might want to stay away from creating Text objects. We are internally
trying to move away from all usage of Text in HBase and just use
ImmutableBytesWritable or something like that.
2. Getting an HTable is expensive, so you might want to create a pool of
those connections that you can share, so you don't have to get a new one for
every task. I'm not 100% sure about the configure call, but I think it gives you
one per call, so it might be worth looking into.
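
To make point 2 a bit more concrete, here is a minimal sketch of the pooling
idea using only the Java standard library. HandlePool, borrow and giveBack are
made-up names for illustration, and a plain Object stands in for the HTable, so
none of this is HBase API. It only shows the pattern: create the expensive
handles once, then borrow and return them for each unit of work.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool: handles are created once up front and reused. */
public class HandlePool<T> {
    private final BlockingQueue<T> idle;

    public HandlePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get()); // the expensive creation happens only here
        }
    }

    /** Blocks until a handle is free, so a handle is never shared by two callers. */
    public T borrow() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a handle", e);
        }
    }

    /** Returns a borrowed handle to the pool so that later work can reuse it. */
    public void giveBack(T handle) {
        idle.offer(handle);
    }

    public static void main(String[] args) {
        AtomicInteger created = new AtomicInteger();
        // Assumption: new Object() stands in for opening a table handle.
        HandlePool<Object> pool = new HandlePool<>(2, () -> {
            created.incrementAndGet();
            return new Object();
        });
        for (int task = 0; task < 100; task++) {
            Object table = pool.borrow();
            try {
                // ... use the handle for this task's reads and writes ...
            } finally {
                pool.giveBack(table);
            }
        }
        System.out.println("handles created: " + created.get()); // prints "handles created: 2"
    }
}
```

A pool like this only helps when several units of work run in the same JVM.
Handing each handle to one borrower at a time also matters because, as far as I
know, an HTable instance should not be used from several threads at once.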

