hbase-user mailing list archives

From Marcin Cylke <mcl.hb...@touk.pl>
Subject Re: performance of Get from MR Job
Date Wed, 20 Jun 2012 06:35:32 GMT
On 19/06/12 19:31, Jean-Daniel Cryans wrote:
> This is a common but hard problem. I do not have a good answer.

Thanks for your writeup. You've given a few suggestions that I will
certainly follow.

But what bothers me is my use of timestamps. As mentioned before,
my column family allows 2147483646 versions. I store data there
using those timestamps: a few rows with the same key but different
timestamps. When I prepare GETs with TimeRange {0, timestamp},
performance is poor (~130/sec). But using a narrower range such as
{timestamp-10000, timestamp} gives a large speed improvement (~400/sec).

Although {timestamp-10000, timestamp} is unrealistic in my
situation, the difference seems strange, and it suggests the problem is
related in some way to the use of timestamps.

Would you recommend trying composite keys, built from timestamp + my
current key? Or shouldn't this change much?
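
To make the composite-key idea concrete, here is a minimal sketch of one
possible layout. It is only an illustration under stated assumptions: it
puts the current key first and appends a reversed timestamp (rather than
the timestamp-first order mentioned above), it assumes non-negative
timestamps and fixed-length original keys, and the class and method names
are hypothetical. It uses plain java.nio, not any HBase API.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration; not part of HBase.
final class CompositeKeySketch {

    // Builds <originalKey><Long.MAX_VALUE - timestamp> with the long in
    // big-endian byte order. Assumes timestamp >= 0 and that all original
    // keys have the same length, so the suffix always starts at the same offset.
    static byte[] compositeKey(byte[] originalKey, long timestamp) {
        long reversed = Long.MAX_VALUE - timestamp;
        return ByteBuffer.allocate(originalKey.length + Long.BYTES)
                .put(originalKey)
                .putLong(reversed)
                .array();
    }

    // Lexicographic comparison of unsigned bytes, which is the order in
    // which HBase sorts row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }
}
```

With this layout, newer entries for the same original key sort first, so
in principle a short scan starting at compositeKey(key, t) would meet the
newest entry at or before t as its first row.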


> Finally kind of like Paul said, if you can emit your rows and somehow
> batch them reducer-side in order to either do short scans or multi-get
> (see HTable.get(List<Get>)) it could be faster.

I'll try this solution, but I'm not that optimistic about it. I'll let
you know whether it helps.
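
For reference, the reducer-side batching could be sketched roughly as
below. This is only an assumption about how the grouping might be done:
the class and method names are hypothetical, and the sketch deliberately
stays generic and uses only java.util. Each returned batch would then be
handed to a single multi-get call such as the HTable.get(List<Get>)
mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of HBase.
final class BatchSketch {

    // Splits items into consecutive batches of at most batchSize elements,
    // preserving the original order. The last batch may be smaller.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the sub-list so each batch is independent of the input list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

The batch size is a tuning knob: larger batches mean fewer round trips
but more memory held per call, so it would have to be measured rather
than guessed.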

Regards
Marcin

