hbase-user mailing list archives

From Adam Silberstein <silbe...@yahoo-inc.com>
Subject Re: M/R on bulk imported tables
Date Tue, 25 May 2010 06:17:12 GMT
Hi Todd,
It's 0.20.3, and yes on multiple reducers.  We can try moving up to 0.20.4
(and also a quick test with a single reducer).


On 5/24/10 11:04 PM, "Todd Lipcon" <todd@cloudera.com> wrote:

> Hi Adam,
> What version are you running, and are you using multiple reducers in your
> HFileOutputFormat job? There was a bug in 0.20.3 which caused this case to
> produce somewhat broken tables.
> -Todd
> On Mon, May 24, 2010 at 10:55 PM, Adam Silberstein
> <silberst@yahoo-inc.com> wrote:
>> Hi,
>> A colleague and I are working on testing a few HBase features, notably bulk
>> import (mentioned in
>> http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/mapreduce/package-summary.html)
>> and running M/R jobs using HBase as input.
>> We're taking the following steps:
>> 1a. Load HBase with a M/R job using the normal API.
>> OR
>> 1b. Load HBase with bulk import.
>> 2a. Using the shell, do a "count" over the table.
>> OR
>> 2b. Run a M/R job that scans the whole HBase table (and nothing else).
>> Of the 4 combos, 3 are fine: 1a+2a, 1a+2b, 1b+2a.  We're having trouble with
>> 1b+2b.  When we run the M/R job, it doesn't seem to read in any records, but
>> there are no explicit errors in either the Hadoop or HBase logs.
>> This seems odd.  It shouldn't matter how we load the table, and the shell's
>> count operator seems to work correctly either way, counting all the
>> records.
>> The M/R job in 2b is the same no matter how we load the table.  Any ideas on
>> what might be wrong with the bulk import to cause this problem?  We're
>> thinking maybe something with the region boundaries, although they look ok
>> in the GUI.
>> Thanks for any suggestions,
>> Adam
