hbase-user mailing list archives

From Adam Silberstein <silbe...@yahoo-inc.com>
Subject M/R on bulk imported tables
Date Tue, 25 May 2010 05:55:02 GMT
A colleague and I are working on testing a few HBase features, notably bulk
import (mentioned in
educe/package-summary.html) and running M/R jobs using HBase as input.

We're taking the following steps:
1a. Load HBase with a M/R job using the normal API.
1b. Load HBase with bulk import.


2a. Using the shell, do a "count" over the table.
2b. Run a M/R job that scans the whole HBase table (and nothing else).

Of the 4 combos, 3 are fine: 1a+2a, 1a+2b, 1b+2a.  We're having trouble with
1b+2b.  When we run the M/R job, it doesn't seem to read in any records, but
there are no explicit errors in either the Hadoop or HBase logs.

This seems odd.  It shouldn't matter how we load the table, and the shell's
count operator seems to work correctly either way, counting all the records.
The M/R job in 2b is the same no matter how we load the table.  Any ideas on
what might be wrong with the bulk import to cause this problem?  We're
thinking maybe something with the region boundaries, although they look OK
in the GUI.
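
For reference, here is a rough sketch of the kind of sanity check we have in
mind for the region boundaries.  It is plain Python and does not talk to HBase:
it assumes the (start key, end key) pairs have been copied out by hand (for
example from the GUI), and check_region_boundaries is a hypothetical helper
name, not an HBase API.  It only looks for gaps, overlaps, and missing
open-ended first/last regions, so it may well not catch whatever is actually
going wrong.

```python
def check_region_boundaries(regions):
    """Return a list of problems found in (start_key, end_key) pairs.

    Keys are bytes. An empty start key means "beginning of table" and an
    empty end key means "end of table" (assumed to mirror HBase's convention).
    """
    if not regions:
        return ["no regions"]

    problems = []
    # Sort by start key; the empty key sorts first.
    ordered = sorted(regions, key=lambda r: r[0])

    if ordered[0][0] != b"":
        problems.append("first region does not start at the empty key")
    if ordered[-1][1] != b"":
        problems.append("last region does not end at the empty key")

    # Each region's end key should equal the next region's start key.
    for (start, end), (next_start, _) in zip(ordered, ordered[1:]):
        if end == b"" or end > next_start:
            problems.append(f"overlap between {start!r} and {next_start!r}")
        elif end < next_start:
            problems.append(f"gap between {end!r} and {next_start!r}")

    return problems


if __name__ == "__main__":
    # Hypothetical boundaries, typed in by hand for illustration.
    print(check_region_boundaries([(b"", b"g"), (b"m", b"")]))
```

An empty result would only tell us the boundaries tile the key space; it says
nothing about whether the M/R input splits are being computed from them
correctly.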

Thanks for any suggestions,
