hbase-issues mailing list archives

From "Azza Abouzeid (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-2615) M/R on bulk imported tables
Date Wed, 26 May 2010 16:50:40 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Azza Abouzeid updated HBASE-2615:

    Attachment: dummydata.tar.gz

Dummy HDFS data (properly formatted HFile) of 1000 records.

> M/R on bulk imported tables
> ---------------------------
>                 Key: HBASE-2615
>                 URL: https://issues.apache.org/jira/browse/HBASE-2615
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.3, 0.20.4
>         Environment: os.arch=amd64; os.version=2.6.9-67.ELsmp; java.version=1.6.0_15; java.vendor=Sun Microsystems Inc.
>            Reporter: Azza Abouzeid
>         Attachments: dummydata.tar.gz
> We are bulk importing using loadtable.rb and running M/R jobs using HBase as input.
> We're taking the following steps:
> 1a. Load HBase with a M/R job using the normal API. 
> OR
> 1b. Load HBase with bulk import.
> 2a. Using the shell, do a "count" over the table.
> OR
> 2b. Run a M/R job that scans the whole HBase table (and nothing else).
> Of the 4 combos, 3 are fine: 1a+2a, 1a+2b, 1b+2a. We're having trouble with 1b+2b. When we run the M/R job, it doesn't seem to read in any records, but there are no explicit errors in either the Hadoop or HBase logs.
> Any ideas on what might be wrong with the bulk import to cause this problem? We confirmed this problem exists in both hbase-0.20.3 and hbase-0.20.4.
> We have created dummy data for you to test the issue. This is the test case:
> After loading the data into HDFS, in the hbase shell:
> create 'tiny', 'values'
> Execute: 
> {HBASE-HOME}/bin/hbase org.jruby.Main {HBASE-HOME}/bin/loadtable.rb tiny tinytable
> Then run the simple row counter
> {HADOOP-HOME}/bin/hadoop jar {HBASE-HOME}/hbase-0.20.x.jar rowcounter tiny values
> Notice that the number of map input records read is always zero. We confirmed that other MapReduce jobs do not execute the map function at all and always return 0 records.
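
For anyone scripting the test case above, here is a minimal dry-run sketch that only assembles the two command lines from the report and prints them; it does not run anything. The function names, the install paths /opt/hbase and /opt/hadoop, and the parameter names are hypothetical. The jar name hbase-0.20.x.jar is kept as the placeholder used in the report and has to be replaced with the actual jar. The sketch also assumes that the second loadtable.rb argument ("tinytable") is the HDFS directory holding the attached HFile data.

```python
import shlex


def repro_commands(hbase_home, hadoop_home, hbase_jar="hbase-0.20.x.jar",
                   table="tiny", hfile_dir="tinytable", family="values"):
    """Return the two commands from the report as argv lists (dry run only).

    All parameter names are illustrative; the defaults mirror the report.
    """
    # Step 1b: bulk import the HFile directory into the existing table.
    bulk_load = [
        f"{hbase_home}/bin/hbase", "org.jruby.Main",
        f"{hbase_home}/bin/loadtable.rb", table, hfile_dir,
    ]
    # Step 2b: M/R job that scans the whole table (the bundled row counter).
    row_count = [
        f"{hadoop_home}/bin/hadoop", "jar",
        f"{hbase_home}/{hbase_jar}", "rowcounter", table, family,
    ]
    return [bulk_load, row_count]


def render(commands):
    """Join each argv list into one shell-quoted command line."""
    return [" ".join(shlex.quote(arg) for arg in cmd) for cmd in commands]


# Hypothetical install locations, for illustration only.
lines = render(repro_commands("/opt/hbase", "/opt/hadoop"))
for line in lines:
    print(line)
```

The table itself still has to be created first in the hbase shell (create 'tiny', 'values'), as described in the test case.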

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
