hadoop-common-dev mailing list archives

From "Vuk Ercegovac (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1519) mapreduce input and output formats to go against hbase
Date Thu, 28 Jun 2007 02:12:26 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508710 ]

Vuk Ercegovac commented on HADOOP-1519:
---------------------------------------

Thanks for the feedback and apologies for the initial tgz file. Let me know if the patch works
for you.

I included a sample driver, org.apache.hadoop.mapred.TableJobExample, that scans an input
table's columns and writes to an output table. The input and output tables, along with the
columns to scan, are user-specified. Filtering can be done by extending TableMap (see the
sketch below). Specifying a row range and versions are both good suggestions.
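
To make the filtering idea concrete, here is a rough sketch. Since the patch itself is not
reproduced in this mail, the sketch is written against the stock org.apache.hadoop.mapred.Mapper
interface rather than TableMap, and the Text row key, MapWritable column map, and the
"info:status" column are only illustrative assumptions; with the patch applied, the same
predicate would go into a TableMap subclass.

import java.io.IOException;

import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Drops every row that lacks a given column; rows that have it are passed
// through unchanged. Written against the stock mapred.Mapper interface; in
// the patch the same logic would live in a TableMap subclass instead.
public class RowFilterMapper extends MapReduceBase
    implements Mapper<Text, MapWritable, Text, MapWritable> {

  // Hypothetical "family:qualifier" column used as the filter predicate.
  private static final Text FILTER_COLUMN = new Text("info:status");

  public void map(Text rowKey, MapWritable columns,
      OutputCollector<Text, MapWritable> output, Reporter reporter)
      throws IOException {
    Writable cell = columns.get(FILTER_COLUMN);
    if (cell != null) {
      // Only rows containing the filter column are emitted.
      output.collect(rowKey, columns);
    }
  }
}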

I tried ArrayWritable but ran into the following problem at line 444 of MapTask.java. The value
is instantiated from the given class, say ArrayWritable, using its empty constructor. Then at
line 459, value.readFields is called. At that point the valueClass field in ArrayWritable is
null, since nothing has set it, yet ArrayWritable assumes it is set rather than reading it off
the stream. My workaround goes through RecordWritable, but I am certainly open to better suggestions.
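
To illustrate the idea (this is not the RecordWritable in the patch, just a rough sketch with
made-up names), a container like the following writes its element class name into the stream,
so readFields can reconstruct the elements even on an instance created via the empty constructor:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

// A self-describing array container: the element class name travels with the
// data, so readFields() can rebuild the elements even when this object was
// created through its empty constructor and nothing configured it beforehand.
// Like ArrayWritable, all elements are assumed to share one class.
public class SelfDescribingArrayWritable implements Writable {

  private Writable[] values = new Writable[0];

  public SelfDescribingArrayWritable() {
    // Empty constructor, as required for reflective instantiation in MapTask.
  }

  public SelfDescribingArrayWritable(Writable[] values) {
    this.values = values;
  }

  public Writable[] get() {
    return values;
  }

  public void write(DataOutput out) throws IOException {
    out.writeInt(values.length);
    if (values.length > 0) {
      // Record the element class so deserialization needs no prior setup.
      Text.writeString(out, values[0].getClass().getName());
      for (Writable value : values) {
        value.write(out);
      }
    }
  }

  public void readFields(DataInput in) throws IOException {
    int length = in.readInt();
    values = new Writable[length];
    if (length > 0) {
      String className = Text.readString(in);
      Class<? extends Writable> elementClass;
      try {
        elementClass = Class.forName(className).asSubclass(Writable.class);
      } catch (ClassNotFoundException e) {
        throw new IOException("Unknown element class: " + className);
      }
      for (int i = 0; i < length; i++) {
        values[i] = ReflectionUtils.newInstance(elementClass, null);
        values[i].readFields(in);
      }
    }
  }
}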

I have tried this code both with MiniHBaseCluster and on a distributed cluster (thanks for the
new start/stop scripts!) for the simple example of copying tables.

> mapreduce input and output formats to go against hbase
> ------------------------------------------------------
>
>                 Key: HADOOP-1519
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1519
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: Jim Kellerman
>         Attachments: hbaseMR.tgz, patch.txt
>
>
> Inputs should allow specification of row range, columns and column versions.  Outputs
> should allow specification of where to put the mapreduce result in hbase

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

