hbase-user mailing list archives

From Thomas Kwan <thomas.k...@manage.com>
Subject Hbase MR Job with 2 OutputForm classes possible?
Date Thu, 31 Jul 2014 00:50:15 GMT
Hi there,

I have an HBase MR job that reads data from HDFS, does an HBase Get, and then
does some data transformation. Then I need to put the data back into HBase as
well as write it to an HDFS directory (so I can import it back into

The current job creation logic is similar to the following:

    public static Job createHBaseJob(Configuration conf, String []args)
    throws IOException {
        Path inputDir = new Path(args[0]);
        String tableName = args[1];
        String params = args[2];

        Job job = new Job(conf, NAME + "_" + tableName + " " + params);

        FileInputFormat.setInputPaths(job, inputDir);

        // No reducers.  Just write straight to table.  Call
        // to set up the TableOutputFormat.
        TableMapReduceUtil.initTableReducerJob(tableName, null, job);

        return job;
    }

TableMapReduceUtil.initTableReducerJob already sets up the OutputFormat
class. I wonder if there is some magic I can do to pipe the data to an HDFS
file as well. Currently I just have 2 jobs: one writes to HBase and one
writes to HDFS. But in the current setup, I need to do the HBase Get twice.

Any input is highly welcome!!

