hbase-user mailing list archives

From David Koch <ogd...@googlemail.com>
Subject hbase.mapred.output.quorum ignored in Mapper job with HDFS source and HBase sink
Date Sun, 24 Mar 2013 12:46:25 GMT

I want to import a file on HDFS from one cluster A (source) into HBase
tables on a different cluster B (destination) using a Mapper job with an
HBase sink. Both clusters run HBase.

This setup works fine:
- Run Mapper job on cluster B (destination)
- "mapred.input.dir" --> hdfs://<cluster-A>/<path-to-file> (file on source
  cluster)
- "hbase.zookeeper.quorum" --> <quorum-hostname-B>
- "hbase.zookeeper.property.clientPort" --> <quorum-port-B>

I thought it should be possible to run the job on cluster A (source) and
use "hbase.mapred.output.quorum" to insert into the tables on cluster B.
This is what the CopyTable utility does. However, the following does not
work. HBase looks for the destination table(s) on cluster A and NOT on
cluster B:
- Run Mapper job on cluster A (source)
- "mapred.input.dir" --> hdfs://<cluster-A>/<path-to-file> (file is local)
- "hbase.zookeeper.quorum" --> <quorum-hostname-A>
- "hbase.zookeeper.property.clientPort" --> <quorum-port-A>
- "hbase.mapred.output.quorum" --> <quorum-hostname-B>:2181:/hbase (same as
  the --peer.adr argument for CopyTable)
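
For reference, the value given to "hbase.mapred.output.quorum" (and to CopyTable's --peer.adr) packs three client settings into one string. Below is a minimal sketch of how such a key breaks down, assuming the <quorum>:<clientPort>:<znode parent> layout. The class name ClusterKeySketch, the method parse and the hostname zk-b.example.com are made up for illustration; this is not HBase's own parsing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: splits a cluster key of the form <quorum>:<clientPort>:<znodeParent>
// into the three client properties it stands for.
// ClusterKeySketch and parse are hypothetical names, not part of HBase.
class ClusterKeySketch {

    static Map<String, String> parse(String clusterKey) {
        // The quorum part may be a comma-separated host list; it contains no ':'.
        String[] parts = clusterKey.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected <quorum>:<clientPort>:<znodeParent>, got: " + clusterKey);
        }
        // Fail early if the port is not numeric (throws NumberFormatException).
        Integer.parseInt(parts[1]);

        Map<String, String> props = new LinkedHashMap<>();
        props.put("hbase.zookeeper.quorum", parts[0]);
        props.put("hbase.zookeeper.property.clientPort", parts[1]);
        props.put("zookeeper.znode.parent", parts[2]);
        return props;
    }

    public static void main(String[] args) {
        // zk-b.example.com is a placeholder for <quorum-hostname-B>.
        parse("zk-b.example.com:2181:/hbase")
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

With this reading, the second scenario asks the output side to use cluster B's three settings while the rest of the job keeps cluster A's.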

Job setup inside the class MyJob is as follows; note I am using
MultiTableOutputFormat:

Configuration conf = HBaseConfiguration.addHbaseResources(getConf());
Job job = new Job(conf);
// Note, several output tables!

where the Mapper class has the following frame:

public static class JsonImporterMapper extends
    Mapper<LongWritable, Text, ImmutableBytesWritable, Put> { }

Is this expected behaviour? How can I get the second scenario using
"hbase.mapred.output.quorum" to work? Could the fact that I am using
MultiTableOutputFormat instead of TableOutputFormat play a part? I am using
HBase 0.92.1.

Thank you,

