hbase-user mailing list archives

From yeshwanth kumar <yeshwant...@gmail.com>
Subject Re: writing to multiple hbase tables in a mapreduce job
Date Tue, 26 Aug 2014 19:11:22 GMT
hi shahab,

i tried it that way, specifying the output format as MultiTableOutputFormat,
but it throws

 java.io.IOException: No input paths specified in job
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:193)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:919)
at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:936)
at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:854)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:807)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:807)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:465)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:495)
at com.serendio.icvs.analysis.text.EntitySearcherMR.main(EntitySearcherMR.java:161)

here's the job config

Job job = new Job(config, "GetEntitiesMR");
job.setJarByClass(EntitySearcherMR.class);
job.setMapperClass(EntitySearcherMapper.class);
job.setOutputFormatClass(MultiTableOutputFormat.class);
TableMapReduceUtil.addDependencyJars(job);
job.setNumReduceTasks(0);

boolean b = job.waitForCompletion(true);
if (!b) {
    throw new IOException("error with job!");
}
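(For context: when no map-side input is configured, the job falls back to FileInputFormat, which is what raises "No input paths specified in job". A sketch of the same config with the table input wired up via TableMapReduceUtil.initTableMapperJob, using the class and table names from this thread; an untested sketch against the HBase 0.94 API, not a verified fix:)

```java
// Sketch only: the same job config with a map-side input configured.
// Class names (EntitySearcherMR, EntitySearcherMapper) and table "i1"
// are the ones used in this thread; HBase 0.94 / Hadoop MRv2 APIs assumed.
Job job = new Job(config, "GetEntitiesMR");
job.setJarByClass(EntitySearcherMR.class);

Scan scan = new Scan();
scan.setCacheBlocks(false); // avoid polluting the block cache during a full scan

// Configures TableInputFormat, the input table, and the mapper in one call;
// without this, the job falls back to FileInputFormat and fails with
// "No input paths specified in job".
TableMapReduceUtil.initTableMapperJob(
    "i1",                         // input table
    scan,
    EntitySearcherMapper.class,
    ImmutableBytesWritable.class, // map output key: destination table name
    Put.class,
    job);

job.setOutputFormatClass(MultiTableOutputFormat.class);
TableMapReduceUtil.addDependencyJars(job);
job.setNumReduceTasks(0);
```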

i am unable to figure out what i am missing here.

-yeshwanth





On Wed, Aug 27, 2014 at 12:23 AM, Shahab Yunus <shahab.yunus@gmail.com>
wrote:

> You don't need to initialize the tables.
>
> You just need to specify the output format as the MultiTableOutputFormat
> class.
>
> Something like this:
> job.setOutputFormatClass(MultiTableOutputFormat.class);
>
>
> Because if you look at the code for MultiTableOutputFormat, it creates the
> table on the fly and stores it in an internal map when you call
> context.write.
> When context.write is called:
>
> @Override
> public void write(ImmutableBytesWritable tableName, Writable action)
>     throws IOException {
>   HTable table = getTable(tableName);
>   ...
>
> which calls getTable(), shown below; it creates the table on the fly and
> stores it in the internal map:
>
> HTable getTable(ImmutableBytesWritable tableName) throws IOException {
>   if (!tables.containsKey(tableName)) {
>     LOG.debug("Opening HTable \"" +
>         Bytes.toString(tableName.get()) + "\" for writing");
>     HTable table = new HTable(conf, tableName.get());
>     table.setAutoFlush(false);
>     tables.put(tableName, table);
>   }
>   return tables.get(tableName);
> }
>
> @Override
> public void close(TaskAttemptContext context) throws IOException {
>   for (HTable table : tables.values()) {
>     table.flushCommits();
>   }
> }
>
> In fact, I would suggest going through the code here for the whole class:
>
>
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hbase/hbase/0.92.1/org/apache/hadoop/hbase/mapreduce/MultiTableOutputFormat.java#MultiTableOutputFormat.MultiTableRecordWriter.getTable%28org.apache.hadoop.hbase.io.ImmutableBytesWritable%29
>
>
>
> It is different from the TableOutputFormat approach, where you do need to
> initialize the table using the Util class.
>
>
>
> Regards,
>
> Shahab
>
>
>
> On Tue, Aug 26, 2014 at 2:29 PM, yeshwanth kumar <yeshwanth43@gmail.com>
> wrote:
>
> > hi ted,
> >
> > i need to process the data in table i1, and then write the
> > results to tables i1 and i2,
> > so the input for the mapper in my mapreduce job is hbase table i1,
> > whereas in WALPlayer the input is HLogInputFormat.
> >
> > if i remove the statement as you said and specify the input format
> > as TableInputFormat, it throws a "No table was provided" Exception.
> > if i specify the input table as in these statements:
> >
> > TableMapReduceUtil.initTableMapperJob(otherArgs[0], scan,
> >     EntitySearcherMapper.class, ImmutableBytesWritable.class,
> >     Put.class, job); // otherArgs[0]=i1
> >
> > the mapper is not considering the other table.
> > any suggestions to resolve this issue?
> >
> > thanks,
> > yeshwanth
> >
> >
> >
> >
> > On Tue, Aug 26, 2014 at 10:39 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Please take a look at WALPlayer.java in hbase where you can find
> example
> > of
> > > how MultiTableOutputFormat is used.
> > >
> > > Cheers
> > >
> > >
> > > On Tue, Aug 26, 2014 at 10:04 AM, yeshwanth kumar <
> yeshwanth43@gmail.com
> > >
> > > wrote:
> > >
> > > > hi ted,
> > > >
> > > > how can we initialise the mapper if i comment out those lines?
> > > >
> > > >
> > > >
> > > > On Tue, Aug 26, 2014 at 10:08 PM, Ted Yu <yuzhihong@gmail.com>
> wrote:
> > > >
> > > > > TableMapReduceUtil.initTableMapperJob(otherArgs[0], scan,
> > > > >     EntitySearcherMapper.class, ImmutableBytesWritable.class,
> > > > >     Put.class, job); // otherArgs[0]=i1
> > > > >
> > > > > You're initializing with table 'i1'
> > > > > Please remove the above call and try again.
> > > > >
> > > > > Cheers
> > > > >
> > > > >
> > > > > On Tue, Aug 26, 2014 at 9:18 AM, yeshwanth kumar <
> > > yeshwanth43@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > hi i am running  HBase 0.94.20  on Hadoop 2.2.0
> > > > > >
> > > > > > i am using MultiTableOutputFormat,
> > > > > > for writing processed output to two different tables in hbase.
> > > > > >
> > > > > > here's the code snippet
> > > > > >
> > > > > > private ImmutableBytesWritable tab_cr =
> > > > > >     new ImmutableBytesWritable(Bytes.toBytes("i1"));
> > > > > > private ImmutableBytesWritable tab_cvs =
> > > > > >     new ImmutableBytesWritable(Bytes.toBytes("i2"));
> > > > > >
> > > > > > @Override
> > > > > > public void map(ImmutableBytesWritable row, final Result value,
> > > > > >     final Context context) throws IOException, InterruptedException {
> > > > > >
> > > > > > -----------------------------------------
> > > > > > Put pcvs = new Put(entry.getKey().getBytes());
> > > > > > pcvs.add("cf".getBytes(), "type".getBytes(), column.getBytes());
> > > > > > Put put = new Put(value.getRow());
> > > > > > put.add("Entity".getBytes(), "json".getBytes(),
> > > > > >     entry.getValue().getBytes());
> > > > > > context.write(tab_cr, put);   // table i1
> > > > > > context.write(tab_cvs, pcvs); // table i2
> > > > > >
> > > > > > }
> > > > > >
> > > > > > job.setJarByClass(EntitySearcherMR.class);
> > > > > > job.setMapperClass(EntitySearcherMapper.class);
> > > > > > job.setOutputFormatClass(MultiTableOutputFormat.class);
> > > > > > Scan scan = new Scan();
> > > > > > scan.setCacheBlocks(false);
> > > > > > TableMapReduceUtil.initTableMapperJob(otherArgs[0], scan,
> > > > > >     EntitySearcherMapper.class, ImmutableBytesWritable.class,
> > > > > >     Put.class, job); // otherArgs[0]=i1
> > > > > > TableMapReduceUtil.initTableReducerJob(otherArgs[0], null, job);
> > > > > > job.setNumReduceTasks(0);
> > > > > >
> > > > > > the mapreduce job fails with a NoSuchColumnFamily exception for
> > > > > > "cf" in table i1.
> > > > > > i am writing data to two different column families, one in each
> > > > > > table; cf belongs to table i2.
> > > > > > do the column families have to be present in both tables?
> > > > > > is there anything i am missing?
> > > > > > can someone point me in the right direction?
> > > > > >
> > > > > > thanks,
> > > > > > yeshwanth.
> > > > > >
> > > > >
> > > >
> > >
> >
>
