hbase-user mailing list archives

From Oleg Ruchovets <oruchov...@gmail.com>
Subject Re: bulk loading problem
Date Wed, 29 Aug 2012 14:22:22 GMT
Great.
It works !!!!

On Tue, Aug 28, 2012 at 6:42 PM, Igal Shilman <igals@wix.com> wrote:

> As suggested by the book, take a look at:
> org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles class,
>
> This tool expects two arguments: (1) the path to the generated HFiles (in
> your case it's outputPath) and (2) the target table.
> To use it programmatically, you can either invoke it via the ToolRunner or
> call LoadIncrementalHFiles.doBulkLoad() yourself
> (after your M/R job has finished successfully).
>
> If you are already loading into an existing table, then (following your
> code):
>
>     LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
>     loader.doBulkLoad(new Path(outputPath), new HTable(conf, tableName));
>
>
> Otherwise,
>
>
>     int ret = ToolRunner.run(new LoadIncrementalHFiles(conf),
>             new String[] { outputPath, tableName });
>
>
>
> Good luck,
> Igal.
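
For reference, a self-contained sketch of the ToolRunner variant described above, using the HFile output directory and table name that appear later in this thread (adjust both to your own job; this is a sketch, not the exact code used):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
    import org.apache.hadoop.util.ToolRunner;

    public class CompleteBulkLoad {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            // LoadIncrementalHFiles implements Tool, so ToolRunner can drive it.
            // The two arguments are the HFile output directory and the target table,
            // the same arguments the command-line "completebulkload" tool takes.
            int ret = ToolRunner.run(new LoadIncrementalHFiles(conf),
                    new String[] { "/bulk_loading_hbase/output/1346194117045", "uu_bulk" });
            System.exit(ret);
        }
    }
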
>
> On Tue, Aug 28, 2012 at 10:59 PM, Oleg Ruchovets <oruchovets@gmail.com> wrote:
>
> > Hi Igal, thank you for the quick response.
> >    Can I execute this step programmatically?
> >
> > From the link you sent:
> >
> > 9.8.5. Advanced Usage
> >
> > Although the importtsv tool is useful in many cases, advanced users may
> > want to generate data programmatically, or import data from other formats.
> > To get started doing so, dig into ImportTsv.java and check the JavaDoc for
> > HFileOutputFormat.
> >
> > The import step of the bulk load can also be done programmatically. See
> > the LoadIncrementalHFiles class for more information.
> >
> > The question is: what should I do/add to my job to write the generated
> > HFiles programmatically to HBase?
> >
> >
> >
> >
> > On Tue, Aug 28, 2012 at 8:08 PM, Igal Shilman <igals@wix.com> wrote:
> >
> > > Hi,
> > > You need to complete the bulk load.
> > > Check out http://hbase.apache.org/book/arch.bulk.load.html, section 9.8.2.
> > >
> > > Igal.
> > >
> > > On Tue, Aug 28, 2012 at 7:29 PM, Oleg Ruchovets <oruchovets@gmail.com> wrote:
> > >
> > > > Hi,
> > > >    I am in the process of writing my first bulk loading job. I use
> > > > Cloudera CDH3u3 with HBase 0.90.4.
> > > >
> > > > After the job finished I could see the HFiles it created, but there
> > > > were no entries in HBase: hbase shell >> count 'uu_bulk' returns 0.
> > > >
> > > > Here is my job configuration:
> > > >
> > > >         Configuration conf = HBaseConfiguration.create();
> > > >
> > > >        Job job = new Job(conf, getClass().getSimpleName());
> > > >
> > > >         job.setJarByClass(UuPushMapReduceJobFactory.class);
> > > >         job.setMapperClass(UuPushMapper.class);
> > > >         job.setMapOutputKeyClass(ImmutableBytesWritable.class);
> > > >         job.setMapOutputValueClass(KeyValue.class);
> > > >         job.setOutputFormatClass(HFileOutputFormat.class);
> > > >
> > > >
> > > >
> > > >         String path = uuAggregationContext.getUuInputPath();
> > > >         String outputPath =
> > > > "/bulk_loading_hbase/output/"+System.currentTimeMillis();
> > > >         LOG.info("path = " + path);
> > > >         LOG.info("outputPath = " + outputPath);
> > > >
> > > >         final String tableName = "uu_bulk";
> > > >         LOG.info("hbase tableName: " + tableName);
> > > >         createRegions(conf , Bytes.toBytes(tableName));
> > > >
> > > >         FileInputFormat.addInputPath(job, new Path(path));
> > > >         FileOutputFormat.setOutputPath(job, new Path(outputPath));
> > > >
> > > >         HFileOutputFormat.configureIncrementalLoad(job,
> > > >                 new HTable(conf, tableName));
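
A hedged sketch of the step this configuration is missing, continuing with the conf, outputPath, and tableName variables from the code above (it assumes the job is submitted here and finishes successfully, and that org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles is imported):

        // Submit the job and wait for the HFiles to be written, then hand
        // them to the region servers -- without this step nothing reaches HBase.
        if (job.waitForCompletion(true)) {
            LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
            loader.doBulkLoad(new Path(outputPath), new HTable(conf, tableName));
        }
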
> > > >
> > > > //=====================================================================================
> > > > The reducer log ends with:
> > > >
> > > > 2012-08-28 11:53:40,643 INFO org.apache.hadoop.mapred.Merger: Down to
> > > > the last merge-pass, with 10 segments left of total size: 222885367
> > > > bytes
> > > > 2012-08-28 11:53:54,137 INFO
> > > > org.apache.hadoop.hbase.mapreduce.HFileOutputFormat:
> > > > Writer=hdfs://hdn16/bulk_loading_hbase/output/1346194117045/_temporary/_attempt_201208260949_0026_r_000005_0/d/3908303205246218823,
> > > > wrote=268435455
> > > > 2012-08-28 11:54:11,966 INFO org.apache.hadoop.mapred.Task:
> > > > Task:attempt_201208260949_0026_r_000005_0 is done. And is in the
> > > > process of commiting
> > > > 2012-08-28 11:54:12,975 INFO org.apache.hadoop.mapred.Task: Task
> > > > attempt_201208260949_0026_r_000005_0 is allowed to commit now
> > > > 2012-08-28 11:54:13,007 INFO
> > > > org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter: Saved
> > > > output of task 'attempt_201208260949_0026_r_000005_0' to
> > > > /bulk_loading_hbase/output/1346194117045
> > > > 2012-08-28 11:54:13,009 INFO org.apache.hadoop.mapred.Task: Task
> > > > 'attempt_201208260949_0026_r_000005_0' done.
> > > > 2012-08-28 11:54:13,010 INFO
> > > > org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs'
> > > > truncater with mapRetainSize=-1 and reduceRetainSize=-1
> > > >
> > > > As I understand it, the HFiles were written to
> > > > /bulk_loading_hbase/output/1346194117045, but I don't see any activity
> > > > related to moving the HFiles into HBase.
> > > >
> > > >
> > > > What am I doing wrong? What should I do to get the result written to
> > > > HBase?
> > > >
> > > > Thanks in advance
> > > > Oleg.
> > > >
> > >
> >
>
