hbase-user mailing list archives

From "Taylor, Ronald C" <ronald.tay...@pnl.gov>
Subject RE: Bulk import - is the error general to both MapReduce and non-MapReduce programs?
Date Thu, 02 Apr 2009 21:37:47 GMT

I have been following this thread and have a question. I am new to HBase coding, and within
the past few days I have written a standalone (not MapReduce-based) Java program to do a bulk
upload into one HBase table. I believe I got the same error that you folks have been
talking about. The program works fine on small uploads but fails with the error message you
mention when I move to importing tens of thousands of rows. So I wanted to ask: has this
import error been reported only for MapReduce-based programs, or is it more general (in which
case I could assume it may be affecting my current import program, and I should try using
the doCommit() code shown below as a fix)?
  Ron Taylor
Ronald Taylor, Ph.D.
Computational Biology & Bioinformatics Group
Pacific Northwest National Laboratory
902 Battelle Boulevard
P.O. Box 999, MSIN K7-90
Richland, WA  99352 USA
Office:  509-372-6568
Email: ronald.taylor@pnl.gov

-----Original Message-----
From: Stuart White [mailto:stuart.white1@gmail.com] 
Sent: Thursday, April 02, 2009 1:37 PM
To: hbase-user@hadoop.apache.org
Subject: Re: Bulk import - does sort order of input data affect success rate?

On Thu, Apr 2, 2009 at 3:30 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> The last thing - success should not be a function of sort order.
> However, speed will be related.

How?  Sorted = faster, or Sorted = slower?

> One thing I found I had to do was:
>    private void doCommit(HTable t, BatchUpdate update) throws IOException {
>      boolean commited = false;
>      while (!commited) {
>        try {
>          t.commit(update);
>          commited = true;
>        } catch (RetriesExhaustedException e) {
>          // DAMN, ignore
>        }
>      }
>    }

I'm running a mapred job, using TableOutputFormat to write the results to HBase.  Was the
code you provided meant for a custom output format, or for a standalone (non-mapred) application?
I see the point you're making; I just don't understand where I'd put that code.
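
The quoted doCommit() loop retries forever with no pause between attempts. As a rough sketch only, a bounded variant with a growing pause might look like the code below. Everything named here is hypothetical and not part of HBase: the BoundedRetry class, the Commit interface (a stand-in for a single write such as t.commit(update)), and the maxAttempts and initialPauseMs parameters. It also assumes that catching IOException broadly is acceptable; as far as I know RetriesExhaustedException is an IOException subclass, but narrow the catch if you only want to retry that one error.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

// Hypothetical helper, not an HBase class.
final class BoundedRetry {

    /** Hypothetical stand-in for one write, e.g. () -> t.commit(update). */
    @FunctionalInterface
    interface Commit {
        void run() throws IOException;
    }

    /**
     * Runs the commit, retrying on IOException until it succeeds or
     * maxAttempts attempts have been made. The pause between attempts
     * starts at initialPauseMs and doubles after each failure.
     */
    static void commitWithRetry(Commit commit, int maxAttempts, long initialPauseMs)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long pauseMs = initialPauseMs;
        for (int attempt = 1; ; attempt++) {
            try {
                commit.run();
                return; // success, stop retrying
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the last failure
                }
            }
            try {
                Thread.sleep(pauseMs); // give the servers time to recover
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new InterruptedIOException("interrupted while waiting to retry");
            }
            pauseMs *= 2;
        }
    }
}
```

In a standalone loader this would wrap each write, for example BoundedRetry.commitWithRetry(() -> t.commit(update), 5, 1000L), with t and update being the HTable and BatchUpdate from the quoted snippet. Unlike the original loop, it eventually rethrows the failure instead of hiding it, so a persistent problem shows up rather than hanging the import.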
