hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Is it possible for HTable.put(...) to not make it into the table and silently fail?
Date Fri, 22 Aug 2014 16:53:40 GMT
bq. the result from the RowCounter program is far fewer records than I
expected.

Can you give more detailed information about the gap?

Which HBase release are you running?

Cheers


On Fri, Aug 22, 2014 at 9:26 AM, Magana-zook, Steven Alan <
maganazook1@llnl.gov> wrote:

> Hello,
>
> I have written a program in Java that is supposed to update rows in an
> HBase table that do not yet have a value in a certain column (blob values
> of between 5k and 50k). The program keeps track of how many puts have been
> added to the table along with how long the program is running. These pieces
> of information are used to calculate a speed for data ingestion (records
> per second). After running the program for multiple days, and based on the
> average speed reported, the result from the RowCounter program is far fewer
> records than I expected. The essential parts of the code are shown below
> (error handling and other potentially not important code omitted) along
> with the command I use to see how many rows have been updated.
>
> Is it possible that the put method call on HTable does not actually put
> the record in the database while also not throwing an exception?
> Could the output of RowCounter be incorrect?
> Am I doing something below that is obviously incorrect?
>
> Row counter command (does frequently report OutOfOrderScannerNextException
> during execution): hbase org.apache.hadoop.hbase.mapreduce.RowCounter
> mytable cf:BLOBDATACOLUMN
>
> Code that is essentially what I am doing in my program:
> ...
> Scan scan = new Scan();
> scan.setCaching(200);
>
> HTable targetTable = new HTable(hbaseConfiguration, Bytes.toBytes(tblTarget));
> ResultScanner resultScanner = targetTable.getScanner(scan);
>
> int batchSize = 10;
> Date startTime = new Date();
> numFilesSent = 0;
>
> Result[] rows = resultScanner.next(batchSize);
> while (rows != null) {
>     for (Result row : rows) {
>         byte[] rowKey = row.getRow();
>         byte[] byteArrayBlobData = getFileContentsForRow(rowKey);
>
>         Put put = new Put(rowKey);
>         put.add(COLUMN_FAMILY, BLOB_COLUMN, byteArrayBlobData);
>         targetTable.put(put); // Auto-flush is on by default
>         numFilesSent++;
>         float elapsedSeconds = (new Date().getTime() - startTime.getTime()) / 1000.0f;
>         float speed = numFilesSent / elapsedSeconds;
>         System.out.println("Speed(rows/sec): " + speed); // routinely says from 80 to 200+
>     }
>     rows = resultScanner.next(batchSize);
> }
> ...
>
> Thanks,
> Steven
>
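
One way to quantify the gap asked about above is to tally distinct row keys alongside the raw put count. A put to a row key that already exists does not add a row, so a count of puts (and a speed derived from it) can exceed what RowCounter reports if any row key is written more than once. The sketch below is plain Java with no HBase dependency; the class name PutTally, its methods, and the sample row keys are hypothetical and not part of the original program, and whether duplicate writes actually occur in this case is an assumption that would need checking.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: tallies puts issued versus distinct row keys written. */
final class PutTally {
    private final Set<ByteBuffer> distinctRowKeys = new HashSet<>();
    private long putsIssued = 0;

    /** Call once per put that returned without an exception. */
    void record(byte[] rowKey) {
        putsIssued++;
        // ByteBuffer compares by content; a raw byte[] would compare by identity.
        distinctRowKeys.add(ByteBuffer.wrap(rowKey.clone()));
    }

    long putsIssued() {
        return putsIssued;
    }

    long distinctRows() {
        return distinctRowKeys.size();
    }

    /** Row count implied by throughput alone: average rows/sec times elapsed seconds. */
    static long expectedFromSpeed(double rowsPerSecond, double elapsedSeconds) {
        return Math.round(rowsPerSecond * elapsedSeconds);
    }

    public static void main(String[] args) {
        PutTally tally = new PutTally();
        // Made-up row keys; "row-1" is recorded twice to show the two counts diverging.
        String[] keys = {"row-1", "row-2", "row-1", "row-3"};
        for (String key : keys) {
            tally.record(key.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("puts issued:   " + tally.putsIssued());
        System.out.println("distinct rows: " + tally.distinctRows());
    }
}
```

In the loop quoted above, record(rowKey) would be called right after targetTable.put(put). Comparing distinctRows() with the RowCounter result separates "puts that repeated a row" from "puts that never landed". Keeping every row key in memory may not be practical for a multi-day run, so sampling or a periodic reset may be needed.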
