hbase-user mailing list archives

From "Magana-zook, Steven Alan" <maganazo...@llnl.gov>
Subject Re: Is it possible for HTable.put(…) to not make it into the table and silently fail?
Date Fri, 22 Aug 2014 17:23:29 GMT
Hi Anoop,

I am using HBase 0.98.0.2.1.2.0-402-hadoop2 without the coprocessor
modification you mentioned. I only raised the idea of a silent failure
because I do catch and report Exception and Throwable on the client side,
and I see no reported errors (apart from the occasional "Region too busy")
that would account for the missing rows.
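
A minimal sketch of the client-side handling described above, assuming the
HBase 0.98 client API; the class name and the way the table and column bytes
are passed in are illustrative, not taken from the actual program:

    import java.io.IOException;
    import org.apache.hadoop.hbase.RegionTooBusyException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;

    public class GuardedPut {
        // Write one cell and report every failure the client can see, so a
        // truly "silent" miss could only come from a path that never throws.
        static void putWithReporting(HTable table, byte[] rowKey, byte[] family,
                                     byte[] qualifier, byte[] value) {
            Put put = new Put(rowKey);
            put.add(family, qualifier, value);  // 0.98-era API (later renamed addColumn)
            try {
                table.put(put);                 // auto-flush on: sent to the server immediately
            } catch (RegionTooBusyException e) {
                System.err.println("Region too busy, retry later: " + e.getMessage());
            } catch (IOException e) {
                System.err.println("Put failed: " + e.getMessage());
            } catch (Throwable t) {             // catch-all, as described above
                System.err.println("Unexpected error: " + t);
            }
        }
    }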

Thanks,
Steven



On 8/22/14 10:08 AM, "Anoop John" <anoop.hbase@gmail.com> wrote:

>>Is it possible that the put method call on HTable does not actually put
>>the record in the database while also not throwing an exception?
>
>You can.  Implement a region CP (implementing RegionObserver) and
>implement prePut(). In it you can bypass the operation using
>ObserverContext#bypass(), so the core will not throw an exception and
>will not add the data either.
>
>-Anoop-
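
A minimal sketch of the coprocessor Anoop describes, assuming the HBase 0.98
coprocessor API (extending BaseRegionObserver); the class name is illustrative,
and the observer would still need to be registered on the table or in
hbase-site.xml to take effect:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Region coprocessor that silently drops every Put: the client gets a
    // normal, exception-free return, but nothing is written to the region.
    public class SilentDropObserver extends BaseRegionObserver {
        @Override
        public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                           Put put, WALEdit edit, Durability durability)
                throws IOException {
            ctx.bypass();  // skip the core put; no error is returned to the client
        }
    }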
>
>On Fri, Aug 22, 2014 at 10:23 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>
>> bq. the result from the RowCounter program is far fewer records than I
>> expected.
>>
>> Can you give more detailed information about the gap?
>>
>> Which HBase release are you running?
>>
>> Cheers
>>
>>
>> On Fri, Aug 22, 2014 at 9:26 AM, Magana-zook, Steven Alan <
>> maganazook1@llnl.gov> wrote:
>>
>> > Hello,
>> >
>> > I have written a program in Java that is supposed to update rows in an
>> > HBase table that do not yet have a value in a certain column (blob
>> > values of between 5k and 50k). The program keeps track of how many puts
>> > have been added to the table along with how long the program has been
>> > running. These pieces of information are used to calculate a speed for
>> > data ingestion (records per second). After running the program for
>> > multiple days, and based on the average speed reported, the result from
>> > the RowCounter program is far fewer records than I expected. The
>> > essential parts of the code are shown below (error handling and other
>> > potentially unimportant code omitted) along with the command I use to
>> > see how many rows have been updated.
>> >
>> > Is it possible that the put method call on HTable does not actually
>> > put the record in the database while also not throwing an exception?
>> > Could the output of RowCounter be incorrect?
>> > Am I doing something below that is obviously incorrect?
>> >
>> > Row counter command (it frequently reports
>> > OutOfOrderScannerNextException during execution):
>> > hbase org.apache.hadoop.hbase.mapreduce.RowCounter mytable cf:BLOBDATACOLUMN
>> >
>> > Code that is essentially what I am doing in my program:
>> > ...
>> > Scan scan = new Scan();
>> > scan.setCaching(200);
>> >
>> > HTable targetTable = new HTable(hbaseConfiguration,
>> >     Bytes.toBytes(tblTarget));
>> > ResultScanner resultScanner = targetTable.getScanner(scan);
>> >
>> > int batchSize = 10;
>> > Date startTime = new Date();
>> > int numFilesSent = 0;
>> >
>> > Result[] rows = resultScanner.next(batchSize);
>> > while (rows != null && rows.length > 0) { // next(n) returns an empty array once the scan is exhausted
>> >     for (Result row : rows) {
>> >         byte[] rowKey = row.getRow();
>> >         byte[] byteArrayBlobData = getFileContentsForRow(rowKey);
>> >
>> >         Put put = new Put(rowKey);
>> >         put.add(COLUMN_FAMILY, BLOB_COLUMN, byteArrayBlobData);
>> >         targetTable.put(put); // Auto-flush is on by default
>> >
>> >         numFilesSent++;
>> >         float elapsedSeconds = (new Date().getTime() - startTime.getTime()) / 1000.0f;
>> >         float speed = numFilesSent / elapsedSeconds;
>> >         System.out.println("Speed(rows/sec): " + speed); // routinely says from 80 to 200+
>> >     }
>> >     rows = resultScanner.next(batchSize);
>> > }
>> > ...
>> >
>> > Thanks,
>> > Steven
>> >
>>
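
As an aside on the "Auto-flush is on by default" comment in the code above:
with auto-flush enabled (the HTable default), each put() is sent to the server
synchronously and any failure surfaces as an exception on that call; with
auto-flush disabled, puts are buffered on the client and failures only surface
when the buffer is flushed. A minimal sketch of the buffered case, assuming
the 0.98 HTable API; the class and method names are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;

    public class BufferedPuts {
        static void writeBuffered(HTable table, Iterable<Put> puts) throws IOException {
            table.setAutoFlush(false);      // buffer puts client-side instead of sending one by one
            try {
                for (Put put : puts) {
                    table.put(put);         // may only queue the mutation locally
                }
                table.flushCommits();       // failures for buffered puts are reported here
            } catch (RetriesExhaustedWithDetailsException e) {
                // Details which rows could not be written and why.
                System.err.println("Failed puts: " + e.getNumExceptions());
                throw e;
            }
        }
    }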

