hbase-user mailing list archives

From Ondřej Stašek <ondrej.sta...@firma.seznam.cz>
Subject Re: Problems with scan after lot of Puts
Date Thu, 31 May 2012 10:05:00 GMT
Hello J-D,

Thanks for the reply. I've modified my code to pass a copy of the Scan to
each scanner, table.getScanner(new Scan(scan)), and ran it again. Even after
that I got an error:

12/05/31 10:42:39 INFO hbase.TestPutScan: Run 5 put 1000000 rows
12/05/31 10:44:09 INFO hbase.TestPutScan: Run 5 scan + del every 10th row
12/05/31 10:44:33 ERROR hbase.TestPutScan: Expected value: value 0402040 0000005, got: value 0402041 0000004

It seems that one row was skipped during the scan. Strange.

I'll keep testing.
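
For anyone following the thread, here is a toy model of the Scan-reuse problem from HBASE-4891 as J-D described it. It is only a sketch in plain Java under stated assumptions: ToyScan, scan() and firstRowOfSecondScan() are hypothetical names invented for illustration, not HBase classes, and the scanner "reset" is simulated by writing a row key back into the scan object that was passed in. It does not model the skipped row reported above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class ScanReuseSketch {

    // Hypothetical stand-in for a scan specification; NOT the HBase Scan class.
    static final class ToyScan {
        String startRow;

        ToyScan(String startRow) {
            this.startRow = startRow;
        }

        // Copy constructor, analogous in spirit to new Scan(scan).
        ToyScan(ToyScan other) {
            this.startRow = other.startRow;
        }
    }

    // Toy scanner: returns all rows from scan.startRow onward. It also
    // simulates a reset after resetAfter rows by writing that row's key back
    // into the scan object it was given (assumed behaviour, for illustration).
    static List<String> scan(TreeSet<String> table, ToyScan scan, int resetAfter) {
        List<String> rows = new ArrayList<>(table.tailSet(scan.startRow, true));
        if (rows.size() >= resetAfter) {
            scan.startRow = rows.get(resetAfter - 1);
        }
        return rows;
    }

    // Five numbered rows, named like the row keys in the logs.
    static TreeSet<String> table() {
        TreeSet<String> t = new TreeSet<>();
        for (int i = 1; i <= 5; i++) {
            t.add(String.format("test_row_%07d", i));
        }
        return t;
    }

    // Runs two full scans and returns the first row seen by the second one.
    static String firstRowOfSecondScan(boolean copyScan) {
        TreeSet<String> table = table();
        ToyScan shared = new ToyScan(""); // empty start row = beginning of table
        scan(table, copyScan ? new ToyScan(shared) : shared, 3);
        return scan(table, copyScan ? new ToyScan(shared) : shared, 3).get(0);
    }

    public static void main(String[] args) {
        // Reused scan object: the second scan starts mid-table.
        System.out.println("shared: " + firstRowOfSecondScan(false)); // test_row_0000003
        // Fresh copy per scanner: the second scan starts at the first row.
        System.out.println("copied: " + firstRowOfSecondScan(true));  // test_row_0000001
    }
}
```

In this model the copy keeps the caller's scan untouched, which is the effect I was after with new Scan(scan).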

   Ondrej Stasek

On 30.5.2012 21:05, Jean-Daniel Cryans wrote:
> There you go:
>
> 12/05/30 18:54:17 DEBUG client.MetaScanner: Scanning .META. starting
> at row=testtable,,00000000000000 for max=10 rows using
> org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@f593af
> 12/05/30 18:54:17 DEBUG
> client.HConnectionManager$HConnectionImplementation: Cached location
> for testtable,test_row_0496107,1338404055995.e9c7a4ca97eb2be372445af4d3772031.
> is sv4r25s44:62023
> 12/05/30 18:54:17 DEBUG
> client.HConnectionManager$HConnectionImplementation: Removed
> testtable,,1338404055995.9389fe5538f19a6f2df27e3958dcb434. for
> tableName=testtable from cache because of test_row_0012550
> 12/05/30 18:54:17 DEBUG
> client.HConnectionManager$HConnectionImplementation: Cached location
> for testtable,,1338404055995.9389fe5538f19a6f2df27e3958dcb434. is
> sv4r25s44:62023
> 12/05/30 18:57:47 INFO hbase.TestPutScan: Run 5 scan
> 12/05/30 18:57:47 ERROR hbase.TestPutScan: Expected value: value
> 0000001 0000005, got: value 0496107 0000005
>
> That's a split so the ClientScanner did a reset on the start row. So
> I'm going to fix your code and see if I can get anything else.
>
> J-D
>
> On Wed, May 30, 2012 at 11:56 AM, Jean-Daniel Cryans
> <jdcryans@apache.org>  wrote:
>> I'm running it here, but I just remembered about this issue:
>>
>> "HTable.ClientScanner needs to clone the Scan object"
>> https://issues.apache.org/jira/browse/HBASE-4891
>>
>> And since you are reusing that Scan object, you could definitely hit this issue.
>>
>> J-D
>>
>> On Tue, May 29, 2012 at 11:37 PM, Ondřej Stašek
>> <ondrej.stasek@firma.seznam.cz>  wrote:
>>> Here it is:
>>>
>>> http://pastebin.com/0AgsQjur
>>>
>>>
>>> On 29.5.2012 22:44, Jean-Daniel Cryans wrote:
>>>> Care to share that TestPutScan? Just attach it in a pastebin
>>>>
>>>> Thx,
>>>>
>>>> J-D
>>>>
>>>> On Tue, May 29, 2012 at 6:13 AM, Ondřej Stašek
>>>> <ondrej.stasek@firma.seznam.cz>    wrote:
>>>>> My program writes changes to an HBase table by issuing lots of Puts
>>>>> (autoCommit turned off, flush at the end) and afterwards uses a
>>>>> ResultScanner on the whole table to read all rows and act upon them. My
>>>>> problem is that on several occasions the scan does not return the
>>>>> expected rows. Either the scan does not start at the beginning of the
>>>>> table, or somewhere during the scan I get old data (not the data
>>>>> written by the preceding Puts).
>>>>>
>>>>> I have even written a simple test application to simulate this behavior:
>>>>> 1. write 1M simple numbered rows to a table
>>>>> 2. scan through table to test output, delete every 10th row
>>>>> 3. scan again after delete
>>>>> 4. repeat until error found
>>>>>
>>>>> Sample output:
>>>>>
>>>>> 12/05/29 00:32:12 INFO hbase.TestPutScan: Run 342 put 1000000 rows
>>>>> 12/05/29 00:32:35 INFO hbase.TestPutScan: Run 342 scan + del every 10th row
>>>>> 12/05/29 00:33:29 INFO hbase.TestPutScan: Run 342 scan
>>>>> 12/05/29 00:33:29 ERROR hbase.TestPutScan: Expected value: value 0000001 0000342, got: value 0281999 0000342
>>>>>
>>>>> This means that the program expected to get the first row, but got the 281999th.
>>>>>
>>>>> This test ran on a "minicluster" of 2 regionservers running Cloudera's
>>>>> cdh3u4 distribution.
>>>>>
>>>>> Today I got 3 errors like that, and from the RS's log it seems that at
>>>>> the same time the HBase balancer issued a reassign command for this
>>>>> table's region (the table has only 1 region).
>>>>>
>>>>> Any pointers on what to check or what to send you to help resolve this
>>>>> issue?
>>>>>
>>>>> Regards
>>>>>
>>>>> Ondrej Stasek
>>>>>
>>>
>>> --
>>> Ondřej Stašek
>>> Programátor senior
>>> Seznam.cz, a.s.
>>> Nádražní 159/21
>>> 370 01 České Budějovice 6
>>>
>>> tel.: +420 386 325 467
>>> gsm: +420 603 857 602
>>> icq: 164660005
>>> ondrej.stasek@firma.seznam.cz
>>> http://www.seznam.cz
>>>
