hbase-issues mailing list archives

From "Jonathan Hsieh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10079) Increments lost after flush
Date Wed, 04 Dec 2013 20:06:35 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839291#comment-13839291 ]

Jonathan Hsieh commented on HBASE-10079:
----------------------------------------

Reverting either HBASE-9963 or HBASE-10014 seems to get rid of the "jagged" losses due
to flushes.  However, when testing on the tip of 0.96 with the reverts, I seem to be losing
some threads as they initialize because of some sort of race.

I'm going to try from the exact point where 0.96.1rc1 was cut to see if it is in a happy place,
and will investigate the htable initialization problem afterwards.

> Increments lost after flush 
> ----------------------------
>
>                 Key: HBASE-10079
>                 URL: https://issues.apache.org/jira/browse/HBASE-10079
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.96.1
>            Reporter: Jonathan Hsieh
>            Priority: Blocker
>             Fix For: 0.96.1
>
>
> Testing 0.96.1rc1.
> With one process incrementing a row in a table, we increment a single col.  We flush or
> do kills/kill -9 and data is lost.  Flush and kill are likely the same problem (kill would
> flush); kill -9 may or may not have the same root cause.
> 5 nodes
> hadoop 2.1.0 (a pre cdh5b1 hdfs).
> hbase 0.96.1 rc1 
> Test: 250000 increments on a single row and a single col with various numbers of client threads
> (IncrementBlaster).  Verify we have a count of 250000 after the run (IncrementVerifier).
> Run 1: No fault injection.  5 runs.  count = 250000 on multiple runs.  Correctness verified.
> 1638 inc/s throughput.
> Run 2: flushes table with incrementing row.  count = 246875 != 250000.  Correctness failed.
> 1517 inc/s throughput.
> Run 3: kill of rs hosting incremented row.  count = 243750 != 250000.  Correctness failed.
> 1451 inc/s throughput.
> Run 4: one kill -9 of rs hosting incremented row.  count = 246878 != 250000.  Correctness failed.
> 1395 inc/s throughput (including recovery).
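
To put the failed runs in perspective, the shortfall in each can be worked out from the counts quoted above. The following is only a small arithmetic sketch; the class and method names are made up for illustration and are not part of IncrementBlaster or IncrementVerifier.

```java
import java.util.Locale;

public class IncrementLoss {
    // Expected final count, taken from the test description above.
    static final long EXPECTED = 250_000L;

    // Number of increments missing from the final count.
    static long lost(long observed) {
        return EXPECTED - observed;
    }

    // Missing increments as a percentage of the expected total.
    static double lostPercent(long observed) {
        return 100.0 * lost(observed) / EXPECTED;
    }

    public static void main(String[] args) {
        // Observed counts for the three failing runs, as reported above.
        String[] labels = {"Run 2 (flush)", "Run 3 (kill)", "Run 4 (kill -9)"};
        long[] observed = {246_875L, 243_750L, 246_878L};
        for (int i = 0; i < labels.length; i++) {
            System.out.printf(Locale.ROOT, "%s: lost %d of %d (%.4f%%)%n",
                    labels[i], lost(observed[i]), EXPECTED, lostPercent(observed[i]));
        }
    }
}
```

This gives 3125 lost increments for Run 2 (1.25%), 6250 for Run 3 (2.5%), and 3122 for Run 4 (about 1.2488%).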



--
This message was sent by Atlassian JIRA
(v6.1#6144)
