hbase-dev mailing list archives

From Stack <st...@duboce.net>
Subject Re: HBase scanner semantics and inconsistencies.
Date Tue, 11 Oct 2011 00:03:06 GMT
Keep digging, Jon.
St.Ack

On Mon, Oct 10, 2011 at 3:55 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> Yup, I'm pretty convinced this is a consistency and isolation problem.
> Here's the output of my program (expected 59 rows) and some output from
> the problem rows.
>
> The row has 10 column families ("f0", "f1", ... "f9") with two columns
> each.  There are actually two 'row0000021332' results below -- the one
> labeled 48 and the one labeled 49.  Each came from a separate call to
> next().  Notice how they are split between column families!  Over many
> runs this happens many times, though the split falls between different
> column families each time.
>
> scan items counted: 59 with 0 null ColumnFamilyMaps and 0 nullRows
> scan items counted: 59 with 0 null ColumnFamilyMaps and 0 nullRows
> scan items counted: 59 with 0 null ColumnFamilyMaps and 0 nullRows
> scan items counted: 60 with 0 null ColumnFamilyMaps and 0 nullRows
> Row row0000021332 had time stamps: [48:
> keyvalues={row0000021332/f0:data/1318200437341/Put/vlen=1000,
> row0000021332/f0:qual/1318200437341/Put/vlen=10,
> row0000021332/f1:data/1318200437341/Put/vlen=1000,
> row0000021332/f1:qual/1318200437341/Put/vlen=10,
> row0000021332/f2:data/1318200437341/Put/vlen=1000,
> row0000021332/f2:qual/1318200437341/Put/vlen=10,
> row0000021332/f3:data/1318200437341/Put/vlen=1000,
> row0000021332/f3:qual/1318200437341/Put/vlen=10,
> row0000021332/f4:data/1318200437341/Put/vlen=1000,
> row0000021332/f4:qual/1318200437341/Put/vlen=10,
> row0000021332/f5:data/1318200437341/Put/vlen=1000,
> row0000021332/f5:qual/1318200437341/Put/vlen=10}, 49:
> keyvalues={row0000021332/f6:data/1318200437341/Put/vlen=1000,
> row0000021332/f6:qual/1318200437341/Put/vlen=10,
> row0000021332/f7:data/1318200437341/Put/vlen=1000,
> row0000021332/f7:qual/1318200437341/Put/vlen=10,
> row0000021332/f8:data/1318200437341/Put/vlen=1000,
> row0000021332/f8:qual/1318200437341/Put/vlen=10,
> row0000021332/f9:data/1318200437341/Put/vlen=1000,
> row0000021332/f9:qual/1318200437341/Put/vlen=10}]
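>
> For reference, here's the shape of the counting loop that catches this
> (a sketch -- the table name and scan setup are placeholders, not my
> exact program; classes are the usual org.apache.hadoop.hbase.client.*
> and org.apache.hadoop.hbase.util.Bytes):
>
> ----
> HTable table = new HTable(conf, "t1");
> ResultScanner scanner = table.getScanner(new Scan());
> byte[] prevRow = null;
> int count = 0, splitRows = 0;
> for (Result r : scanner) {
>   count++;
>   // two consecutive next() results with the same rowkey is the
>   // "split row" case shown above
>   if (prevRow != null && Bytes.equals(prevRow, r.getRow())) {
>     splitRows++;
>   }
>   prevRow = r.getRow();
> }
> scanner.close();
> System.out.println("scan items counted: " + count + " with "
>     + splitRows + " split rows");
> ----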
>
> Jon.
>
> On Mon, Oct 10, 2011 at 1:50 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
>
>> I've modified scenario 2's filter to use setFilterIfMissing(true), which
>> I had missed in scenario 1, and now I get an undercount like in that
>> scenario.  In another experiment, I found that getting the latest version
>> via Result.getColumnLatest(..) would return null.
>>
>> This seems like a consistency+isolation violation of the expected
>> row-ACID properties.
>>
>> ----
>> Scan s = new Scan();
>> ...
>> SingleColumnValueFilter filter = new SingleColumnValueFilter(
>>     Bytes.toBytes("f1"), Bytes.toBytes("qual"), CompareOp.EQUAL, value);
>> filter.setFilterIfMissing(true); // this is what drops events.
>> s.setFilter(filter);
>> ----
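>>
>> And here's a sketch of the getColumnLatest() check that comes back null
>> (the surrounding scanner loop is assumed, not my exact program):
>>
>> ----
>> for (Result r : scanner) {
>>   KeyValue kv = r.getColumnLatest(Bytes.toBytes("f1"),
>>       Bytes.toBytes("qual"));
>>   if (kv == null) {
>>     // with row-level atomicity this should be impossible for rows
>>     // that were always written with f1:qual present
>>     nullColumnCount++;
>>   }
>> }
>> ----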
>>
>> Jon.
>>
>> On Mon, Oct 10, 2011 at 12:52 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
>>
>>> I've been working on a problem related to the ACID rules for scans as
>>> defined here: http://hbase.apache.org/acid-semantics.html.
>>>
>>> In both scenarios no new rowkeys are written -- all writes are
>>> "overwrites" of already-existing rows.  So let's say the table has 3
>>> rows with values A1, B1, C1, and a writer overwrites them with A2, B2,
>>> C2 respectively.  I'd expect every scan to return the same number of
>>> rows (3), but with potentially any combination of A1|A2, B1|B2, C1|C2
>>> (since there are no ACID guarantees across rows).
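>>>
>>> For concreteness, each overwrite is (roughly) a single Put carrying all
>>> the row's families -- a sketch, using the family/qualifier layout of my
>>> test table (10 families f0..f9, columns "qual" and "data"), with the
>>> rest assumed:
>>>
>>> ----
>>> Put p = new Put(rowkey);
>>> for (int f = 0; f < 10; f++) {
>>>   // every family's columns travel in the same Put, so row-level
>>>   // atomicity should make the overwrite all-or-nothing
>>>   p.add(Bytes.toBytes("f" + f), Bytes.toBytes("qual"), qualValue);
>>>   p.add(Bytes.toBytes("f" + f), Bytes.toBytes("data"), dataValue);
>>> }
>>> table.put(p);
>>> ----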
>>>
>>> Scenario 1: I have an MR job that does a filtered scan (I've confirmed
>>> the problem happens without the filter as well) over a table, takes
>>> each row, and writes it back to the same row in the same table.  I then
>>> run the job 2-3x concurrently on the same table with the same filter.
>>> I believe every run should report the same number of input records.  It
>>> seems that with a multiple-column-family table this is not the case --
>>> in particular, the MR counters sometimes report *fewer* than the
>>> expected number of input records (but never more).  This seems wrong.
>>> Agree?
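>>>
>>> The job setup is essentially the following (the mapper class and table
>>> name are placeholders):
>>>
>>> ----
>>> Scan scan = new Scan();
>>> scan.setFilter(filter);
>>> // RewriteMapper reads each Result and emits a Put back to the same row
>>> TableMapReduceUtil.initTableMapperJob("t1", scan, RewriteMapper.class,
>>>     ImmutableBytesWritable.class, Put.class, job);
>>> TableMapReduceUtil.initTableReducerJob("t1", IdentityTableReducer.class,
>>>     job);
>>> ----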
>>>
>>> Scenario 2: To reproduce the problem without the MR machinery, I wrote
>>> two programs -- one that does a filtered scan and overwrites the
>>> existing rows, and one that does the same filtered scan but just counts
>>> the number of rows read.  (I've also driven this through the
>>> TableRecordReaderImpl code that the MR TableInputFormat uses.)  In this
>>> case the scan/counting program sometimes returns *too many* entries --
>>> for some rowkeys it returns two records (but never fewer).  This
>>> shouldn't happen either.  Agree?
>>>
>>> I've observed this on 0.90.1+patches and 0.90.3+patches.  It was also
>>> claimed that this issue was not seen in a 0.89.x-based HBase.  Thus far
>>> I've only been able to reproduce it on multiple-column-family tables.
>>>
>>> I believe scenario 1's problem is related to these.  Concur?
>>> https://issues.apache.org/jira/browse/HBASE-2856
>>> https://issues.apache.org/jira/browse/HBASE-3498
>>>
>>> I think scenario 2 is probably related, but I'm not sure it's the same
>>> issue.
>>>
>>> Are there other related JIRAs?
>>>
>>> Any hints on where to hunt this down?  (I'm starting to go into the
>>> scan and write paths on the RegionServer, but this is a fairly large
>>> area...)
>>>
>>> Thanks
>>> Jon.
>>>
>>> --
>>> // Jonathan Hsieh (shay)
>>> // Software Engineer, Cloudera
>>> // jon@cloudera.com
>>>
>>>
>>>
>>
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // jon@cloudera.com
>>
>>
>>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>
