hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10854) Multiple Row/VisibilityLabels visible while in the memstore
Date Fri, 28 Mar 2014 13:03:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950663#comment-13950663 ]

Anoop Sam John commented on HBASE-10854:
----------------------------------------

This is not limited to MemStore items.  Consider a cell (with a label) being written,
followed by a flush, so there is one cell in that HFile.  A different version of the same
cell is then written (with a different label) and flushed as well. Now there are 2 cells in
2 HFiles, provided no compaction has happened. The scenario described here can occur in
that state too.  After a compaction the behaviour will change.
By default the max versions for a CF is 1, so flushes and compactions make sure to write
only 1 cell version in these cases.
During a scan, even if we specify a max version count in the Scan, what we use is the min of
the two version numbers, which comes out as 1 here.
The visibility-based evaluation and cell filtering happen at the Filter level, while the
filtering based on the max number of versions happens at a layer above it, after this
filtering (in SQM, the ScanQueryMatcher).
So to fix this problem, we have to take the min version number used in SQM into account at
the lower layers as well (the readers).
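
The difference between the two orderings can be illustrated with a toy model. This is only a sketch and not HBase code: the names `Cell`, `is_visible`, `scan_unflushed` and `scan_after_flush` are hypothetical, the timestamps 1 to 4 are placeholders, the expression evaluator handles only flat `A`, `A|B`, `A&B` style expressions, and testuser is assumed to hold no authorizations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    ts: int                    # placeholder timestamp, larger means newer
    value: str
    visibility: Optional[str]  # None means the cell carries no label


def is_visible(expr, auths):
    """Toy evaluator: an OR of AND-terms, no parentheses or negation."""
    if expr is None:
        return True
    return any(all(label in auths for label in term.split("&"))
               for term in expr.split("|"))


def newest(cells, max_versions):
    """Keep the max_versions newest cells, newest first."""
    return sorted(cells, key=lambda c: c.ts, reverse=True)[:max_versions]


def scan_unflushed(cells, auths, max_versions=1):
    # All versions are still present: the visibility filter runs first,
    # and the version limit is applied to whatever survives it.
    visible = [c for c in cells if is_visible(c.visibility, auths)]
    return newest(visible, max_versions)


def scan_after_flush(cells, auths, max_versions=1):
    # The flush (or compaction) already kept only the newest versions,
    # so the visibility filter sees nothing older.
    kept = newest(cells, max_versions)
    return [c for c in kept if is_visible(c.visibility, auths)]


# The four puts from the issue description, oldest first.
cells = [
    Cell(1, "v1", "A"),
    Cell(2, "v1all", None),
    Cell(3, "v1aOrB", "A|B"),
    Cell(4, "v1aAndB", "A&B"),
]

testuser_auths = set()  # assumption: testuser has neither A nor B

print([c.value for c in scan_unflushed(cells, testuser_auths)])    # ['v1all']
print([c.value for c in scan_after_flush(cells, testuser_auths)])  # []
```

In this model the older unlabeled cell is returned only while every version is still available to the filter; once the version limit has been applied below the filter, the same user gets nothing, which matches the 1 row versus 0 row results in the issue description.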

bq. Second, we should agree on what is the correct behavior for schemas supporting multiple
versions, with multiple cell versions with differing visibility expressions among the versions

IMO in this case we have to consider all cells and return the versions whose visibility
expression allows viewing by the user.



> Multiple Row/VisibilityLabels visible while in the memstore
> -----------------------------------------------------------
>
>                 Key: HBASE-10854
>                 URL: https://issues.apache.org/jira/browse/HBASE-10854
>             Project: HBase
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.98.1
>            Reporter: Matteo Bertozzi
>
> If we update the row multiple times with different visibility labels
> we are able to get the "old version" of the row until it is flushed
> {code}
> $ sudo -u hbase hbase shell
> hbase> add_labels 'A'
> hbase> add_labels 'B'
> hbase> create 'tb', 'f1'
> hbase> put 'tb', 'row', 'f1:q', 'v1', {VISIBILITY=>'A'}
> hbase> put 'tb', 'row', 'f1:q', 'v1all'
> hbase> put 'tb', 'row', 'f1:q', 'v1aOrB', {VISIBILITY=>'A|B'}
> hbase> put 'tb', 'row', 'f1:q', 'v1aAndB', {VISIBILITY=>'A&B'}
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168154, value=v1aAndB
> 1 row
> $ sudo -u testuser hbase shell
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168102, value=v1all
> 1 row
> {code}
> When we flush the memstore we get a single row (the last one inserted)
> so the testuser gets 0 rows now.
> {code}
> $ sudo -u hbase hbase shell
> hbase> flush 'tb'
> hbase> scan 'tb'
> row column=f1:q, timestamp=1395948168154, value=v1aAndB
> 1 row
> $ sudo -u testuser hbase shell
> hbase> scan 'tb'
> 0 row
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
