hbase-dev mailing list archives

From Guanghao Zhang <zghao...@gmail.com>
Subject Re: Inconsistent behavior of scan with filter and maxVersions?
Date Tue, 28 Feb 2017 01:37:45 GMT
There is an issue, HBASE-17125, about this inconsistency, but we haven't
reached a final solution. You can take a look at the discussion there. Thanks.

2017-02-28 9:26 GMT+08:00 Huaxiang Sun <hsun@cloudera.com>:

> Hi HBase Devs,
>
>     Nicolae Popa found inconsistent behavior when scanning with a
> filter on a column family that has maxVersions configured.
>     Let's start with an example.
>
> hbase(main):001:0> create 't1', {NAME => 'f1', VERSIONS => 1}
> hbase(main):002:0> put 't1', 'r1', 'f1:q1', 'a'
> hbase(main):003:0> put 't1', 'r1', 'f1:q1', 'b'
>
> // There are two versions for r1, f1:q1
>
> hbase(main):004:0> scan 't1'
> ROW                                                  COLUMN+CELL
>  r1                                                  column=f1:q1, timestamp=1488244089712, value=b
> 1 row(s)
>
> // Scan with value filter ‘a’ returns the cell for ‘a’, even though
> maxVersions is configured to be 1
> hbase(main):006:0> scan 't1', {FILTER => "ValueFilter(=,'binary:a')"}
> ROW                                                  COLUMN+CELL
>  r1                                                  column=f1:q1, timestamp=1488244087738, value=a
> 1 row(s)
> hbase(main):007:0> scan 't1', {FILTER => "ValueFilter(=,'binary:b')"}
> ROW                                                  COLUMN+CELL
>  r1                                                  column=f1:q1, timestamp=1488244089712, value=b
> 1 row(s)
>
> // After flush and major compaction, the older version is deleted from
> the HFile.
> hbase(main):011:0> flush 't1'
> hbase(main):012:0> major_compact 't1'
> hbase(main):013:0> scan 't1', {FILTER => "ValueFilter(=,'binary:b')"}
> ROW                                                  COLUMN+CELL
>  r1                                                  column=f1:q1, timestamp=1488244089712, value=b
> 1 row(s)
>
> // Scan with value filter ‘a’ returns nothing now.
> hbase(main):014:0> scan 't1', {FILTER => "ValueFilter(=,'binary:a')"}
> ROW                                                  COLUMN+CELL
> 0 row(s)
> hbase(main):015:0>
>
> In the above example, the scan result for ValueFilter ‘a’ is inconsistent
> across flush and major compaction. The reason is that when the filter returns
> SKIP, the version count is not increased, so the older version is treated as
> the latest version.
>
> Is this the expected behavior? When maxVersions is specified in the HCD, is
> the user supposed to see only the latest maxVersions versions, or can the
> result be affected by filters? This is not a raw scan in this example.
>
> Thanks,
> Huaxiang Sun
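
To make the mechanism described above easier to follow, here is a minimal toy model in Python. It is not HBase code: the functions `scan` and `major_compact` and the `filter_before_count` flag are hypothetical names made up for this sketch. It only assumes what the quoted message states, namely that a cell skipped by the filter does not increase the version count, and that major compaction drops versions beyond maxVersions.

```python
# Toy model of one column (r1, f1:q1). NOT HBase code; all names are
# hypothetical and exist only to illustrate the behavior in this thread.

def scan(cells, max_versions, value_filter=None, filter_before_count=True):
    """Return the (timestamp, value) cells a scan would show for one column.

    cells: list of (timestamp, value) tuples in any order.
    filter_before_count=True models the behavior reported in the thread:
    a cell the filter skips does not consume a version slot.
    filter_before_count=False models counting versions first, then filtering.
    """
    result = []
    counted = 0
    # Newest version first, as a scan would see them.
    for ts, value in sorted(cells, reverse=True):
        matches = value_filter is None or value_filter(value)
        if filter_before_count:
            if not matches:
                continue  # SKIP: version count is not increased
            if counted >= max_versions:
                break
            counted += 1
            result.append((ts, value))
        else:
            if counted >= max_versions:
                break
            counted += 1  # every version counts, matching or not
            if matches:
                result.append((ts, value))
    return result


def major_compact(cells, max_versions):
    """Keep only the newest max_versions cells, as compaction would."""
    return sorted(cells, reverse=True)[:max_versions]


def is_a(value):
    return value == "a"


# The two puts from the shell session above (VERSIONS => 1).
cells = [(1488244087738, "a"), (1488244089712, "b")]

before = scan(cells, 1, is_a)                       # older cell 'a' leaks through
after = scan(major_compact(cells, 1), 1, is_a)      # nothing left to match
strict = scan(cells, 1, is_a, filter_before_count=False)  # consistent: empty
```

In this model, `before` contains the older cell while `after` is empty, which mirrors the inconsistency in the shell session. Counting versions before applying the filter (`filter_before_count=False`) gives the same empty result before and after compaction; whether that is the right fix for HBase itself is what HBASE-17125 discusses.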
