hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: delete operation with timestamp
Date Mon, 28 Nov 2011 23:56:08 GMT
Hi Yi,
The reason is that nothing is ever changed in place in HBase; only new files are created (with
the exception of the WAL, which is appended to,
and some special scenarios such as atomic increments and atomic appends, where older versions of
the cells are removed from the memstore).

That caters very well to the performance characteristics of the underlying distributed file
system (HDFS).


Consequently, deleted rows are not actually deleted right away; we just record the fact that the
rows should no longer be visible and can eventually be removed.
The actual removal happens during the next compaction, when new files are created.
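
As a rough illustration, here is a toy Python model of that behavior. It is not HBase code: the ToyColumn class and its method names are made up for this sketch, and it assumes a delete marker written at timestamp T hides every version with a timestamp at or below T, as in the shell session quoted below.

```python
from typing import List, Optional, Tuple


class ToyColumn:
    """Hypothetical append-only history of one column (not an HBase class)."""

    def __init__(self) -> None:
        self.puts: List[Tuple[int, str]] = []   # (timestamp, value)
        self.delete_markers: List[int] = []     # marker timestamps

    def put(self, ts: int, value: str) -> None:
        # Writes are only ever appended; nothing is changed in place.
        self.puts.append((ts, value))

    def delete_columns(self, ts: int) -> None:
        # A delete only records a marker; the data stays where it is.
        self.delete_markers.append(ts)

    def _masked(self, ts: int) -> bool:
        # Assumption: a marker hides every version at or below its timestamp.
        return any(ts <= marker for marker in self.delete_markers)

    def get(self) -> Optional[str]:
        # Return the newest version that no marker hides, or None.
        visible = [(ts, v) for ts, v in self.puts if not self._masked(ts)]
        return max(visible, key=lambda cell: cell[0])[1] if visible else None

    def compact(self) -> None:
        # Rewrite the data without masked cells, then drop the markers.
        self.puts = [(ts, v) for ts, v in self.puts if not self._masked(ts)]
        self.delete_markers = []


col = ToyColumn()
col.put(100, "old")
col.put(200, "new")
col.delete_columns(200)
hidden_after_delete = col.get()      # both versions are masked

col.put(100, "old")                  # re-insert the older version
hidden_after_reinsert = col.get()    # still masked by the marker

col.compact()                        # marker and masked cells are dropped
col.put(100, "old")
visible_after_compact = col.get()    # the old version is visible again
```

In this model the re-inserted "old" cell stays hidden for as long as the marker exists, and only a put made after compaction becomes visible.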

Sometimes that does lead to unexpected behaviors such as the one you describe below.

In the trunk version of HBase I introduced the ability to perform time-range queries that
can "peek" behind delete markers to retrieve cells that are marked as deleted (HBASE-4536).
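
A toy sketch of that idea in plain Python follows. It is not the HBase API: read_latest and its parameters are hypothetical, and it assumes that a delete marker only takes effect when its own timestamp falls inside the queried half-open range [min, max).

```python
from typing import List, Optional, Tuple


def read_latest(
    puts: List[Tuple[int, str]],
    delete_markers: List[int],
    time_range: Optional[Tuple[int, float]] = None,
) -> Optional[str]:
    """Return the newest visible value, optionally within [min, max)."""
    lo, hi = time_range if time_range else (0, float("inf"))
    # Assumption: markers outside the queried range are ignored.
    active = [m for m in delete_markers if lo <= m < hi]
    visible = [
        (ts, v)
        for ts, v in puts
        if lo <= ts < hi and not any(ts <= m for m in active)
    ]
    return max(visible, key=lambda cell: cell[0])[1] if visible else None


puts = [(100, "old"), (200, "new")]
delete_markers = [200]   # hides every version at or below timestamp 200

normal = read_latest(puts, delete_markers)                    # nothing visible
peek = read_latest(puts, delete_markers, time_range=(0, 200)) # range ends before the marker
```

Here the unrestricted read sees nothing, while the read whose range ends before the marker returns the "old" value.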

-- Lars


----- Original Message -----
From: Yi Liang <whitesky@gmail.com>
To: user@hbase.apache.org
Cc: 
Sent: Thursday, November 24, 2011 10:11 PM
Subject: Re: delete operation with timestamp

Thanks, Daniel, for your explanation. But I'm still curious why HBase is
designed this way; it's unexpected to me.

Also, this behavior of deleteColumns makes the delete operation not very user
friendly. Why not use deleteColumn instead in the hbase shell and thrift client?

Thanks,
Yi

2011/11/24 Daniel Gómez Ferro <danielgf@yahoo-inc.com>

>
> On Nov 24, 2011, at 08:38 , Yi Liang wrote:
>
> > We're using hbase-0.90.3 with thrift client, and have encountered some
> > problems when we want to delete one specific version of a cell.
> >
> > First, there's no corresponding thrift api for
> > Delete#deleteColumn(byte[] family, byte[] qualifier, long timestamp).
> > Instead, deleteColumns is
> > supported in mutateRowTs.  But what we want is deleteColumn as we need to
> > keep the older versions. IMO, we should implement mutateRowTs
> > with deleteColumn, rather than deleteColumns. The hbase shell's delete
> > command has the same problem.
> >
> > Second, we find we can't reinsert any older cell if we have deleted that
> > cell with deleteColumns. For example:
> > hbase(main):007:0> scan 'test3'
> > ROW                                           COLUMN+CELL
> > 0 row(s) in 0.0110 seconds
> >
> > hbase(main):008:0> put 'test3', 'r1', 'f1:c1', 'old', 1315550678308
> > 0 row(s) in 0.0100 seconds
> >
> > hbase(main):009:0> scan 'test3'
> > ROW                                           COLUMN+CELL
> > r1                                           column=f1:c1,
> > timestamp=1315550678308, value=old
> > 1 row(s) in 0.0290 seconds
> >
> > hbase(main):012:0> put 'test3', 'r1', 'f1:c1', 'new'
> > 0 row(s) in 0.0090 seconds
> >
> > hbase(main):013:0> scan 'test3'
> > ROW                                           COLUMN+CELL
> > r1                                           column=f1:c1,
> > timestamp=1322119570316, value=new
> > 1 row(s) in 0.0140 seconds
> >
> > hbase(main):014:0> delete 'test3', 'r1', 'f1:c1', 1322119570316
> > 0 row(s) in 0.0130 seconds
> >
> > hbase(main):015:0> scan 'test3'
> > ROW                                           COLUMN+CELL
> > 0 row(s) in 0.0120 seconds
> >
> > hbase(main):016:0> put 'test3', 'r1', 'f1:c1', 'old', 1315550678308
> > 0 row(s) in 0.0090 seconds
> >
> > hbase(main):017:0> scan 'test3'
> > ROW                                           COLUMN+CELL
> > 0 row(s) in 0.0110 seconds
> >
> > There's no error message when we reinsert the old version, so we think it
> > has succeeded, but actually it has not. It looks like a bug.
> >
> > What's your opinion?
> >
>
> Hi,
>
> The second point is not a bug; it's how HBase is designed. Any delete
> (except deleteColumn) inserts a tombstone marker which masks any older
> value, so even if you later insert an older value, it will be masked by the
> tombstone. You can see some nice examples here:
> http://outerthought.org/blog/417-ot.html
>
> There is also a new feature in trunk that allows you to retrieve masked
> values through a "raw scan" or a get with a timeRange that excludes the
> delete: https://issues.apache.org/jira/browse/HBASE-4536
>
> Daniel
>
> > Thanks,
> > Yi
>
>

