hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: How to config hbase0.94.2 to retain deleted data
Date Mon, 22 Oct 2012 01:56:01 GMT
Lars, 

As with secondary indexes, doing remote updates to other region servers isn't necessarily
a bad thing.

There are ways to mitigate some of the costs of the update to the second table. I mean the
actual update doesn't have to be synchronous.
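
To illustrate what an asynchronous update could look like, here is a minimal sketch in plain Java. It is only a toy model, not HBase code: the class name AsyncHistoryWriter, the methods recordDeleted and lookup, and the in-memory map standing in for the second table are all hypothetical. A real coprocessor would replace the map with writes to the second table.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: queues updates for a second table and applies them
// on a background thread, so the caller never waits for the remote write.
final class AsyncHistoryWriter implements AutoCloseable {
    // Stand-in for the remote second table (assumption: a real version
    // would write to an HBase table here instead of a map).
    private final Map<String, String> historyTable = new ConcurrentHashMap<>();
    private final BlockingQueue<String[]> pending = new LinkedBlockingQueue<>();
    private final Thread worker;
    private volatile boolean running = true;

    AsyncHistoryWriter() {
        worker = new Thread(this::drain, "history-writer");
        worker.setDaemon(true);
        worker.start();
    }

    // Called on the write path: enqueue the update and return immediately.
    void recordDeleted(String rowKey, String value) {
        pending.add(new String[] {rowKey, value});
    }

    String lookup(String rowKey) {
        return historyTable.get(rowKey);
    }

    // Background loop: keeps applying updates until close() has been
    // requested and the queue is empty.
    private void drain() {
        try {
            while (running || !pending.isEmpty()) {
                String[] update = pending.poll(50, TimeUnit.MILLISECONDS);
                if (update != null) {
                    historyTable.put(update[0], update[1]);
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Stops accepting the "running" state and waits for queued updates to land.
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncHistoryWriter writer = new AsyncHistoryWriter();
        writer.recordDeleted("key1", "v1");
        writer.close();
        System.out.println(writer.lookup("key1"));
    }
}
```

The write path only pays for an enqueue. The trade-off is that the second table lags behind the first, and queued updates are lost if the process dies before they are drained.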

HTH

-Mike

On Oct 21, 2012, at 7:23 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> That'd work too. Requires the regionservers to make remote updates to other regionservers,
> though. And you have to trap each and every change (Put, Delete, Increment, Append, RowMutations,
> etc.)
> 
> 
> Curious, why do you think this is better than using the keep-deleted-cells feature?
> (It might well be, just curious)
> 
> 
> -- Lars
> 
> 
> 
> ----- Original Message -----
> From: Michael Segel <michael_segel@hotmail.com>
> To: user@hbase.apache.org
> Cc: 
> Sent: Sunday, October 21, 2012 4:34 PM
> Subject: Re: How to config hbase0.94.2 to retain deleted data
> 
> I would suggest that you use your coprocessor to copy the data to a 'backup' table when
> you mark it for delete.
> Then as major compaction hits, the rows are deleted from the main table, but still reside
> undeleted in your backup table.
> Call it a history table. 
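
The copy-on-delete idea quoted above can be sketched as a toy model in plain Java. Nothing here is HBase API: the class HistoryTableModel, its methods, and the maps standing in for the two tables are hypothetical, and the model ignores timestamps and versions entirely.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model: a delete first copies the row to a history table,
// then places a delete marker; a major compaction later drops marked rows.
final class HistoryTableModel {
    private final Map<String, String> mainTable = new TreeMap<>();
    private final Set<String> deleteMarkers = new HashSet<>();
    private final Map<String, String> historyTable = new TreeMap<>();

    void put(String row, String value) {
        mainTable.put(row, value);
        deleteMarkers.remove(row); // simplification: a new put makes the row visible again
    }

    // Copy the current value to the history table, then mark the row deleted.
    void delete(String row) {
        String current = mainTable.get(row);
        if (current != null) {
            historyTable.put(row, current);
            deleteMarkers.add(row);
        }
    }

    // Normal read: rows with a delete marker are hidden.
    String get(String row) {
        return deleteMarkers.contains(row) ? null : mainTable.get(row);
    }

    // True while the row is still stored in the main table, marked or not.
    boolean physicallyPresent(String row) {
        return mainTable.containsKey(row);
    }

    // Major compaction: marked rows and their markers are removed for good.
    void majorCompact() {
        for (String row : deleteMarkers) {
            mainTable.remove(row);
        }
        deleteMarkers.clear();
    }

    String getHistory(String row) {
        return historyTable.get(row);
    }

    public static void main(String[] args) {
        HistoryTableModel model = new HistoryTableModel();
        model.put("key1", "v1");
        model.delete("key1");
        model.majorCompact();
        System.out.println("main: " + model.get("key1"));
        System.out.println("history: " + model.getHistory("key1"));
    }
}
```

After the compaction the main table no longer holds the row, while the history table still does.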
> 
> 
> On Oct 21, 2012, at 3:53 PM, yun peng <pengyunmomo@gmail.com> wrote:
> 
>> Hi, All,
>> I want to retain all deleted key-value pairs in HBase. I have tried to
>> configure the HColumnDescriptor as follows so that deleted cells are retained.
>> 
>>   public void postOpen(ObserverContext<RegionCoprocessorEnvironment> e) {
>>     HTableDescriptor htd = e.getEnvironment().getRegion().getTableDesc();
>>     HColumnDescriptor hcd = htd.getFamily(Bytes.toBytes("cf"));
>>     hcd.setKeepDeletedCells(true);
>>     hcd.setBlockCacheEnabled(false);
>>   }
>> 
>> However, it does not work for me: when I issue a delete and then query
>> by an older timestamp, the old data does not show up.
>> 
>> hbase(main):119:0> put 'usertable', "key1", 'cf:c1', "v1", 99
>> hbase(main):120:0> put 'usertable', "key1", 'cf:c1', "v2", 101
>> hbase(main):121:0> delete 'usertable', "key1", 'cf:c1', 100
>> hbase(main):122:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP => 99, VERSIONS => 4}
>> COLUMN                CELL
>> 
>> 0 row(s) in 0.0040 seconds
>> 
>> hbase(main):123:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP => 100, VERSIONS => 4}
>> COLUMN                CELL
>> 
>> 0 row(s) in 0.0050 seconds
>> 
>> hbase(main):124:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP => 101, VERSIONS => 4}
>> COLUMN                CELL
>> 
>> cf:c1                timestamp=101, value=v2
>> 
>> 1 row(s) in 0.0050 seconds
>> 
>> Note this is a new feature in 0.94.2
>> (HBASE-4536<https://issues.apache.org/jira/browse/HBASE-4536>),
>> I did not find much sample code online. Does anyone here have
>> experience using HBASE-4536? How should one configure
>> HBase to enable this feature?
>> 
>> Thanks
>> Yun
> 
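
For the shell session quoted above, the behaviour one would expect from keep-deleted-cells, as far as I understand the feature, is that a deleted cell stays readable by a get or scan whose time range ends before the timestamp of the delete marker. The sketch below is a toy model of that rule in plain Java, not HBase code; the class KeepDeletedCellsModel and its methods are hypothetical. It assumes that a get with TIMESTAMP => t reads the time range [t, t+1) and that the shell delete at timestamp 100 masks every version with a timestamp of 100 or lower.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-column model of delete markers and time-range reads.
final class KeepDeletedCellsModel {
    private static final class Cell {
        final long timestamp;
        final String value;

        Cell(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    private final boolean keepDeletedCells;
    private final List<Cell> cells = new ArrayList<>();
    private final List<Long> deleteMarkers = new ArrayList<>();

    KeepDeletedCellsModel(boolean keepDeletedCells) {
        this.keepDeletedCells = keepDeletedCells;
    }

    void put(long timestamp, String value) {
        cells.add(new Cell(timestamp, value));
    }

    // Column delete marker: covers all versions with timestamp <= the marker.
    void delete(long timestamp) {
        deleteMarkers.add(timestamp);
    }

    // Newest visible value in the half-open time range [minTs, maxTs), or null.
    String get(long minTs, long maxTs) {
        Cell newest = null;
        for (Cell c : cells) {
            boolean inRange = c.timestamp >= minTs && c.timestamp < maxTs;
            if (inRange && !isMasked(c, maxTs)
                    && (newest == null || c.timestamp > newest.timestamp)) {
                newest = c;
            }
        }
        return newest == null ? null : newest.value;
    }

    // With keepDeletedCells, a marker at or after the end of the time range is ignored.
    private boolean isMasked(Cell c, long maxTs) {
        for (long marker : deleteMarkers) {
            boolean covers = marker >= c.timestamp;
            boolean honoured = !keepDeletedCells || marker < maxTs;
            if (covers && honoured) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        KeepDeletedCellsModel model = new KeepDeletedCellsModel(true);
        model.put(99, "v1");
        model.put(101, "v2");
        model.delete(100);
        System.out.println(model.get(99, 100));
        System.out.println(model.get(101, 102));
    }
}
```

Under these assumptions, the get at timestamp 99 would return v1 when the feature is in effect and nothing when it is not, which matches the empty result in the quoted session.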

