hbase-user mailing list archives

From "Billy Pearson" <sa...@pearsonwholesale.com>
Subject Re: Deletes in HBase
Date Wed, 20 Aug 2008 18:27:09 GMT
When you delete a cell, a delete record is inserted with the same timestamp,
so when the compaction happens the cell will be deleted.

When you insert a second value to the same row/column, the timestamp should
be different, so that value is not deleted.

From what I understand, we read from the memcache and then from the newest
written HStore files until we get what we need to answer the query,
but I could be wrong here.
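
A rough sketch of that behaviour, as a toy in-memory model and not actual HBase code. Every name in it (TombstoneSketch, Entry, put, delete, flush, get) is made up for illustration. It also assumes that a delete marker hides any value whose timestamp is not newer than the marker's; the real rules in HBase may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of delete markers. Not HBase code; all names are hypothetical. */
class TombstoneSketch {

    /** One stored record: a value, or a delete marker when value is null. */
    static final class Entry {
        final String key;
        final long timestamp;
        final String value;

        Entry(String key, long timestamp, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.value = value;
        }

        boolean isDeleteMarker() {
            return value == null;
        }
    }

    // stores.get(0) plays the role of the memcache; the remaining
    // elements play the role of flushed store files, newest first.
    private final List<List<Entry>> stores = new ArrayList<>();

    TombstoneSketch() {
        stores.add(new ArrayList<>());
    }

    void put(String key, long timestamp, String value) {
        stores.get(0).add(new Entry(key, timestamp, value));
    }

    /** A delete is just another record; nothing is removed here. */
    void delete(String key, long timestamp) {
        stores.get(0).add(new Entry(key, timestamp, null));
    }

    /** Simulates a cache flush: the current memcache becomes the newest file. */
    void flush() {
        stores.add(0, new ArrayList<>());
    }

    /**
     * Reads the memcache first, then the files from newest to oldest.
     * For simplicity this scans everything instead of stopping early.
     * Returns the newest value not hidden by a delete marker, or null.
     */
    String get(String key) {
        long newestDelete = Long.MIN_VALUE;
        Entry newestValue = null;
        for (List<Entry> store : stores) {
            for (Entry e : store) {
                if (!e.key.equals(key)) {
                    continue;
                }
                if (e.isDeleteMarker()) {
                    newestDelete = Math.max(newestDelete, e.timestamp);
                } else if (newestValue == null || e.timestamp > newestValue.timestamp) {
                    newestValue = e;
                }
            }
        }
        // Assumption: a marker hides values with timestamp <= its own.
        if (newestValue == null || newestValue.timestamp <= newestDelete) {
            return null;
        }
        return newestValue.value;
    }
}
```

In this model, a put at timestamp 100 followed by a delete at timestamp 100 makes get return null, and a later put at timestamp 200 is visible again because its timestamp is newer than the marker's. The old value at timestamp 100 stays hidden.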


"John Ryan" <john.reliance.ryan@gmail.com> 
wrote in message 
> Ok. Now when I delete a key and then at some later point re-insert the same
> key with a different value would the old values be resuscitated? If not how
> is this enforced? I am looking through the code but like you said it is
> complicated :).
> On Mon, Aug 18, 2008 at 6:02 PM, Jim Kellerman <jim@powerset.com> wrote:
>> Comments inline:
>> > -----Original Message-----
>> > From: John Ryan [mailto:john.reliance.ryan@gmail.com]
>> > Sent: Monday, August 18, 2008 4:49 PM
>> > To: hbase-user@hadoop.apache.org
>> > Subject: Deletes in HBase
>> >
>> > How do deletes work in HBase? Suppose I have 2 Column Families and a key
>> > that has entries for both column families. Now I want to delete this key.
>> You mean you want to delete all the values for that row key?
>> See HTable.deleteAll({byte[]|String})
>> > Is a major compaction absolutely essential for this key to be deleted?
>> No. Essentially a record is written that indicates that a cell, row, or
>> column family has been deleted.
>> > Where can I follow the code for this operation?
>> All the following paths are prefixed with org.apache.hadoop.hbase:
>> client.HTable - eventually creates a Java Proxy which has the api specified
>> by ipc.HRegionInterface
>> which figures out which region server to send the message to. This call
>> will be answered by regionserver.HRegionServer.deleteAll which calls
>> regionserver.HRegion.deleteAll for the appropriate region, calling
>> HRegion.deleteMultiple, HRegion.update which first appends the change to
>> the HLog by calling regionserver.HLog.append, and then stores the
>> information in the HStore(s) for the appropriate families by calling
>> regionserver.HStore.add, which in turn stores it in the memcache for the
>> HStore by calling regionserver.Memcache.add, which calls
>> regionserver.Memcache.add
>> Now the change has been persisted to the redo log (HLog) and is cached.
>> When the cache fills, a cache flush will write the contents of the cache
>> out to disk and may result in a minor compaction.
>> > Now I am assuming that major compaction doesn't take place all the time
>> > since it may be an expensive operation. Having said that how are the
>> > reads for this key suppressed? Please explain.
>> Reads are suppressed at the level of HStore and Memcache. They come across
>> the delete markers and suppress the results that would otherwise have been
>> returned.
>> You would have to follow the call tree for get, getRow, and the various
>> scanner.next methods to see how this works. It is very complicated.
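
The write path quoted above (append the change to the HLog first, then add it to the memcache, and flush the cache to disk when it fills) can be sketched roughly as follows. This is a toy model only: WritePathSketch and its methods are hypothetical names, the lists merely stand in for HLog, Memcache and the on-disk store files, and the flush threshold is an invented parameter.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the log-then-cache write path. Not HBase code. */
class WritePathSketch {
    private final List<String> log = new ArrayList<>();            // stands in for the HLog
    private List<String> memcache = new ArrayList<>();             // stands in for the Memcache
    private final List<List<String>> files = new ArrayList<>();    // stands in for store files, newest first
    private final int flushThreshold;                              // hypothetical "cache is full" limit

    WritePathSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    /** Applies one edit (a put or a delete marker, both are just records here). */
    void update(String edit) {
        log.add(edit);        // 1. persist the change to the redo log first
        memcache.add(edit);   // 2. then keep it in the in-memory cache
        if (memcache.size() >= flushThreshold) {
            flush();          // 3. when the cache fills, write it out
        }
    }

    /** The cache contents become the newest file and the cache starts empty. */
    private void flush() {
        files.add(0, memcache);
        memcache = new ArrayList<>();
    }

    int logSize() {
        return log.size();
    }

    int memcacheSize() {
        return memcache.size();
    }

    int fileCount() {
        return files.size();
    }
}
```

With a threshold of 2, the first update leaves one record in the log and one in the cache. The second update triggers a flush, so the cache is empty again and one file exists, while the log still holds both records.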
