hbase-user mailing list archives

From yonghu <yongyong...@gmail.com>
Subject Re: What happened in hlog if data are deleted caused by ttl?
Date Wed, 22 Aug 2012 08:35:07 GMT
Another interesting point is that the TTL data will not exist in the
HFile. I ran the following test:

hbase(main):003:0> create 'test',{TTL=>'200',NAME=>'course'}
0 row(s) in 1.1420 seconds

hbase(main):005:0> put 'test','tom','course:english',90
0 row(s) in 0.0320 seconds

hbase(main):006:0> flush 'test'
0 row(s) in 0.1680 seconds

hbase(main):007:0> scan 'test'
ROW                   COLUMN+CELL
 tom                  column=course:english, timestamp=1345623867082, value=90
1 row(s) in 0.0350 seconds

./hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f
Scanning -> /hbase/test/abe4d5adaa650cdd46d26dca0bf85b72/course/8c77fb321f934592869f9852f777b22e
12/08/22 10:27:39 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 247.9m
Scanned kv count -> 1

So I guess the TTL data is only managed in the memstore. But then the
question is: what happens if the memstore does not have enough room to
accept new incoming TTL data? Can anybody explain?
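
For reference, here is a minimal Python sketch of the expiry rule Harsh J
describes in the quoted reply further down. It is a toy model, not HBase
code: the Cell class, the function names, and the exact boundary condition
(age strictly greater than the TTL) are assumptions made for illustration.
It uses the timestamp and TTL from the test above to show why no delete
record is needed: the same age check is applied on scan, on log replay,
and on compaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for a stored key-value."""
    row: str
    column: str
    timestamp_ms: int  # write time stored with the cell
    value: bytes


def is_expired(cell, ttl_seconds, now_ms):
    # Assumed rule: a cell expires once its age exceeds the family TTL.
    return now_ms - cell.timestamp_ms > ttl_seconds * 1000


def scan(cells, ttl_seconds, now_ms):
    # Read path: expired cells are skipped; the input is not modified.
    return [c for c in cells if not is_expired(c, ttl_seconds, now_ms)]


def replay_log(log_entries, ttl_seconds, now_ms):
    # Recovery path: the log holds only puts; expired ones are dropped
    # by the same check, so no delete record is required.
    return [c for c in log_entries if not is_expired(c, ttl_seconds, now_ms)]


def compact(cells, ttl_seconds, now_ms):
    # Compaction: the rewritten file simply omits expired cells.
    return [c for c in cells if not is_expired(c, ttl_seconds, now_ms)]


ts = 1345623867082  # timestamp from the scan output above
store_file = [Cell("tom", "course:english", ts, b"90")]

print(len(scan(store_file, 200, ts + 100_000)))  # 1: still within the TTL
print(len(scan(store_file, 200, ts + 201_000)))  # 0: skipped on read
print(len(store_file))                           # 1: still stored until compaction
```

In this model an expired cell can still be physically present in a store
file while being invisible to scans; it only disappears when the file is
rewritten.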


On Wed, Aug 22, 2012 at 10:19 AM, yonghu <yongyong313@gmail.com> wrote:
> I can fully understand normal deletion. But, from my point of view, TTL
> deletion is different from normal deletion. The insertion of TTL
> data is recorded in the hlog, but the TTL deletion is not recorded in
> the hlog. So, if a failure occurs, should the TTL data be reinserted,
> or can we discard that TTL data? Moreover, TTL deletion is not
> executed at data compaction time. The scanner needs to periodically scan
> each store file to execute the deletion.
> regards!
> Yong
> On Tue, Aug 21, 2012 at 5:29 PM, jmozah <jmozah@gmail.com> wrote:
>> This helped me http://hadoop-hbase.blogspot.in/2011/12/deletion-in-hbase.html
>> ./Zahoor
>> HBase Musings
>> On 14-Aug-2012, at 6:54 PM, Harsh J <harsh@cloudera.com> wrote:
>>> Hi Yonghu,
>>> A timestamp is stored along with each insert. The ttl is maintained at
>>> the region-store level. Hence, when the log replays, all entries with
>>> expired TTLs are automatically omitted.
>>> Also, TTL deletions happen during compactions, and hence do not
>>> carry/need Delete events. When scanning a store file, TTL-expired
>>> entries are automatically skipped.
>>> On Tue, Aug 14, 2012 at 3:34 PM, yonghu <yongyong313@gmail.com> wrote:
>>>> My HBase version is 0.92. I tried the following:
>>>> 1. Created a table 'test' with a column family 'course' with ttl=5.
>>>> 2. Inserted one row into the table. 5 seconds later, the row was deleted.
>>>> Later, when I checked the log info of the 'test' table, I only found the
>>>> insert information but no delete information.
>>>> Can anyone tell me which information is written into the hlog when data
>>>> is deleted by TTL, or whether in this situation no information is
>>>> written into the hlog? If there is no deletion information in the log,
>>>> how can we guarantee that the data recovered from the log are correct?
>>>> Thanks!
>>>> Yong
>>> --
>>> Harsh J
