hbase-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: What happened in hlog if data are deleted cuased by ttl?
Date Wed, 22 Aug 2012 08:49:33 GMT
Hey Yonghu,

You are right that TTL "deletions" (it isn't exactly a delete; it's
more of a compact-time skip wizardry) do not go to the HLog as
"events". Know that TTLs aren't applied "per-cell", they are applied
on the whole CF globally. So there is no such thing as a TTL-write or
a TTL-delete event. In fact, the Region-level Coprocessor too has no
hooks for "TTL-events", as seen at
for this doesn't happen on triggers.
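
To make the point concrete, here is a toy model in plain Python. It is
not HBase code: the names (Put, replay, visible, cf_ttl) and the exact
expiry boundary are assumptions made up for illustration. It only shows
why the log needs no TTL event: the log holds puts with timestamps, and
expiry is derived from the column family's TTL whenever the data is read.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Put:
    row: str
    value: str
    ts: int  # write timestamp, in seconds

def replay(wal):
    """Re-apply every logged put; the log contains no TTL records."""
    return list(wal)

def visible(cells, cf_ttl, now):
    """Hide cells older than the column family's TTL (boundary is assumed)."""
    cutoff = now - cf_ttl
    return [c for c in cells if c.ts >= cutoff]

wal = [Put("r1", "a", ts=100), Put("r2", "b", ts=104)]
recovered = replay(wal)
live = visible(recovered, cf_ttl=5, now=106)
print([c.row for c in live])
```

In this sketch the recovered data is correct without any delete record,
because the expired put is filtered by its own timestamp.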

What you say about the compaction part is wrong, however. Compaction
runs a regular store-file scanner to compact, and the regular Scan
operation uses one to read (both share the same file-scanning
mechanism/code). So there's no difference in how a compaction or a
client scan handles TTL-expired row values when reading a store file.
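
As a rough illustration of that shared path, here is a small Python
sketch. It is a toy model, not HBase's implementation: the function
names, the tuple layout (ts, row, value) and the expiry boundary are all
assumptions. Both the read and the compaction go through the same
expiry filter; the only difference is that the compaction writes a new
file, while a scan leaves the store file as it is.

```python
def unexpired(cells, ttl, now):
    """Shared filter: yield only cells that are still within the TTL."""
    cutoff = now - ttl
    for ts, row, value in cells:
        if ts >= cutoff:
            yield (ts, row, value)

def scan(store_file, ttl, now):
    """Client read: expired cells are skipped, the file is not modified."""
    return list(unexpired(store_file, ttl, now))

def compact(store_files, ttl, now):
    """Rewrite several files into one, using the same filter as scan."""
    merged = [c for f in store_files for c in unexpired(f, ttl, now)]
    # Order by row, newest timestamp first within a row.
    return sorted(merged, key=lambda c: (c[1], -c[0]))

file_a = [(100, "r1", "a"), (104, "r2", "b")]
file_b = [(105, "r1", "c")]
print(scan(file_a, ttl=5, now=106))
print(compact([file_a, file_b], ttl=5, now=106))
```

Note that in this model file_a still physically contains the expired
cell after the scan; it only disappears from the compacted output.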

I am also not able to understand what your sample shell command list
shows. As I see it, it shows that the HFile did have the entry in it
after you had flushed it. Note that you mentioned the TTL at the CF
level when creating the table, not in your "put" statement, and this
is a vital point in understanding how TTLs work.

On Wed, Aug 22, 2012 at 1:49 PM, yonghu <yongyong313@gmail.com> wrote:
> I can fully understand normal deletion. But, in my point of view, ttl
> deletion is different than the normal deletion. The insertion of ttl
> data is recorded in hlog. But the ttl deletion is not recorded by
> hlog. So, if a failure occurs, should the ttl data be reinserted to data
> or can we discard the certain ttl data? Moreover, ttl deletion is not
> executed at data compaction time. Scanner needs to periodically scan
> each Store file to execute deletion.
> regards!
> Yong
> On Tue, Aug 21, 2012 at 5:29 PM, jmozah <jmozah@gmail.com> wrote:
>> This helped me http://hadoop-hbase.blogspot.in/2011/12/deletion-in-hbase.html
>> ./Zahoor
>> HBase Musings
>> On 14-Aug-2012, at 6:54 PM, Harsh J <harsh@cloudera.com> wrote:
>>> Hi Yonghu,
>>> A timestamp is stored along with each insert. The ttl is maintained at
>>> the region-store level. Hence, when the log replays, all entries with
>>> expired TTLs are automatically omitted.
>>> Also, TTL deletions happen during compactions, and hence do not
>>> carry/need Delete events. When scanning a store file, TTL-expired
>>> entries are automatically skipped away.
>>> On Tue, Aug 14, 2012 at 3:34 PM, yonghu <yongyong313@gmail.com> wrote:
>>>> My hbase version is 0.92. I tried something as follows:
>>>> 1.Created a table 'test' with 'course' in which ttl=5.
>>>> 2. inserted one row into the table. 5 seconds later, the row was deleted.
>>>> Later when I checked the log info of 'test' table, I only found the
>>>> inserted information but not deleted information.
>>>> Can anyone tell me which information is written into hlog when data is
>>>> deleted by ttl or in this situation, no information is written into
>>>> the hlog. If there is no information of deletion in the log, how can
>>>> we guarantee the data recovered by log are correct?
>>>> Thanks!
>>>> Yong
>>> --
>>> Harsh J

Harsh J
