hbase-issues mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3083) Major compaction check should use new timestamp meta information in HFiles (rather than dfs timestamp) along with TTL to allow major even if single file
Date Wed, 20 Oct 2010 23:42:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923236#action_12923236 ]

Jonathan Gray commented on HBASE-3083:
--------------------------------------

Nope.  That jira is about the first check (whether the last major compaction was longer ago
than the configured period).

This jira is about the second check.  Once we determine we want to do a major, we do another
check.  If there is only one file, and that file is the result of a major compaction, we
generally want to skip the major.  _Except_ when there is expired data in that file.

This check is in there, but it currently uses the hdfs timestamp of this single file, which
is not the right timestamp to use.  We want to use the minimum timestamp from the TimestampRange
we're now tracking in hfile metadata to know accurately whether there is expired data in there
(and thus whether we should go ahead with the major compaction).
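
To make the second check concrete, here is a minimal sketch in Java. It is not the actual HBase code: the class name, method name, and parameters are all hypothetical, and it assumes timestamps and the TTL are plain millisecond values. It only illustrates the decision described above, using the oldest KV timestamp from the hfile metadata instead of the hdfs file timestamp.

```java
/** Illustrative sketch only; every name here is hypothetical, not HBase's API. */
final class MajorCompactionCheck {
    private MajorCompactionCheck() {}

    /**
     * Second check, run after the periodic check has already asked for a major.
     *
     * @param fileCount                number of store files in the Store
     * @param onlyFileIsMajorCompacted true if the single file came from a major compaction
     * @param minKvTimestampMs         oldest KV timestamp recorded in the hfile metadata
     * @param ttlMs                    TTL of the Store, in milliseconds
     * @param nowMs                    current time, in milliseconds
     */
    static boolean shouldRunMajor(int fileCount, boolean onlyFileIsMajorCompacted,
                                  long minKvTimestampMs, long ttlMs, long nowMs) {
        if (fileCount == 0) {
            return false; // nothing to compact
        }
        if (fileCount > 1 || !onlyFileIsMajorCompacted) {
            return true; // normal case: go ahead with the major
        }
        // Single file that is already the result of a major compaction:
        // rewriting it only helps if its oldest KV has outlived the TTL.
        return nowMs - minKvTimestampMs > ttlMs;
    }
}
```

For example, with a TTL of 7 days, a single major-compacted file written on day 7 whose oldest KV is from day 2 contains expired data on day 10 (10 - 2 = 8 > 7), so the sketch returns true. Feeding it the file's write time instead (10 - 7 = 3) would skip the compaction, which is the behavior this jira wants to fix.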

> Major compaction check should use new timestamp meta information in HFiles (rather than
> dfs timestamp) along with TTL to allow major even if single file
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-3083
>                 URL: https://issues.apache.org/jira/browse/HBASE-3083
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Assignee: Prakash Khemani
>             Fix For: 0.90.0
>
>
> Periodic major compactions have a separate set of checks prior to submitting the compaction
> request.
> Currently, if there is a single file, and it is the result of a major compaction, then
> it is skipped.  However, there is a check that will still allow it if the timestamp of the
> file is older than the TTL of that Store.
> This is not ideal because the timestamp of the file is the latest timestamp in the file
> rather than the oldest.  Meta information was introduced to HFiles that stores the max/min
> timestamp of KVs in the file.  We should use the min timestamp from that meta info rather
> than the file timestamp itself.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

