hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15181) A simple implementation of date based tiered compaction
Date Thu, 04 Feb 2016 03:45:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15131670#comment-15131670 ]

Ted Yu commented on HBASE-15181:
--------------------------------

Regarding the nested loop in getBuckets(), do you think the following is more readable?
{code}
    while (it.hasNext()) {
      if (!target.onTarget(it.peek().getSecond())) {
        // If the file is too new for the target, skip it.
        if (target.compareToTimestamp(it.peek().getSecond()) < 0) {
          it.next();
        } else {
          // If the file is too old for the target, switch to higher
          // tier.
          target = target.nextTarget(tierBase);
        }
      }

      ArrayList<T> bucket = Lists.newArrayList();
      // Add all files of the same tier to current bucket
      while (it.hasNext() && target.onTarget(it.peek().getSecond())) {
        bucket.add(it.next().getFirst());
      }
      if (!bucket.isEmpty()) {
        buckets.add(bucket);
      }
    }
{code}
Basically there is no need for the label.
I ran TestTieredCompaction and it passed.
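
For anyone who wants to try the loop outside the patch, here is a minimal standalone sketch. Everything in it is an assumption made for illustration, not the patch's code: the Window class, its [start, start + size) range, the rule that nextTarget() returns an adjacent older window tierBase times larger, and the use of an ArrayDeque in place of the peeking iterator. Files are (name, timestamp) pairs ordered newest first.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.Map;

final class TierBucketSketch {
  /** Hypothetical stand-in for the patch's window type; covers [start, start + size). */
  static final class Window {
    final long start;
    final long size;

    Window(long start, long size) {
      this.start = start;
      this.size = size;
    }

    boolean onTarget(long ts) {
      return ts >= start && ts < start + size;
    }

    /** Negative if ts is newer than this window, positive if older, 0 if inside. */
    int compareToTimestamp(long ts) {
      if (ts >= start + size) {
        return -1;
      }
      return ts < start ? 1 : 0;
    }

    /** Assumed tiering rule: the adjacent older window, tierBase times larger. */
    Window nextTarget(int tierBase) {
      long newSize = size * tierBase;
      return new Window(start - newSize, newSize);
    }
  }

  /** Groups files (ordered newest first) into one bucket per tier window. */
  static <T> List<List<T>> getBuckets(
      List<Map.Entry<T, Long>> newestFirst, Window target, int tierBase) {
    Deque<Map.Entry<T, Long>> it = new ArrayDeque<>(newestFirst);
    List<List<T>> buckets = new ArrayList<>();
    while (!it.isEmpty()) {
      long ts = it.peek().getValue();
      if (!target.onTarget(ts)) {
        if (target.compareToTimestamp(ts) < 0) {
          it.poll(); // Too new for the target: skip the file.
        } else {
          target = target.nextTarget(tierBase); // Too old: switch to higher tier.
        }
      }
      List<T> bucket = new ArrayList<>();
      // Add all files of the same tier to the current bucket.
      while (!it.isEmpty() && target.onTarget(it.peek().getValue())) {
        bucket.add(it.poll().getKey());
      }
      if (!bucket.isEmpty()) {
        buckets.add(bucket);
      }
    }
    return buckets;
  }

  public static void main(String[] args) {
    List<Map.Entry<String, Long>> files = Arrays.asList(
        new SimpleImmutableEntry<>("a", 120L), // newer than the first window
        new SimpleImmutableEntry<>("b", 105L),
        new SimpleImmutableEntry<>("c", 101L),
        new SimpleImmutableEntry<>("d", 95L),
        new SimpleImmutableEntry<>("e", 85L),
        new SimpleImmutableEntry<>("f", 30L));
    System.out.println(getBuckets(files, new Window(100, 10), 2));
  }
}
```

With the made-up input in main(), file a is skipped as too new and the rest fall into three tiers, so it prints [[b, c], [d, e], [f]]. Each pass of the outer loop either consumes a file or advances the target, which is why a single loop without a label still terminates.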

> A simple implementation of date based tiered compaction
> -------------------------------------------------------
>
>                 Key: HBASE-15181
>                 URL: https://issues.apache.org/jira/browse/HBASE-15181
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15181-v1.patch, HBASE-15181-v2.patch
>
>
> This is a simple implementation of date-based tiered compaction, similar to Cassandra's, with the following benefits:
> 1. Improve date-range-based scans by structuring store files in a date-based tiered layout.
> 2. Reduce compaction overhead.
> 3. Improve TTL efficiency.
> It is a good fit for use cases that:
> 1. have mostly date-based data writes and scans, with a focus on the most recent data.
> 2. never or rarely delete data.
> Out-of-order writes are handled gracefully, so the data still gets to the right store file for time-range scans, and re-compaction with existing store files in the same time window is handled by ExploringCompactionPolicy.
> Time range overlap among store files is tolerated, and the performance impact is minimized.
> Configuration can be set in hbase-site.xml or overridden at the per-table or per-column-family level via the hbase shell.
> Design spec is at https://docs.google.com/document/d/1_AmlNb2N8Us1xICsTeGDLKIqL6T-oHoRLZ323MG_uy8/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
