hbase-dev mailing list archives

From "Billy Pearson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-69) [hbase] Make cache flush triggering less simplistic
Date Mon, 04 Feb 2008 23:15:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565594#action_12565594 ]

Billy Pearson commented on HBASE-69:

Currently, flushes and compactions work well from what I can tell on my setup and tests.

There are two areas I have concerns about and have not yet had a chance to test.

1. hlogs: If I have a column family that receives only, say, 1 out of every 100-250 updates,
is that region going to hold up the removal of old hlogs while waiting for a flush from this
column family? If so, one column family could make recovery take a long time if the region
server fails. This is one of the reasons, besides memory usage, that I think we need to leave/add
back the option of a flusher that flushes every 30-60 mins.
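The optional time-based flusher suggested above could look roughly like the following. This is a hypothetical sketch, not HBase's actual code: the names (PeriodicFlushPolicy, shouldFlush, maxAgeMs) are illustrative. The idea is that flushing any memcache whose oldest unflushed edit exceeds a maximum age bounds how far back the hlogs must be retained, even for rarely updated families.

```java
// Hypothetical sketch of a time-based flush trigger; names are
// illustrative and not part of HBase's real API.
class PeriodicFlushPolicy {
    private final long maxAgeMs; // e.g. 30-60 minutes, as suggested above

    PeriodicFlushPolicy(long maxAgeMs) {
        this.maxAgeMs = maxAgeMs;
    }

    /**
     * Returns true when the oldest unflushed edit in a store's memcache
     * is older than the configured maximum age, regardless of how few
     * bytes the memcache holds. Flushing then lets the region server
     * reclaim the old hlogs that covered those edits.
     */
    boolean shouldFlush(long oldestEditTimestampMs, long nowMs) {
        return nowMs - oldestEditTimestampMs >= maxAgeMs;
    }
}
```

With a 30-minute maximum age, a store whose oldest edit is 31 minutes old would be flushed even if it has only received a handful of updates, so a slow-moving column family cannot pin the hlogs indefinitely.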

2. Splits: If I have a large region split into two, compaction starts when the new splits
are reloaded. But say the columns take 50 minutes to compact; if in that 50 minutes I get
enough updates to cause another split, will that split fail because the region has not finished
compacting all of its reference files from the original split?

Outside of the above concerns I have not noticed any bugs in the patch while flushing or
compacting; all seems OK in that area.

> [hbase] Make cache flush triggering less simplistic
> ---------------------------------------------------
>                 Key: HBASE-69
>                 URL: https://issues.apache.org/jira/browse/HBASE-69
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Assignee: Jim Kellerman
>         Attachments: patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt,
patch.txt, patch.txt, patch.txt
> When the flusher runs -- it is triggered when the sum of all Stores in a Region exceeds a configurable
> max size -- we flush all Stores even though a Store's memcache might hold only a few bytes.
> I would think Stores should only dump their memcache to disk if they have some substance.
> The problem becomes more acute the more families you have in a Region.
> Possible behaviors would be to dump only the biggest Store, or only those Stores holding more than
> 50% of the max memcache size.  Behavior would vary depending on the prompt that provoked the flush.
>  Would also log why the flush is running: optional or over max size.
> This issue comes out of HADOOP-2621.
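The selective-flush heuristic proposed in the description could be sketched as follows. This is a minimal illustration under assumed names (StoreState, SelectiveFlusher, pickStoresToFlush are hypothetical, not HBase's actual classes): flush only the stores holding more than 50% of the max memcache size, and if none qualify, fall back to flushing just the biggest store, so a near-empty memcache is never dumped to a tiny file.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the selective-flush heuristic described in the
// issue; class and method names are illustrative, not HBase's real API.
class StoreState {
    final String family;     // column family name
    final long memcacheSize; // bytes currently buffered in the memcache

    StoreState(String family, long memcacheSize) {
        this.family = family;
        this.memcacheSize = memcacheSize;
    }
}

class SelectiveFlusher {
    private final long maxMemcacheSize; // configurable per-region limit

    SelectiveFlusher(long maxMemcacheSize) {
        this.maxMemcacheSize = maxMemcacheSize;
    }

    /**
     * Pick only stores holding more than 50% of the max memcache size.
     * If no store qualifies, fall back to the single biggest store so
     * the flush still frees meaningful memory without writing many
     * near-empty flush files.
     */
    List<StoreState> pickStoresToFlush(List<StoreState> stores) {
        List<StoreState> toFlush = new ArrayList<>();
        for (StoreState s : stores) {
            if (s.memcacheSize > maxMemcacheSize / 2) {
                toFlush.add(s);
            }
        }
        if (toFlush.isEmpty() && !stores.isEmpty()) {
            toFlush.add(stores.stream()
                    .max(Comparator.comparingLong(s -> s.memcacheSize))
                    .get());
        }
        return toFlush;
    }
}
```

With a 100-byte limit, a region holding stores of 80 and 5 bytes would flush only the 80-byte store; a region whose stores are all under 50 bytes would flush only its largest one.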

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
