hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-69) [hbase] Make cache flush triggering less simplistic
Date Tue, 12 Feb 2008 05:46:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12567968#action_12567968 ]

Jim Kellerman commented on HBASE-69:
------------------------------------

> Billy Pearson - 11/Feb/08 08:59 PM
> I am still seeing an hlog build-up problem.
>
> For example, I see these a lot after a job is done and the servers are idle:
>
> 2008-02-11 22:46:44,032 INFO org.apache.hadoop.hbase.HStore: Not flushing cache for
> 519281761/anchor because it has 0 entries
> 2008-02-11 22:46:44,032 DEBUG org.apache.hadoop.hbase.HRegion: Finished memcache
>  flush for store 519281761/anchor in 1ms, sequenceid=131598230
> 
> I assume this is an optional flush, which is good to have, but if there are no entries,
> can we update the column's current sequence id to our current max sequence id so we can
> remove the old logs after the next hlog?

Yes, this is an optional cache flush. The column's max sequence id is updated after a cache
flush:

{code}
    // Record the sequence id at which this cache flush starts ...
    long sequenceId = log.startCacheFlush();
...
    // ... and mark the flush complete in the log under that sequence id,
    // so older log entries for this store are no longer needed for recovery.
    this.log.completeCacheFlush(store.storeName, getTableDesc().getName(),
        sequenceId);
{code}

> What I am seeing is that columns with low to no updates never get a memcache flush, so
> their sequence id never changes (unless there is a split) and the old hlogs never get removed.

Columns with low to no updates will only get an optional cache flush, which will set their
sequence number. However, log files are not garbage collected until the log fills up and is
rolled, which does not happen if the region server is idle. The only other time log files are
cleaned up is when the region server shuts down.
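To illustrate the rule described above: a log file is only safe to delete once every region
has flushed past that file's highest sequence id. This is a hypothetical sketch of that check,
not HBase's actual code; the class and method names are invented for illustration.

{code}
class LogGc {
  /**
   * Returns the log files whose highest sequence id is older than the
   * smallest flushed sequence id across all regions, i.e. files no
   * region would still need for recovery. A single region whose
   * flushed sequence id never advances pins every newer log file.
   */
  static java.util.List<String> deletableLogs(
      java.util.Map<String, Long> logMaxSeqId,
      java.util.Collection<Long> regionFlushedSeqIds) {
    long oldestUnflushed = java.util.Collections.min(regionFlushedSeqIds);
    java.util.List<String> deletable = new java.util.ArrayList<String>();
    for (java.util.Map.Entry<String, Long> e : logMaxSeqId.entrySet()) {
      if (e.getValue() < oldestUnflushed) {
        deletable.add(e.getKey());
      }
    }
    return deletable;
  }
}
{code}

This is why a rarely-updated column with a stale sequence id keeps old hlogs alive.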

I have created issue HBASE-440 to add optional log rolling so that idle region servers will
garbage collect old log files.
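The optional rolling proposed in HBASE-440 could amount to a time-based policy alongside the
existing size trigger. A minimal sketch of such a policy, assuming nothing about the eventual
patch (the class and method names here are invented):

{code}
// Hypothetical time-based roll policy: roll the log once intervalMillis
// have elapsed since the last roll, regardless of how full the log is.
class RollPolicy {
  private final long intervalMillis;
  private long lastRollMillis;

  RollPolicy(long intervalMillis, long lastRollMillis) {
    this.intervalMillis = intervalMillis;
    this.lastRollMillis = lastRollMillis;
  }

  /** True if enough time has passed that the log should be rolled. */
  boolean shouldRoll(long nowMillis) {
    return nowMillis - lastRollMillis >= intervalMillis;
  }

  /** Record that a roll happened at nowMillis. */
  void rolled(long nowMillis) {
    lastRollMillis = nowMillis;
  }
}
{code}

An idle region server would then roll on the timer, letting the garbage collection that
normally follows a roll remove old log files.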



> [hbase] Make cache flush triggering less simplistic
> ---------------------------------------------------
>
>                 Key: HBASE-69
>                 URL: https://issues.apache.org/jira/browse/HBASE-69
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt
>
>
> When the flusher runs -- it is triggered when the sum of all Stores in a Region exceeds a
> configurable max size -- we flush all Stores, though a Store's memcache might have but a
> few bytes.
> I would think Stores should only dump their memcache to disk if they have some substance.
> The problem becomes more acute the more families you have in a Region.
> Possible behaviors would be to dump the biggest Store only, or only those Stores > 50% of
> max memcache size. Behavior would vary depending on the prompt that provoked the flush.
> We would also log why the flush is running: optional or > max size.
> This issue comes out of HADOOP-2621.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

