hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-69) [hbase] Make cache flush triggering less simplistic
Date Tue, 12 Feb 2008 18:23:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12568270#action_12568270 ]

Jim Kellerman commented on HBASE-69:
------------------------------------

> Billy Pearson - 12/Feb/08 08:32 AM
> sorry my last post I am talking about while the cluster is busy. The hlogs have > 4 hours between them
> that would mean that I am rolling the hlog many times after several optional memcash flushes while
> jobs are running.

If the hlogs have > 4 hours between them, then you can only expect old log files to be
garbage collected once every 4 hours.

> Might check to make sure the option flush is updating the sequence id even if there
> is 0 entries and the flush is skipped. The problem could be from the code that removed the code before
> I used to see 0 logs removed while having debug turned on and hlog rolling but I do not see these
>  messages after this patches.

I can assure you that the sequence id is getting updated and flushed to the log with an
optional cache flush, even with no entries.

In this patch and in trunk, optional flushes are put on the flush queue just like requested flushes.
(See HRegionServer$Flusher.run)

When their queue entry triggers:
- in trunk: HRegion.flushCache() is called.
- in the patch: HRegion.flushCache(HStore) is called.

They both end up in HRegion.internalFlushcache, which:
- first obtains a sequence id for the log.
- then calls HStore.flushCache(sequenceId).

Even if the cache is not flushed, HRegion.internalFlushcache in both trunk and the patch calls
HLog.completeCacheFlush, which writes the new sequence id (for the region in the case of trunk,
or for the store in the case of the patch).

However, no log files are removed until the current log is rolled (closed and a new one opened).
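
The interaction described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class FlushLogSketch, its fields, and its bookkeeping are hypothetical and are not the real HLog or HRegion code; only the method names startCacheFlush, completeCacheFlush and rollWriter are chosen to echo the calls discussed here. It tracks one sequence id counter, records a flush sequence id even when nothing was flushed, and deletes old log files only when the log is rolled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the flush/roll interaction; NOT the real HLog code. */
class FlushLogSketch {
    private long nextSeqId = 1;
    // Highest sequence id contained in each closed (rolled) log file.
    private final List<Long> closedLogs = new ArrayList<>();
    // Per region: oldest sequence id written since its last completed flush.
    private final Map<String, Long> oldestUnflushed = new HashMap<>();
    // Per region: sequence id recorded by the last completed flush.
    private final Map<String, Long> lastFlushSeqId = new HashMap<>();

    /** Record an edit for a region in the current log. */
    long append(String region) {
        long seq = nextSeqId++;
        oldestUnflushed.putIfAbsent(region, seq);
        return seq;
    }

    /** Obtain the sequence id for a flush (the first step of a flush). */
    long startCacheFlush() {
        return nextSeqId++;
    }

    /** Record the flush sequence id; called even if the flush wrote no entries. */
    void completeCacheFlush(String region, long seqId) {
        lastFlushSeqId.put(region, seqId);
        Long oldest = oldestUnflushed.get(region);
        if (oldest != null && oldest < seqId) {
            oldestUnflushed.remove(region); // edits before seqId are now on disk
        }
    }

    /** Close the current log, then delete closed logs that hold only flushed edits. */
    int rollWriter() {
        closedLogs.add(nextSeqId - 1);
        long oldestOutstanding = nextSeqId;
        for (long seq : oldestUnflushed.values()) {
            oldestOutstanding = Math.min(oldestOutstanding, seq);
        }
        final long cutoff = oldestOutstanding;
        int before = closedLogs.size();
        closedLogs.removeIf(maxSeq -> maxSeq < cutoff);
        return before - closedLogs.size();
    }

    int closedLogCount() {
        return closedLogs.size();
    }

    long lastFlushSeqId(String region) {
        return lastFlushSeqId.getOrDefault(region, 0L);
    }

    public static void main(String[] args) {
        FlushLogSketch log = new FlushLogSketch();
        log.append("region1");
        System.out.println("removed on first roll: " + log.rollWriter());
        log.completeCacheFlush("region1", log.startCacheFlush());
        System.out.println("closed logs after flush: " + log.closedLogCount());
        System.out.println("removed on second roll: " + log.rollWriter());
    }
}
```

In this model a flush on its own never changes the number of closed log files; the count only drops on the next call to rollWriter, which mirrors the point that infrequent log rolls mean infrequent log removal.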


> [hbase] Make cache flush triggering less simplistic
> ---------------------------------------------------
>
>                 Key: HBASE-69
>                 URL: https://issues.apache.org/jira/browse/HBASE-69
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt, patch.txt
>
>
> When flusher runs -- its triggered when the sum of all Stores in a Region > a configurable max size -- we flush all Stores though a Store memcache might have but a few bytes.
> I would think Stores should only dump their memcache disk if they have some substance.
> The problem becomes more acute, the more families you have in a Region.
> Possible behaviors would be to dump the biggest Store only, or only those Stores > 50% of max memcache size.  Behavior would vary dependent on the prompt that provoked the flush.  Would also log why the flush is running: optional or > max size.
> This issue comes out of HADOOP-2621.
> This issue comes out of HADOOP-2621.
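
The two selection behaviors proposed in the issue description above (flush only the biggest Store, or only those Stores above 50% of the max memcache size) could be sketched roughly as follows. This is an illustration only: the class StoreFlushPolicySketch, its method names, and the use of a plain map from store name to memcache size in bytes are all assumptions made for the example, not HBase APIs or the approach taken in the attached patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only; every name here is hypothetical, not an HBase API. */
class StoreFlushPolicySketch {

    /** Name of the store with the largest memcache, or null if there are none. */
    static String biggestStore(Map<String, Long> memcacheSizes) {
        String biggest = null;
        long max = -1;
        for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
            if (e.getValue() > max) {
                max = e.getValue();
                biggest = e.getKey();
            }
        }
        return biggest;
    }

    /** Stores whose memcache is larger than the given fraction of the max size. */
    static List<String> storesOverFraction(Map<String, Long> memcacheSizes,
                                           long maxMemcacheSize,
                                           double fraction) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Long> e : memcacheSizes.entrySet()) {
            if (e.getValue() > maxMemcacheSize * fraction) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Sizes in bytes; the store names and the 64 MB limit are made up.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("info", 40L << 20);
        sizes.put("anchor", 10L);
        sizes.put("contents", 20L << 20);
        long maxSize = 64L << 20;
        System.out.println("biggest: " + biggestStore(sizes));
        System.out.println("over 50%: " + storesOverFraction(sizes, maxSize, 0.5));
    }
}
```

Either selector leaves a Store holding only a few bytes untouched, which is the case the issue description is concerned about.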

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.