hadoop-hdfs-issues mailing list archives

From "Andy Isaacson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3828) Block Scanner rescans blocks too frequently
Date Thu, 23 Aug 2012 00:27:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439972#comment-13439972 ]

Andy Isaacson commented on HDFS-3828:

bq. If the scanner scans exactly once, shouldn't scansLastRun be 0 after this first run? I.e., getBlocksScannedInLastRun shouldn't always return 1, right?

Empirically, it is always 1 after a block has been scanned.  This is because when we call {{scanBlockPoolSlice}}
and there is nothing to scan, we still do a bunch of useless work:
# creating a new HashMap {{processedBlocks}}
# parsing the verificationLogs and putting the results in the new {{processedBlocks}}
# calling scan() which returns immediately
# setting totalBlocksScannedInLastRun to the resulting size of {{processedBlocks}}
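
The four steps above can be modeled with a small sketch. This is a simplified, hypothetical model and not the actual Hadoop code: the class name, the map-based verification log, and the method signatures are all assumptions made for illustration. It only shows why the counter stays at 1 even when scan() verifies nothing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical model of the four steps; not the actual Hadoop code.
class ScanCountSketch {
    static int totalBlocksScannedInLastRun = 0;

    // Stand-in for scan(): verifies only blocks that have no recorded
    // verification time and returns how many blocks it actually verified.
    static int scan(Map<String, Long> processedBlocks, List<String> blocks, long nowMs) {
        int verified = 0;
        for (String block : blocks) {
            if (!processedBlocks.containsKey(block)) {
                processedBlocks.put(block, nowMs);
                verified++;
            }
        }
        return verified;
    }

    static int scanBlockPoolSlice(Map<String, Long> verificationLog,
                                  List<String> blocks, long nowMs) {
        Map<String, Long> processedBlocks = new HashMap<>();   // step 1: new map
        processedBlocks.putAll(verificationLog);               // step 2: replay the log
        int verified = scan(processedBlocks, blocks, nowMs);   // step 3: may do nothing
        totalBlocksScannedInLastRun = processedBlocks.size();  // step 4: size, not work done
        verificationLog.putAll(processedBlocks);               // remember the results
        return verified;
    }

    public static void main(String[] args) {
        Map<String, Long> log = new HashMap<>();
        List<String> blocks = Arrays.asList("blk_1");
        for (int run = 1; run <= 2; run++) {
            int verified = scanBlockPoolSlice(log, blocks, run * 1000L);
            System.out.println("run " + run + ": verified=" + verified
                + " totalBlocksScannedInLastRun=" + totalBlocksScannedInLastRun);
        }
    }
}
```

In this model the second run verifies no blocks, yet the counter is still set from the size of the replayed map, which matches the behavior described above.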

bq. Like the new approach better.

I also like the new code better, but it's a bummer that we can't short-circuit all the nonsense
enumerated above in {{scanBlockPoolSlice}}.  The previous approach avoided doing
all of this extra work.

As an alternative, we could propagate a "please wake me up at time T" request up from BlockPoolSliceScanner
to DataBlockScanner#run and adjust the sleep time there accordingly.  If all threadpools
continue to have work to do, then preserve the existing 5-second sleep; if all threadpools
are done working, then DataBlockScanner could go to sleep for much longer.
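
A minimal sketch of that alternative, assuming each slice scanner can report the time at which it next has work. The class, method, and parameter names here are hypothetical and do not exist in Hadoop; the only value taken from the discussion is the 5-second sleep. If any scanner has work due now, the default sleep is kept; otherwise the caller sleeps until the earliest requested wake-up time.

```java
// Hypothetical sketch of "please wake me up at time T"; not Hadoop code.
class SleepTimeSketch {
    // The existing 5-second sleep mentioned in the discussion.
    static final long DEFAULT_SLEEP_MS = 5_000L;

    // nextWorkTimesMs holds, per slice scanner, the time at which that
    // scanner next has something to scan. A value <= nowMs means it has
    // work right now.
    static long computeSleepMs(long nowMs, long... nextWorkTimesMs) {
        if (nextWorkTimesMs.length == 0) {
            return DEFAULT_SLEEP_MS; // nothing registered: keep the default
        }
        long earliest = Long.MAX_VALUE;
        for (long t : nextWorkTimesMs) {
            earliest = Math.min(earliest, t);
        }
        // Never sleep less than the default; sleep longer if every
        // scanner is idle until some later time.
        return Math.max(DEFAULT_SLEEP_MS, earliest - nowMs);
    }

    public static void main(String[] args) {
        long now = 1_000L;
        System.out.println("work pending: sleep " + computeSleepMs(now, 500L, 100_000L) + " ms");
        System.out.println("all idle:     sleep " + computeSleepMs(now, 61_000L, 100_000L) + " ms");
    }
}
```

Taking the minimum over the requested wake-up times means a single busy scanner is enough to keep the short sleep, so idle scanners cannot starve a busy one.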
> Block Scanner rescans blocks too frequently
> -------------------------------------------
>                 Key: HDFS-3828
>                 URL: https://issues.apache.org/jira/browse/HDFS-3828
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.23.0, 2.0.0-alpha
>            Reporter: Andy Isaacson
>            Assignee: Andy Isaacson
>         Attachments: hdfs-3828-1.txt, hdfs3828.txt
> {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}.  But cleanUp unconditionally roll()s the verificationLogs, so after two iterations we have lost the first iteration of block verification times.  As a result a cluster with just one block repeatedly rescans it every 10 seconds:
> {noformat}
> 2012-08-16 15:59:57,884 INFO  datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-
> 2012-08-16 16:00:07,904 INFO  datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-
> 2012-08-16 16:00:17,925 INFO  datanode.BlockPoolSliceScanner (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for BP-2101131164-
> {noformat}
> {quote}
> To fix this, we need to avoid roll()ing the logs multiple times per period.
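
The failure mode in the issue description, and one possible guard against it, can be sketched as follows. This is a toy model that assumes the verification log keeps exactly two generations (current and previous) and that roll() discards the older one; the class and the rollIfDue method are hypothetical, not the attached patch or the Hadoop API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy two-generation log; an assumption for illustration, not Hadoop code.
class RollingLogSketch {
    private Map<String, Long> prev = new HashMap<>();
    private Map<String, Long> curr = new HashMap<>();
    private final long minRollIntervalMs;
    private long lastRollMs;

    RollingLogSketch(long minRollIntervalMs, long nowMs) {
        this.minRollIntervalMs = minRollIntervalMs;
        this.lastRollMs = nowMs;
    }

    void append(String blockId, long verifiedAtMs) {
        curr.put(blockId, verifiedAtMs);
    }

    // Unconditional roll: the old "previous" generation is dropped.
    void roll() {
        prev = curr;
        curr = new HashMap<>();
    }

    // Guarded roll: at most one roll per minRollIntervalMs.
    boolean rollIfDue(long nowMs) {
        if (nowMs - lastRollMs < minRollIntervalMs) {
            return false;
        }
        roll();
        lastRollMs = nowMs;
        return true;
    }

    // True if either generation still records a verification time.
    boolean hasVerificationTime(String blockId) {
        return curr.containsKey(blockId) || prev.containsKey(blockId);
    }

    public static void main(String[] args) {
        RollingLogSketch unguarded = new RollingLogSketch(0L, 0L);
        unguarded.append("blk_1", 0L);
        unguarded.roll();
        unguarded.roll(); // second roll drops the only record of blk_1
        System.out.println("unguarded remembers blk_1: "
            + unguarded.hasVerificationTime("blk_1"));

        RollingLogSketch guarded = new RollingLogSketch(3_600_000L, 0L);
        guarded.append("blk_1", 0L);
        guarded.rollIfDue(10_000L); // too soon, skipped
        guarded.rollIfDue(20_000L); // too soon, skipped
        System.out.println("guarded remembers blk_1: "
            + guarded.hasVerificationTime("blk_1"));
    }
}
```

The one-hour interval in main is an arbitrary illustrative value; the point is only that rolling less often than the scan loop runs keeps earlier verification times available.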


