From "Vinayakumar B (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6114) Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr
Date Tue, 15 Jul 2014 09:49:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061895#comment-14061895 ]

Vinayakumar B commented on HDFS-6114:
-------------------------------------

bq. I don't really see a good reason to separate delBlockInfo and delNewBlockInfo. It seems
like this could just lead to scenarios where we think we're deleting a block but it pops back
up (because we deleted, but did not delete new)
Here, both work on different sets. {{delBlockInfo}} is used in some other places as well,
when updating the scan time and re-sorting the blockInfoSet.
{{delNewBlockInfo}} needs to be called only while deleting the block itself, as intermediate
updates will not happen on this set's data.
So {{delBlockInfo}} and {{delNewBlockInfo}} serve separate purposes, and both are required.
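
To illustrate, a minimal hypothetical sketch of the two sets and the two delete paths (field and method names follow this discussion; the real BlockPoolSliceScanner differs in detail):

{code:java}
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class BlockScanSketch {
  static class BlockScanInfo {
    final long blockId;
    long lastScanTime;
    BlockScanInfo(long blockId) { this.blockId = blockId; }
  }

  // Assumed shape of LAST_SCAN_TIME_COMPARATOR: oldest scan time first,
  // blockId as tie-breaker so distinct blocks never compare as equal.
  static final Comparator<BlockScanInfo> LAST_SCAN_TIME_COMPARATOR =
      new Comparator<BlockScanInfo>() {
        @Override
        public int compare(BlockScanInfo l, BlockScanInfo r) {
          if (l.lastScanTime != r.lastScanTime) {
            return l.lastScanTime < r.lastScanTime ? -1 : 1;
          }
          return Long.compare(l.blockId, r.blockId);
        }
      };

  // Working set, always sorted by lastScanTime.
  private final TreeSet<BlockScanInfo> blockInfoSet =
      new TreeSet<BlockScanInfo>(LAST_SCAN_TIME_COMPARATOR);
  // Holding area for blocks that arrive while a scan iteration runs.
  private final Set<BlockScanInfo> newBlockInfoSet =
      new HashSet<BlockScanInfo>();

  // Called wherever the scan time is updated: remove, mutate, re-add,
  // so the sorted order of blockInfoSet stays consistent.
  synchronized void delBlockInfo(BlockScanInfo info) {
    blockInfoSet.remove(info);
  }

  // Needed only when the block itself is deleted; intermediate
  // scan-time updates never touch newBlockInfoSet.
  synchronized void delNewBlockInfo(BlockScanInfo info) {
    newBlockInfoSet.remove(info);
  }
}
{code}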

bq. I guess maybe it makes sense to separate addBlockInfo from addNewBlockInfo, just because
there are places in the setup code where we're willing to add stuff directly to blockInfoSet.
Even in that case, I would argue it might be easier to call addNewBlockInfo and then later
roll all the newBlockInfoSet items into blockInfoSet. The problem is that having both functions
creates confusion and increase the chance that someone will add an incorrect call to the wrong
one later on in another change.
As I see it, both these methods are private and act on different sets. Since the method name
itself suggests that {{addNewBlockInfo}} is only for new blocks, I do not see any confusion
here.
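
Continuing the sketch above, the two add paths and the periodic roll could look like this (the roll method name is assumed here for illustration):

{code:java}
  // Adds directly to the sorted working set; used by setup code.
  synchronized void addBlockInfo(BlockScanInfo info) {
    blockInfoSet.add(info);
  }

  // Parks blocks written while a scan iteration is in progress.
  synchronized void addNewBlockInfo(BlockScanInfo info) {
    newBlockInfoSet.add(info);
  }

  // Merges the holding area into the sorted working set, e.g. once per
  // scan iteration, so new blocks become eligible for scanning.
  synchronized void rollNewBlocksInfo() {
    blockInfoSet.addAll(newBlockInfoSet);
    newBlockInfoSet.clear();
  }
{code}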

bq. It seems like a bad idea to use BlockScanInfo.LAST_SCAN_TIME_COMPARATOR for blockInfoSet,
but BlockScanInfo#hashCode (i.e. the HashSet strategy) for newBlockInfoSet. Let's just use
a SortedSet for both so we don't have to ponder any possible discrepancies between the comparator
and the hash function.
{{blockInfoSet}} needs to be sorted by lastScanTime, since the oldest-scanned block is
picked next for scanning and will always be the first element of this set. BlockScanInfo.LAST_SCAN_TIME_COMPARATOR
is used because the default {{BlockScanInfo#hashCode()}} and natural ordering are based on
the blockId rather than the scan time.
Do you suggest that I update {{hashCode()}} itself?
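
For context, the sorted set is what makes picking the next block trivial; a small sketch of the selection, assuming the fields above (the method name is hypothetical):

{code:java}
  // The oldest-scanned block is simply the first element of the
  // TreeSet ordered by LAST_SCAN_TIME_COMPARATOR.
  synchronized BlockScanInfo pickNextBlockToScan() {
    return blockInfoSet.isEmpty() ? null : blockInfoSet.first();
  }
{code}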

bq. Another problem with HashSet (compared with TreeSet) is that it never shrinks down after
enlarging... a bad property for a temporary holding area
Yes, I agree with this; I will update it in the next patch.
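
Sketching the agreed change (swapping the holding area to a TreeSet, under the assumptions of the sketch above):

{code:java}
  // A TreeSet frees its nodes as entries are removed, whereas a
  // HashSet's backing array never shrinks after growing.
  private final Set<BlockScanInfo> newBlockInfoSet =
      new TreeSet<BlockScanInfo>(LAST_SCAN_TIME_COMPARATOR);
{code}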

> Block Scan log rolling will never happen if blocks written continuously leading to huge size of dncp_block_verification.log.curr
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-6114
>                 URL: https://issues.apache.org/jira/browse/HDFS-6114
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.3.0, 2.4.0
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>            Priority: Critical
>         Attachments: HDFS-6114.patch, HDFS-6114.patch
>
>
> 1. {{BlockPoolSliceScanner#scan()}} will not return until all the blocks are scanned.
> 2. If blocks (several MBs in size) are written to the datanode continuously, then one iteration of {{BlockPoolSliceScanner#scan()}} will be continuously scanning the blocks.
> 3. These blocks will be deleted after some time (enough time for the blocks to get scanned).
> 4. As block scanning is throttled, verification of all the blocks will take a long time.
> 5. Rolling will never happen, so even though the total number of blocks in the datanode doesn't increase, the entries in *dncp_block_verification.log.curr* (which include stale entries for deleted blocks) grow continuously, leading to a huge file size.
> In one of our environments, it grew to more than 1 TB while the total number of blocks was only ~45k.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
