Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Wed, 21 Jan 2015 23:05:36 +0000 (UTC)
From: "Colin Patrick McCabe (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12757328.1416803753000.139569.1421881536436@Atlassian.JIRA>
In-Reply-To: <JIRA.12757328.1416803753000@Atlassian.JIRA>
References: <JIRA.12757328.1416803753000@Atlassian.JIRA>
 <JIRA.12757328.1416803753163@arcas>
Subject: [jira] [Commented] (HDFS-7430) Refactor the BlockScanner to use
 O(1) memory and use multiple threads
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-7430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14286495#comment-14286495 ] 

Colin Patrick McCabe commented on HDFS-7430:
--------------------------------------------

It is fair to call this a rewrite of major parts of the block scanner.

I don't think it makes sense to maintain two block scanners in parallel.  Getting both to work would take a lot of glue code and extra interfaces.  Let's let this soak in trunk for a while and then merge it to branch-2 once it has stabilized, the same as we did with other features such as truncate.

> Refactor the BlockScanner to use O(1) memory and use multiple threads
> ---------------------------------------------------------------------
>
>                 Key: HDFS-7430
>                 URL: https://issues.apache.org/jira/browse/HDFS-7430
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.7.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-7430.002.patch, HDFS-7430.003.patch, HDFS-7430.004.patch, HDFS-7430.005.patch, HDFS-7430.006.patch, HDFS-7430.007.patch, HDFS-7430.008.patch, HDFS-7430.009.patch, HDFS-7430.010.patch, HDFS-7430.011.patch, HDFS-7430.012.patch, memory.png
>
>
> We should update the BlockScanner to use a constant amount of memory by keeping track of which block was scanned last, rather than tracking the scan status of every block in memory.  Also, instead of having just one thread, we should have one verification thread per hard disk (or other volume), each scanning at a configurable rate of bytes per second.
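
For readers skimming the thread, here is a rough sketch of the idea in the description above: one scanner per volume that remembers only the last block it scanned and throttles itself to a byte rate. This is not the code in the attached patches. The class name VolumeScanSketch, its methods, and the in-memory TreeMap standing in for the on-disk block listing are all hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a per-volume scanner; not the HDFS-7430 implementation. */
final class VolumeScanSketch implements Runnable {
    // Stand-in for the volume's on-disk block listing (block ID -> length in bytes).
    // A real scanner would iterate the directory tree rather than hold this in memory.
    private final NavigableMap<Long, Long> blocks;
    private final long bytesPerSecond;

    // The only scan state kept by the scanner: a cursor and a byte counter.
    private long lastScannedId = -1;
    private long bytesScanned = 0;

    VolumeScanSketch(NavigableMap<Long, Long> blocks, long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.blocks = blocks;
        this.bytesPerSecond = bytesPerSecond;
    }

    /** How long to sleep so the average rate stays at or below bytesPerSecond. */
    static long throttleDelayMillis(long bytesScanned, long elapsedMillis, long bytesPerSecond) {
        long earliestMillis = bytesScanned * 1000L / bytesPerSecond;
        return Math.max(0L, earliestMillis - elapsedMillis);
    }

    /** Scans the block after the cursor; returns false when the volume is exhausted. */
    boolean scanNext() {
        Map.Entry<Long, Long> next = blocks.higherEntry(lastScannedId);
        if (next == null) {
            return false;
        }
        // A real scanner would read the block and verify its checksums here.
        bytesScanned += next.getValue();
        lastScannedId = next.getKey();
        return true;
    }

    long lastScannedId() { return lastScannedId; }

    long bytesScanned() { return bytesScanned; }

    @Override
    public void run() {
        long startNanos = System.nanoTime();
        while (!Thread.currentThread().isInterrupted() && scanNext()) {
            long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNanos);
            long delay = throttleDelayMillis(bytesScanned, elapsedMillis, bytesPerSecond);
            if (delay > 0) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        NavigableMap<Long, Long> volume = new TreeMap<>();
        volume.put(1L, 512L);
        volume.put(2L, 1024L);
        // One thread per volume; a datanode would start one of these for each disk.
        VolumeScanSketch scanner = new VolumeScanSketch(volume, 1024L * 1024L);
        Thread t = new Thread(scanner, "scanner-volume-0");
        t.start();
        t.join();
        System.out.println("last scanned block: " + scanner.lastScannedId());
    }
}
```

Because the cursor is a single block ID, a restart only needs that one value persisted to resume where the scan left off, which is what keeps the memory cost independent of the number of blocks in this sketch.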

