hadoop-hdfs-issues mailing list archives

From "Amir Langer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6659) Create a Block List
Date Thu, 04 Sep 2014 13:34:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121339#comment-14121339 ]

Amir Langer commented on HDFS-6659:

Hi [~vinayrpet]

1. It will be used by the patch I'm preparing now for the third subtask (HDFS-6661).
2. Yes, you are right. As [~nroberts] pointed out earlier on the umbrella JIRA (HDFS-6658), the case of many blocks being deleted, followed by no new blocks at all for a long duration, will leave gaps that need to be cleaned up.

We haven't addressed this cleanup yet, as we see it as an edge case to be dealt with separately. We left it for a future JIRA, where the cost of this cleanup and how to approach it need to be decided.

> Create a Block List
> -------------------
>                 Key: HDFS-6659
>                 URL: https://issues.apache.org/jira/browse/HDFS-6659
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.4.1
>            Reporter: Amir Langer
>            Assignee: Amir Langer
>              Labels: perfomance
>         Attachments: HDFS-6659.patch
> BlockList - An efficient array based list that can extend its capacity with two main
> 1. Gaps (result of remove operations) are managed internally without the need for extra
> memory - We create a linked list of gaps by using the array index as references + An int to
> the head of the gaps list. In every insert operation, we first use any available gap before
> extending the array.
> 2. Array extension is done by chaining different arrays, not by allocating a larger array
> and copying all its data across. This is a lot less heavy in terms of latency for that particular
> call. It also avoids having large amount of contiguous heap space and so behaves nicer with
> garbage collection.
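
The two points in the description above can be illustrated with a minimal sketch. This is not the code from HDFS-6659.patch: the class name GapChunkList, its method names, the fixed chunk size, and the use of long values are all assumptions made for illustration. A freed slot stores the index of the next gap, so the gap list needs no memory beyond one int for its head, and growth appends a new fixed-size array instead of copying into a larger one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; names and layout are not taken from the HDFS-6659 patch. */
class GapChunkList {
    private static final int NO_GAP = -1;

    private final int chunkSize;
    // Chained fixed-size arrays. Only this small list of references ever grows;
    // the stored values are never copied on extension.
    private final List<long[]> chunks = new ArrayList<>();
    private int gapHead = NO_GAP; // index of the most recently freed slot
    private int nextUnused = 0;   // first index that has never held a value
    private int size = 0;

    GapChunkList(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Stores a value, reusing a gap if one exists, and returns its index. */
    int insert(long value) {
        int index;
        if (gapHead != NO_GAP) {
            index = gapHead;
            gapHead = (int) read(index); // a freed slot holds the index of the next gap
        } else {
            index = nextUnused++;
            if (index / chunkSize == chunks.size()) {
                chunks.add(new long[chunkSize]); // extend by chaining a new array
            }
        }
        write(index, value);
        size++;
        return index;
    }

    /** Frees a slot. The caller must pass an index that currently holds a value. */
    void remove(int index) {
        write(index, gapHead); // link the freed slot into the gap list
        gapHead = index;
        size--;
    }

    long get(int index) {
        return read(index);
    }

    int size() {
        return size;
    }

    int chunkCount() {
        return chunks.size();
    }

    private long read(int index) {
        return chunks.get(index / chunkSize)[index % chunkSize];
    }

    private void write(int index, long value) {
        chunks.get(index / chunkSize)[index % chunkSize] = value;
    }
}
```

The sketch does not track which slots are occupied, so calling remove twice on the same index or reading a freed index is not detected. It also never releases chunks, which is the gap cleanup case discussed in the comment above.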

This message was sent by Atlassian JIRA
