hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7836) BlockManager Scalability Improvements
Date Thu, 30 Apr 2015 23:36:09 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522480#comment-14522480 ]

Colin Patrick McCabe commented on HDFS-7836:
--------------------------------------------

Hi [~xinwei],

The discussion on March 11th focused on our February 24th proposal for off-heaping and parallelizing
the block manager.  We spent a lot of time going through the proposal and responding to questions
about it.

There was widespread agreement that we needed to reduce the garbage collection impact of the
millions of BlockInfoContiguous structures.  There was some disagreement about how to do that.
 Daryn argued that using large primitive arrays was the best way to go.  Charles and I argued
that using off-heap storage was better.

The main advantage of large primitive arrays is that it makes the existing Java {{-Xmx}} memory
settings work as expected.  The main advantage of off-heap is that it allows the use of things
like {{Unsafe#compareAndSwap}}, which can often lead to more efficient concurrent data structures.
Also, when using off-heap memory, we get to re-use malloc rather than essentially writing
our own malloc for every subsystem.
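
To make the "large primitive arrays" idea concrete, here is a minimal sketch (not Daryn's actual
patch; {{BlockSlab}} and the field layout are illustrative assumptions) of packing block metadata
into a single {{long[]}} slab, so the GC sees one big array and the existing {{-Xmx}} accounting
just works:

{code:java}
// Minimal sketch of the "large primitive arrays" approach: block fields packed
// into one long[] slab instead of millions of BlockInfoContiguous objects.
// FIELDS_PER_BLOCK and the slot layout here are hypothetical, for illustration only.
class BlockSlab {
  private static final int FIELDS_PER_BLOCK = 3;   // e.g. blockId, genStamp, numBytes
  private final long[] slab;

  BlockSlab(int maxBlocks) {
    // One allocation; the GC sees a single large array, and -Xmx accounts for it directly.
    slab = new long[maxBlocks * FIELDS_PER_BLOCK];
  }

  void put(int slot, long blockId, long genStamp, long numBytes) {
    int base = slot * FIELDS_PER_BLOCK;
    slab[base] = blockId;
    slab[base + 1] = genStamp;
    slab[base + 2] = numBytes;
  }

  long blockId(int slot) { return slab[slot * FIELDS_PER_BLOCK]; }
}
{code}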

There was some hand-wringing about off-heap memory being slower, but I do not believe that
concern is valid.  The Apache Spark project found that their off-heap hash table was actually faster
than the on-heap one, due to the ability to better control the memory layout: https://databricks.com/blog/2015/04/28/project-tungsten-bringing-spark-closer-to-bare-metal.html
The key is to avoid using {{DirectByteBuffer}}, which is rather slow, and to use {{Unsafe}}
instead.
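
For comparison, here is a minimal sketch of what off-heap access through {{Unsafe}} looks like
(the class name and the 8-bytes-per-slot layout are illustrative assumptions, not an existing HDFS
API).  Note that {{compareAndSwapLong}} works on a raw address when the object argument is null,
which is what enables the lock-free concurrent structures mentioned above and which
{{DirectByteBuffer}} does not expose:

{code:java}
import sun.misc.Unsafe;
import java.lang.reflect.Field;

// Sketch of off-heap access via sun.misc.Unsafe rather than DirectByteBuffer.
// The 8-byte-per-slot layout is an illustrative assumption.
class OffHeapLongTable implements AutoCloseable {
  private static final Unsafe UNSAFE = loadUnsafe();
  private final long base;

  OffHeapLongTable(long slots) {
    // Raw malloc'd memory, outside the Java heap and outside -Xmx.
    base = UNSAFE.allocateMemory(slots * 8);
    UNSAFE.setMemory(base, slots * 8, (byte) 0);
  }

  long get(long slot) { return UNSAFE.getLong(base + slot * 8); }

  void put(long slot, long value) { UNSAFE.putLong(base + slot * 8, value); }

  // Lock-free update: compareAndSwapLong accepts an absolute address when the
  // object argument is null.
  boolean cas(long slot, long expected, long update) {
    return UNSAFE.compareAndSwapLong(null, base + slot * 8, expected, update);
  }

  @Override
  public void close() { UNSAFE.freeMemory(base); }

  private static Unsafe loadUnsafe() {
    try {
      Field f = Unsafe.class.getDeclaredField("theUnsafe");
      f.setAccessible(true);
      return (Unsafe) f.get(null);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }
}
{code}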

However, Daryn has posted some patches using the "large arrays" approach.  Since they are
a nice incremental improvement, we will probably pick them up if there are no blockers.
We are also looking at other incremental improvements, such as implementing backpressure for full
block reports and speeding up edit log replay (if possible).  I would also like to look at
parallelizing the full block report... if we can do that, we can get a dramatic improvement
in FBR times by using more than one core.
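
Just to illustrate the direction, here is a hypothetical sketch of parallel FBR processing;
{{ReportSegment}} and {{processSegment()}} are placeholders rather than the real BlockManager API,
and the hard part (per-segment locking or lock-free block-map state) is elided:

{code:java}
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: split a full block report into independent segments
// (e.g. one per storage) and process each on its own core.
class ParallelFbrProcessor {
  private final ExecutorService pool =
      Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

  void process(List<ReportSegment> segments) throws InterruptedException {
    CountDownLatch done = new CountDownLatch(segments.size());
    for (ReportSegment seg : segments) {
      pool.submit(() -> {
        try {
          processSegment(seg);   // would need per-segment locking or lock-free state
        } finally {
          done.countDown();
        }
      });
    }
    done.await();   // the FBR is complete only once every segment has been applied
    // pool shutdown omitted for brevity
  }

  private void processSegment(ReportSegment seg) { /* placeholder */ }

  static final class ReportSegment { /* placeholder for a slice of the report */ }
}
{code}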

> BlockManager Scalability Improvements
> -------------------------------------
>
>                 Key: HDFS-7836
>                 URL: https://issues.apache.org/jira/browse/HDFS-7836
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Charles Lamb
>            Assignee: Charles Lamb
>         Attachments: BlockManagerScalabilityImprovementsDesign.pdf
>
>
> Improvements to BlockManager scalability.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
