hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
Date Thu, 30 Oct 2014 20:17:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190742#comment-14190742 ]

Colin Patrick McCabe commented on HDFS-7276:
--------------------------------------------

So, if we're managing powers of 2, we should have 16 managers at most for arrays up to 2**16.
Actually fewer than that if we decide not to handle degenerate cases like n = 1, n = 2,
n = 4... we can simply make 8 or 16 the smallest array we give out. I don't even think Packet
objects can be that small once you include the header, although I haven't checked the exact
minimum.

Can't we simply create those 12 or so managers immediately, and avoid all the "create a new
manager if statistics say so" logic?
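
A minimal sketch of the eager approach, assuming 16 bytes as the smallest array handed out and 2**16 as the largest. The class name PowerOfTwoPools, its methods, and both bounds are hypothetical and are not taken from the patch. Every size class gets its free list at construction time, so no statistics are needed to decide when to create one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: one free list per power-of-two size, all created eagerly. */
final class PowerOfTwoPools {
  static final int MIN_SHIFT = 4;   // assumed smallest array: 2^4 = 16 bytes
  static final int MAX_SHIFT = 16;  // largest pooled array: 2^16 = 65536 bytes

  // One free list per size class, allocated up front (13 classes with these bounds).
  private final List<ArrayDeque<byte[]>> freeLists = new ArrayList<>();

  PowerOfTwoPools() {
    for (int shift = MIN_SHIFT; shift <= MAX_SHIFT; shift++) {
      freeLists.add(new ArrayDeque<>());
    }
  }

  /** Index of the smallest size class that can hold n bytes. */
  static int indexFor(int n) {
    if (n <= 0 || n > (1 << MAX_SHIFT)) {
      throw new IllegalArgumentException("size out of range: " + n);
    }
    // ceil(log2(n)); numberOfLeadingZeros(0) is 32, so n = 1 maps to shift 0.
    int shift = 32 - Integer.numberOfLeadingZeros(n - 1);
    return Math.max(shift, MIN_SHIFT) - MIN_SHIFT;
  }

  /** Returns a recycled array if one is available, otherwise a new one. */
  synchronized byte[] allocate(int n) {
    int index = indexFor(n);
    byte[] array = freeLists.get(index).poll();
    return array != null ? array : new byte[1 << (index + MIN_SHIFT)];
  }

  /** Puts an array obtained from allocate() back into its size class. */
  synchronized void recycle(byte[] array) {
    int len = array.length;
    if (Integer.bitCount(len) != 1 || len < (1 << MIN_SHIFT) || len > (1 << MAX_SHIFT)) {
      throw new IllegalArgumentException("not a pooled size: " + len);
    }
    freeLists.get(indexFor(len)).push(array);
  }
}
```

With these assumed bounds, a request for 100 bytes is rounded up to the 128-byte class, and requests below 16 bytes share the 16-byte class.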

{code}
    wait(1000);
{code}
Why does this wait have a timeout?  There isn't anything that we're polling here.

{code}
...    synchronized int recycle(byte[] array) {
...
	        notifyAll();
{code}
Shouldn't this simply be "notify"?  There is no point in waking up all waiters, because only
one of them is going to get the buffer we're recycling.
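
The two points above (an untimed wait and a single notify) can be sketched as follows. This is a hypothetical bounded pool, not the code from the patch: BoundedArrayPool, allocate(), and the fields are illustrative names. It assumes every waiter blocks on the same condition and every recycle() returns exactly one array; only under those assumptions is notify() sufficient.

```java
import java.util.ArrayDeque;

/** Hypothetical sketch of a bounded pool of equally sized byte arrays. */
final class BoundedArrayPool {
  private final int arrayLength;
  private final int limit;
  private final ArrayDeque<byte[]> free = new ArrayDeque<>();
  private int created;  // arrays created so far, whether handed out or free

  BoundedArrayPool(int arrayLength, int limit) {
    this.arrayLength = arrayLength;
    this.limit = limit;
  }

  /** Blocks until an array is free or the pool may still grow. */
  synchronized byte[] allocate() throws InterruptedException {
    // No timeout: nothing is polled, recycle() wakes us when state changes.
    // The loop re-checks the condition to guard against spurious wakeups.
    while (free.isEmpty() && created >= limit) {
      wait();
    }
    byte[] array = free.poll();
    if (array == null) {
      created++;
      array = new byte[arrayLength];
    }
    return array;
  }

  /** Returns one array to the pool and wakes at most one waiter. */
  synchronized void recycle(byte[] array) {
    if (array.length != arrayLength) {
      throw new IllegalArgumentException("wrong length: " + array.length);
    }
    free.push(array);
    // Exactly one array became available, so one waiter can make progress.
    notify();
  }
}
```

If waiters could block on different conditions (for example, different array sizes sharing one monitor), notify() might wake a thread that cannot use the recycled array, and notifyAll() or separate monitors would be needed instead.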

> Limit the number of byte arrays used by DFSOutputStream
> -------------------------------------------------------
>
>                 Key: HDFS-7276
>                 URL: https://issues.apache.org/jira/browse/HDFS-7276
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>         Attachments: h7276_20141021.patch, h7276_20141022.patch, h7276_20141023.patch,
> h7276_20141024.patch, h7276_20141027.patch, h7276_20141027b.patch, h7276_20141028.patch,
> h7276_20141029.patch, h7276_20141029b.patch
>
>
> When there are a lot of DFSOutputStreams writing concurrently, the number of outstanding
> packets could be large.  The byte arrays created by those packets could occupy a lot of memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
