hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7276) Limit the number of byte arrays used by DFSOutputStream
Date Thu, 30 Oct 2014 20:06:34 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14190727#comment-14190727 ]

Colin Patrick McCabe commented on HDFS-7276:

bq. The manager is going to be shared by all DFSOutputStreams for any client and any user.
How could it be put into ClientContext?

Have DFSOutputStream.java call dfsClient.getClientContext().getByteArrayManager() to get the
relevant byte array manager.  This call can be in the constructor.
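
A minimal sketch of that wiring, for illustration only. The classes below are simplified
stand-ins, not the real HDFS classes; only the call chain
dfsClient.getClientContext().getByteArrayManager() comes from the suggestion above, and
everything else (fields, constructors, the allocatedCount() and byteArrayManager() helpers)
is assumed.

```java
// Simplified stand-ins for illustration only; these are NOT the real HDFS classes.
class ByteArrayManager {
    private int allocated = 0;

    // Hands out a new array and counts it, so the sharing is observable.
    synchronized byte[] newByteArray(int length) {
        allocated++;
        return new byte[length];
    }

    synchronized int allocatedCount() {
        return allocated;
    }
}

class ClientContext {
    // One manager per context, shared by every client that uses this context.
    private final ByteArrayManager byteArrayManager = new ByteArrayManager();

    ByteArrayManager getByteArrayManager() {
        return byteArrayManager;
    }
}

class DFSClient {
    private final ClientContext clientContext;

    DFSClient(ClientContext clientContext) {
        this.clientContext = clientContext;
    }

    ClientContext getClientContext() {
        return clientContext;
    }
}

class DFSOutputStream {
    private final ByteArrayManager byteArrayManager;

    DFSOutputStream(DFSClient dfsClient) {
        // The lookup happens once, in the constructor.
        this.byteArrayManager = dfsClient.getClientContext().getByteArrayManager();
    }

    // Hypothetical accessor, present only so the sketch can be inspected.
    ByteArrayManager byteArrayManager() {
        return byteArrayManager;
    }
}
```

With this shape, two streams created from different DFSClient instances still share one
manager as long as the clients share a context, and no static field is involved.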

ClientContext was created because static caches (i.e. caches global to the JVM) were very
inflexible.  They limit what applications can do, and they make unit testing difficult, since
every test ends up reusing the same global cache.  Meanwhile, we were unable to put caching into DFSClient
itself, since FileContext creates many DFSClient instances over time, not just one like DistributedFileSystem
does.  ClientContext is a good solution because it allows users to opt into a different context
if they want, and gives them the default context otherwise.  It works with both FileContext and FileSystem.
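
The idea of "a default context, plus opt-in named contexts" can be sketched roughly as
follows. This is not the actual ClientContext code; SketchContext, its "default" name, and
its put/get cache methods are all hypothetical and only illustrate why a named context is
easier to isolate in tests than a single static cache.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of named contexts; not the real ClientContext.
final class SketchContext {
    // Assumed name for the context that callers get when they do not opt out.
    static final String DEFAULT_NAME = "default";

    // Registry of contexts by name. Each context owns its own cache.
    private static final Map<String, SketchContext> CONTEXTS = new HashMap<>();

    private final String name;
    private final Map<String, byte[]> cache = new HashMap<>();

    private SketchContext(String name) {
        this.name = name;
    }

    // Returns the context registered under this name, creating it on first use.
    static synchronized SketchContext get(String name) {
        return CONTEXTS.computeIfAbsent(name, SketchContext::new);
    }

    // Callers that do not care which context they use share this one.
    static SketchContext getDefault() {
        return get(DEFAULT_NAME);
    }

    String getName() {
        return name;
    }

    synchronized void put(String key, byte[] value) {
        cache.put(key, value);
    }

    synchronized byte[] get0(String key) {
        return cache.get(key);
    }
}
```

A unit test could then ask for a context under its own name and never see entries left
behind by another test, while ordinary callers keep sharing the default context.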

> Limit the number of byte arrays used by DFSOutputStream
> -------------------------------------------------------
>                 Key: HDFS-7276
>                 URL: https://issues.apache.org/jira/browse/HDFS-7276
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>         Attachments: h7276_20141021.patch, h7276_20141022.patch, h7276_20141023.patch,
h7276_20141024.patch, h7276_20141027.patch, h7276_20141027b.patch, h7276_20141028.patch, h7276_20141029.patch,
> When there are a lot of DFSOutputStreams writing concurrently, the number of outstanding
> packets could be large.  The byte arrays created by those packets could occupy a lot of memory.

