hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4797) RPC Server can leave a lot of direct buffers
Date Sat, 06 Dec 2008 02:41:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12654027#action_12654027
] 

Raghu Angadi commented on HADOOP-4797:
--------------------------------------


JVM :
      - For NIO sockets, Sun's implementation uses an internal direct buffer. It keeps up
to 3 such buffers for each thread and creates a new one whenever the existing buffers are
not large enough (see the sketch below).
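
For illustration, here is a minimal standalone sketch (not part of any patch) of how this
behaviour can be observed on Sun/Oracle JVMs: writing a large heap ByteBuffer through a
channel allocates a temporary direct buffer at least as large as the data. The class name
DirectBufferDemo, the use of a Pipe instead of a real RPC socket, and the BufferPoolMXBean
API (Java 7+) are all just for the demo:

{code}
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class DirectBufferDemo {
  public static void main(String[] args) throws Exception {
    Pipe pipe = Pipe.open();
    pipe.sink().configureBlocking(false);   // a partial write is enough for the demo

    printDirectPool("before");
    ByteBuffer heapBuf = ByteBuffer.allocate(6 * 1024 * 1024);   // 6MB heap buffer
    pipe.sink().write(heapBuf);   // on HotSpot this copies through a ~6MB temporary direct buffer
    printDirectPool("after");
  }

  // Reports the JVM's "direct" buffer pool; on HotSpot the pool capacity grows by
  // roughly the size of the heap buffer written above.
  static void printDirectPool(String when) {
    for (BufferPoolMXBean pool :
         ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
      if ("direct".equals(pool.getName())) {
        System.out.printf("%s: count=%d, capacity=%d bytes%n",
            when, pool.getCount(), pool.getTotalCapacity());
      }
    }
  }
}
{code}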

RPC Server :
      - While sending and receiving serialized data, the handlers invoke read() or write()
with the _entire_ buffer.
      - If there are RPCs that return a lot of data (e.g. listFiles() on a large directory),
the server ends up creating large direct buffers.
      - In one observed case, clients listed a large directory (35k files, 6MB of serialized data).
             -- In addition, the clients kept increasing the number of files between such calls.
             -- As a result, the server ends up creating thousands of 6MB buffers, since each
time the JVM requires a slightly larger direct buffer and cannot reuse a cached one.
             -- A full GC might help, but it is not a viable option.
             -- It is not clear that this memory would be returned to the OS even after a full GC.

I think the fix is fairly straightforward: the RPC server should read or write in smaller chunks, e.g.:

{code}
    // Replace
    nWritten = write(buf, 0, len);

    // with
    nWritten = 0;
    while (nWritten < len) {
        int ret = write(buf, nWritten, Math.min(len - nWritten, 64 * 1024));
        if (ret <= 0) break;
        nWritten += ret;
        // ...
    }
{code}
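
A slightly more complete standalone sketch of the same idea (the helper name writeChunked
and the 64KB cap are illustrative, not an actual patch): capping the buffer's limit before
each write() bounds the size of the temporary direct buffer the JVM allocates. The read path
can be chunked the same way.

{code}
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

public class ChunkedWrite {
  // Cap on how many bytes a single channel.write() call sees, so the JVM's
  // per-thread temporary direct buffer never needs to grow beyond this size.
  private static final int CHUNK_SIZE = 64 * 1024;

  /**
   * Writes buf to the channel, at most CHUNK_SIZE bytes per write() call.
   * Returns the number of bytes written, which may be less than the bytes
   * remaining if a non-blocking channel stops accepting data.
   */
  public static int writeChunked(WritableByteChannel channel, ByteBuffer buf)
      throws IOException {
    final int originalLimit = buf.limit();
    int written = 0;
    try {
      while (buf.position() < originalLimit) {
        // Temporarily shrink the limit so write() sees at most CHUNK_SIZE bytes.
        buf.limit(Math.min(buf.position() + CHUNK_SIZE, originalLimit));
        int ret = channel.write(buf);
        if (ret <= 0) {
          break;               // no progress; let the caller retry later
        }
        written += ret;
      }
    } finally {
      buf.limit(originalLimit);   // restore the caller's view of the buffer
    }
    return written;
  }
}
{code}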


> RPC Server can leave a lot of direct buffers 
> ---------------------------------------------
>
>                 Key: HADOOP-4797
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4797
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: ipc
>    Affects Versions: 0.17.0
>            Reporter: Raghu Angadi
>            Assignee: Raghu Angadi
>
> The RPC server can unwittingly soft-leak direct buffers. In one observed case, one of the
> namenodes at Yahoo took 40GB of virtual memory even though it was configured for 24GB. Most
> of the memory outside the Java heap is expected to be direct buffers. This was shown to be
> caused by how the RPC server reads and writes serialized data. The cause and proposed fix are
> described in the following comment.
>   

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

