hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4940) namenode OOMs under Bigtop's TestCLI
Date Fri, 28 Jun 2013 20:22:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695747#comment-13695747 ]

Colin Patrick McCabe commented on HDFS-4940:
--------------------------------------------

So I think the issue here is that the RPC layer reads a 4-byte length from the client and
allocates a buffer of that size, no matter how large it is.

Code here:

{code}
          dataLength = dataLengthBuffer.getInt();
          if ((dataLength == Client.PING_CALL_ID) && (!useWrap)) {
            // covers the !useSasl too
            dataLengthBuffer.clear();
            return 0; // ping message
          }
          
          if (dataLength < 0) {
            LOG.warn("Unexpected data length " + dataLength + "!! from " + 
                getHostAddress());
          }
          data = ByteBuffer.allocate(dataLength);
{code}

It seems silly to allocate such large RPC buffers.  It allows clients to bring down the NN
with just one or two RPCs.

Why don't we make the maximum RPC size configurable, and perhaps default it to 64 MB or so?
Even a block report of a few million blocks should fit in that size.
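
A rough sketch of what such a bound could look like, in case it helps the discussion. This
is not the actual Server code or a proposed patch: the class name BoundedRpcBuffer, the
method allocate, and the constant DEFAULT_MAX_DATA_LENGTH are all hypothetical, and the
64 MB default is just the figure suggested above. The idea is only that the length read
from the client gets validated before any buffer is allocated.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Hadoop: validates a client-supplied
// length prefix before any memory is allocated for the RPC payload.
final class BoundedRpcBuffer {
  // Assumed default cap of 64 MB, the figure suggested in the comment above.
  static final int DEFAULT_MAX_DATA_LENGTH = 64 * 1024 * 1024;

  private BoundedRpcBuffer() {}

  // Returns a buffer of exactly dataLength bytes, or throws if the
  // requested length is negative or exceeds maxDataLength.
  static ByteBuffer allocate(int dataLength, int maxDataLength)
      throws IOException {
    if (dataLength < 0) {
      // Reject instead of only logging, so allocation is never attempted.
      throw new IOException("Unexpected data length " + dataLength);
    }
    if (dataLength > maxDataLength) {
      throw new IOException("Requested data length " + dataLength
          + " is longer than maximum RPC length " + maxDataLength);
    }
    return ByteBuffer.allocate(dataLength);
  }
}
```

With something like this, a client that sends a bogus length such as Integer.MAX_VALUE gets
an IOException on that connection instead of forcing a huge allocation on the NN heap.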
                
> namenode OOMs under Bigtop's TestCLI
> ------------------------------------
>
>                 Key: HDFS-4940
>                 URL: https://issues.apache.org/jira/browse/HDFS-4940
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Roman Shaposhnik
>            Priority: Blocker
>             Fix For: 2.1.0-beta
>
>
> Bigtop's TestCLI, when executed against Hadoop 2.1.0, seems to make it OOM quite reliably
> regardless of the heap size settings. I'm attaching a heap dump URL. Alternatively, anybody
> can just take Bigtop's tests, compile them against Hadoop 2.1.0 bits and try to reproduce
> it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
