hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5788) listLocatedStatus response can be very large
Date Thu, 16 Jan 2014 22:43:22 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874082#comment-13874082 ]

Suresh Srinivas commented on HDFS-5788:

bq. Then due to lack of flow control in the RPC layer we can fill up the heap with these given
a large enough average response buffer per call and enough clients.
[~jlowe], thanks for the pointer.

We can certainly reduce the number of files returned in each iteration, but that would
increase the number of requests the NameNode has to process.
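The trade-off above suggests capping each batch by estimated response size rather than by a fixed entry count alone. A minimal sketch of that idea follows; the class, the `Entry` shape, the 64 KB budget, and the per-location byte constants are all illustrative assumptions, not the actual NameNode API or wire format.

```java
import java.util.List;

// Hypothetical sketch: choose a listing batch size bounded by BOTH a
// maximum entry count and an estimated response-size budget, so a
// directory of heavily-replicated, many-block files yields smaller batches.
public class SizeAwareLister {
    static final int MAX_ENTRIES = 1000;              // existing per-batch entry limit
    static final long MAX_RESPONSE_BYTES = 64 * 1024; // assumed size budget

    static class Entry {
        final String name;
        final int blocks;
        final int replication;
        Entry(String name, int blocks, int replication) {
            this.name = name;
            this.blocks = blocks;
            this.replication = replication;
        }
        // Rough per-entry cost: a fixed overhead plus one location record per
        // (block, replica) pair. The 64- and 32-byte constants are assumptions.
        long estimatedLocationBytes() {
            return 64 + (long) blocks * replication * 32;
        }
    }

    // Returns how many leading entries of 'dir' fit in one response under
    // both limits, always returning at least one so listing makes progress.
    static int batchSize(List<Entry> dir) {
        long bytes = 0;
        int count = 0;
        for (Entry e : dir) {
            long cost = e.estimatedLocationBytes();
            if (count > 0 && (count >= MAX_ENTRIES || bytes + cost > MAX_RESPONSE_BYTES)) {
                break;
            }
            bytes += cost;
            count++;
        }
        return count;
    }
}
```

With these assumed constants, a directory of files with 4 blocks at 3-way replication (448 estimated bytes each) would batch roughly 146 entries at a time, while location-free entries would still hit the old 1000-entry cap first.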

> listLocatedStatus response can be very large
> --------------------------------------------
>                 Key: HDFS-5788
>                 URL: https://issues.apache.org/jira/browse/HDFS-5788
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0, 0.23.10, 2.2.0
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
> Currently we limit the size of listStatus requests to a default of 1000 entries. This
> works fine except in the case of listLocatedStatus, where the location information can be
> quite large. As an example, for a directory with 7000 entries, 4 blocks per file, and 3-way
> replication, a listLocatedStatus response is over 1 MB. This can consume very large amounts
> of memory in the NN if many clients issue such requests simultaneously.
> Seems like it would be better if we also considered the amount of location information
> being returned when deciding how many files to return.
> Patch will follow shortly.
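The >1 MB figure in the description is plausible from the record count alone: 7000 entries x 4 blocks x 3 replicas is 84,000 location records, so even a modest per-record wire size pushes the response past 1 MB. A back-of-envelope check, where the 16-byte per-record size is purely an assumption (real serialized DatanodeInfo entries are larger):

```java
// Back-of-envelope estimate of a listLocatedStatus response for the
// directory described in the issue: 7000 entries, 4 blocks each, 3-way
// replication. The per-record byte size is an assumed lower bound.
public class ResponseSizeEstimate {
    static long locationRecords(int entries, int blocksPerFile, int replication) {
        return (long) entries * blocksPerFile * replication;
    }

    public static void main(String[] args) {
        long records = locationRecords(7000, 4, 3);       // 84,000 records
        long bytesPerRecord = 16;                          // assumption
        long totalBytes = records * bytesPerRecord;        // 1,344,000 bytes
        System.out.println(records + " location records, ~" + totalBytes + " bytes");
    }
}
```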

This message was sent by Atlassian JIRA
