Date: Thu, 16 Jan 2014 15:29:19 +0000 (UTC)
From: "Nathan Roberts (JIRA)"
To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Created] (HDFS-5788) listLocatedStatus response can be very large

Nathan Roberts created HDFS-5788:
------------------------------------

             Summary: listLocatedStatus response can be very large
                 Key: HDFS-5788
                 URL: https://issues.apache.org/jira/browse/HDFS-5788
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
    Affects Versions: 2.2.0, 0.23.10, 3.0.0
            Reporter: Nathan Roberts
            Assignee: Nathan Roberts

Currently we limit each listStatus request to a default of 1000 entries. This works fine except in the case of listLocatedStatus, where the location information can be quite large. As an example, for a directory with 7000 entries, 4 blocks each, and 3-way replication, a listLocatedStatus response is over 1 MB.
This can chew up very large amounts of memory in the NameNode (NN) if lots of clients try to do this simultaneously. It seems like it would be better if we also considered the amount of location information being returned when deciding how many files to return.

Patch will follow shortly.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
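
As a rough illustration of the idea proposed in this issue, the sketch below caps a listing batch by both the number of entries and the number of block locations it would carry. It is not the patch referred to above and does not use any HDFS classes: `ListingBudget`, `Entry`, `entriesToReturn`, and `uniformDirectory` are hypothetical names, and counting one location per block replica is an assumption made only for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical names, not part of HDFS. */
class ListingBudget {

    /** Hypothetical stand-in for one file in a directory listing. */
    static final class Entry {
        final int blocks;
        final int replication;

        Entry(int blocks, int replication) {
            this.blocks = blocks;
            this.replication = replication;
        }

        /** Assumption: one location record per block replica. */
        long locations() {
            return (long) blocks * replication;
        }
    }

    /**
     * Returns how many entries, starting at index start, fit in one batch.
     * The batch ends when either the entry limit is reached or the location
     * budget is used up. An entry is added whenever some budget remains, so
     * a single large file cannot stall the listing.
     */
    static int entriesToReturn(List<Entry> entries, int start,
                               int maxEntries, long maxLocations) {
        int count = 0;
        long locationBudget = maxLocations;
        for (int i = start;
             i < entries.size() && count < maxEntries && locationBudget > 0;
             i++) {
            locationBudget -= entries.get(i).locations();
            count++;
        }
        return count;
    }

    /** Builds a directory of n identical entries for the example. */
    static List<Entry> uniformDirectory(int n, int blocks, int replication) {
        List<Entry> dir = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            dir.add(new Entry(blocks, replication));
        }
        return dir;
    }

    public static void main(String[] args) {
        // The directory from the example: 7000 entries, 4 blocks each, 3 replicas.
        List<Entry> dir = uniformDirectory(7000, 4, 3);

        long totalLocations = 0;
        for (Entry e : dir) {
            totalLocations += e.locations();
        }
        System.out.println("total locations: " + totalLocations);

        // Entry limit only (location budget effectively unlimited).
        System.out.println("entries, entry limit only: "
                + entriesToReturn(dir, 0, 1000, Long.MAX_VALUE));

        // Entry limit plus an assumed budget of 1000 locations per batch.
        System.out.println("entries, with location budget: "
                + entriesToReturn(dir, 0, 1000, 1000));
    }
}
```

With these assumed numbers each entry carries 12 locations, so a budget of 1000 locations ends the batch after 84 entries instead of 1000. The budget value itself is arbitrary here; a real limit would have to be chosen based on actual response sizes.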