From: Harsh J
Date: Sat, 23 Mar 2013 10:44:57 +0530
Subject: Re: how to control (or understand) the memory usage in hdfs
To: user@hadoop.apache.org

I run a DN with a 128 MB heap for my simple purposes on my Mac, and it
runs well for the load I apply to it.

A DN's primary, growing memory consumption comes from the number of
blocks it carries. All of these blocks' file paths are mapped and kept
in RAM for the DN's lifetime. If your DN has acquired a lot of blocks
by now, say close to a million or more, then 1 GB may no longer suffice
to hold them, and you'd need to scale up (increase the heap size,
adding RAM if you need more) or scale out (add another node and run the
balancer). Sketches of the heap knob and a block-count check follow
below.
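The DN heap itself is usually raised in conf/hadoop-env.sh (in the 1.x
layout). A minimal sketch, with example values rather than
recommendations:

    # conf/hadoop-env.sh
    # HADOOP_HEAPSIZE sets the default max heap, in MB, for all
    # daemons started from this install.
    export HADOOP_HEAPSIZE=1000
    # HADOOP_DATANODE_OPTS appends JVM flags to the DataNode alone,
    # so an -Xmx here overrides the default just for the DN.
    export HADOOP_DATANODE_OPTS="-Xmx2g $HADOOP_DATANODE_OPTS"

The DN has to be restarted for a heap change to take effect.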
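To check whether block count is really what is eating the heap, fsck
reports the filesystem's total block count (on a single-node setup the
lone DN holds all of them). A rough sketch; the exact output wording
can vary across versions:

    # Print just the block-count line from the fsck summary.
    bin/hadoop fsck / | grep -i 'total blocks'

If that number is up in the hundreds of thousands or millions against
a 1 GB heap, the block map is a plausible culprit: each block costs
the DN some heap for its path and metadata, so the requirement grows
roughly linearly with the blocks held.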
On Sat, Mar 23, 2013 at 10:03 AM, Ted wrote:
> Hi, I'm new to hadoop/hdfs and I'm just running some tests on my
> local machine in a single-node setup. I'm encountering out-of-memory
> errors on the JVM running my data node.
>
> I'm pretty sure I can just increase the heap size to fix the errors,
> but my question is about how memory is actually used.
>
> As an example, with other things like an OS's disk cache or, say,
> databases, if you let one use 1 GB of RAM, it will "work" with what
> it has available; if the data exceeds 1 GB, it just swaps between
> memory and disk more often, i.e. the cached portion is smaller. If
> you give it 8 GB of RAM it functions the same way, just with better
> performance.
>
> With my HDFS setup this does not appear to be true: if I allocate it
> a 1 GB heap, it doesn't just perform worse or swap data to disk more
> often. It outright fails with an out-of-memory error and shuts the
> data node down.
>
> So my question is: how do I really tune the memory, or decide how
> much memory I need, to prevent shutdowns? Is 1 GB just too small
> even in a single-machine test environment with almost no data at
> all, or is it supposed to work like an OS disk cache, where it
> always works but just performs better or worse, and I simply have
> something configured wrong? Basically, my objective isn't
> performance; it's that the server must not shut itself down. It can
> slow down, but not shut off.
>
> --
> Ted.

--
Harsh J