From: Mohammad Tariq <dontariq@gmail.com>
Date: Thu, 14 Feb 2013 23:57:00 +0530
Subject: Re: Host NameNode, DataNode, JobTracker or TaskTracker on the same machine
To: user@hadoop.apache.org

With the current configuration you are safe. But as your data grows you will consume more space, and you might eventually end up with insufficient space to hold the metadata itself, since it is stored on the same disk. Also, bigger data means more files and blocks, which means more objects, which in turn means greater memory consumption. And don't forget the resource consumption of your processing layer: the disk space required to store intermediate output files, the resources required to launch map and reduce tasks, and so on.

But it all depends on the size of your data and the intensity of the processing you are going to perform. As of now, you look good to me with 128 TB + 64 GB.

HTH

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com


On Thu, Feb 14, 2013 at 11:35 PM, Jeff LI wrote:
> Thanks for your response. I'm running SNN on another machine.
>
> Could you explain a bit more on why I may run out of memory or disk?
>
> I understand that the NameNode holds file system metadata in memory. I found
> through this post (
> http://developer.yahoo.com/blogs/hadoop/posts/2010/05/scalability_of_the_hadoop_dist/
> ) that, as a rule of thumb,
> 1 GB metadata ≈ 1 PB physical storage
>
> Currently, my cluster has about 128 TB of disk storage in total and 64 GB of
> memory on each machine. Does this suggest that I'm protected against
> running out of memory from metadata?
>
> Thanks
>
> Cheers
>
> Jeff
>
>
> On Thu, Feb 14, 2013 at 12:41 PM, Tariq wrote:
>
>> You may run out of memory, or out of disk. If the SNN is also running on the same
>> machine, then you are totally screwed in case of any breakdown.
>>
>> shashwat shriparv wrote:
>>
>>> If you are doing this for production, all the processes should run on
>>> separate machines, as that reduces the load on each machine.
>>>
>>> ∞
>>> Shashwat Shriparv
>>>
>>> On Thu, Feb 14, 2013 at 10:40 PM, Jeff LI wrote:
>>>
>>>> Hello,
>>>>
>>>> Is there a good reason that we should not host NameNode, DataNode,
>>>> JobTracker or TaskTracker services on the same machine?
>>>>
>>>> Not doing so is suggested here:
>>>> http://wiki.apache.org/hadoop/NameNode
>>>> but I'd like to know the reasoning behind this.
>>>>
>>>> Thanks
>>>>
>>>> Cheers
>>>>
>>>> Jeff
>>
>> --
>> Sent from my Android phone with K-9 Mail. Please excuse my brevity.
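The rule of thumb quoted above can be sanity-checked with some back-of-envelope arithmetic. The sketch below is only an estimate under stated assumptions: the ~150 bytes per NameNode object, the 128 MB block size, and 3x replication are common defaults and community figures, not measurements from this cluster.

```python
# Rough NameNode heap estimate for HDFS metadata.
# Assumptions (not from the thread): ~150 bytes of heap per namespace
# object, one block per file as a simple upper bound on object count,
# 128 MB blocks, and replication factor 3.

def estimated_heap_gb(total_storage_tb, block_size_mb=128, replication=3):
    """Estimate NameNode heap (GB) if the cluster's raw storage fills up."""
    usable_tb = total_storage_tb / replication       # raw -> logical data
    blocks = usable_tb * 1024 * 1024 / block_size_mb # logical TB -> block count
    objects = 2 * blocks                             # ~1 file + 1 block object each
    return objects * 150 / 1024**3                   # bytes -> GB

# Jeff's cluster: 128 TB raw storage, 64 GB RAM per machine.
print(f"{estimated_heap_gb(128):.2f} GB of heap for metadata")
```

Even this conservative estimate lands well under 1 GB of metadata for 128 TB of raw storage, which is consistent with the 1 GB ≈ 1 PB rule of thumb and with Tariq's conclusion that 64 GB of RAM is comfortable for now.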