hadoop-general mailing list archives

From Grant Mackey <gmac...@cs.ucf.edu>
Subject Re: Namenode with External Storage?
Date Thu, 22 Oct 2009 19:16:24 GMT
Oops, that is correct. My mistake.
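
For reference, a minimal sketch of how the setup Sanjay describes below might look in the site files, assuming a 0.20-era configuration. The directory paths are hypothetical placeholders, and the property names and file locations are as they appear in that release line, so check them against the defaults shipped with your version.

```xml
<!-- hdfs-site.xml: the namenode writes its image and edit log to every
     directory in this comma-separated list. Both paths are hypothetical:
     one local disk and one NFS mount. -->
<property>
  <name>dfs.name.dir</name>
  <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
</property>

<!-- core-site.xml: seconds between periodic checkpoints (3600 is the
     usual default). This only controls the checkpoint interval; edit log
     writes still happen synchronously on every update. -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
</property>
```

Listing an NFS mount alongside a local directory keeps a copy of the namenode state off the machine, which is the arrangement described in the quoted reply.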

Quoting Sanjay Radia <sradia@yahoo-inc.com>:

> On Oct 22, 2009, at 9:37 AM, <gmackey@cs.ucf.edu> wrote:
>> As with Dhruba's comment, so long as it is just the namenode that is
>> running on a networked file system everything should be chill. The namenode
>> keeps all of its working metadata in main mem, and it only occasionally
>> pushes a log file out to hard storage (and if I remember correctly you can
>> adjust this time window in one of the site files).
> Actually it pushes out the update logs on each and every update
> synchronously.
> The checkpoint however is pushed out periodically.
> Also, at Yahoo, we push out NN state to multiple disks and one of
> the "disks" is an NFS filer. This is configurable.
> sanjay
>> However, you are going to run into huge performance issues running
>> datanodes over a networked storage system. Having to push that many file
>> requests over a network for a respectable mapreduce job is going to kill
>> your equipment.
>> - Grant
>> On Oct 21 2009, Jonathan Seidman wrote:
>>> Apologies if this has been answered previously, but I'm unable to find
>>> anything that seems to cover this.
>>> It's clear that datanodes require local storage for Hadoop to function
>>> efficiently, but is there any significant disadvantage to using external
>>> storage for namenodes? We're exploring the possibility of using a
>>> different class of hardware for our namenodes with attached storage and
>>> little or no internal storage. Some of the benefits this would provide us
>>> are: 1) allowing our sysadmins to deploy hardware that they're familiar
>>> with and already have considerable experience keeping up in a production
>>> environment. 2) no namenode downtime to replace a failed disk.
>>> We don't anticipate that this approach would cause any significant
>>> degradation to performance, but let me know if there's something we're not
>>> considering.
>>> Thanks.
>>> Jonathan
>> --
>> Grant Mackey
>> PhD student Computer Engineering
>> University of Central Florida
>> Rm 231 cube 5 (321) 960-8851

Grant Mackey
UCF Research Assistant
Engineering III
Rm 238 Cubicle 1
