Subject: Re: CheckPoint Node
From: Jean-Marc Spaggiari
To: user@hadoop.apache.org
Date: Fri, 30 Nov 2012 21:25:23 -0500

Sorry about that. My fault. I had put this in the core-site.xml file but
it should be in hdfs-site.xml... I moved it and it's now working fine.

Thanks.

JM

2012/11/30, Jean-Marc Spaggiari :
> Hi,
>
> Is there a way to ask Hadoop to display its parameters?
>
> I have updated the property as follows:
>
> <property>
>   <name>dfs.name.dir</name>
>   <value>${hadoop.tmp.dir}/dfs/name,/media/usb0/</value>
> </property>
>
> But even if I stop/start Hadoop, nothing is written to the USB drive.
> So I'm wondering if there is a command line like bin/hadoop
> --showparameters.
>
> Thanks,
>
> JM
>
> 2012/11/22, Jean-Marc Spaggiari :
>> Perfect. Thanks again for your time!
>>
>> I will first add another drive on the NameNode, because this will
>> take 5 minutes. Then I will read about the migration from 1.0.3 to
>> 2.0.x and most probably will use the ZooKeeper solution.
>>
>> That will take more time, so it will be done over the week-end.
>>
>> I lost 2 hard drives this week (2 datanodes), so I'm now a bit
>> concerned about the NameNode data. Just want to secure that a bit
>> more.
>>
>> JM
>>
>> 2012/11/22, Harsh J :
>>> Jean-Marc (sorry if I've been spelling your name wrong),
>>>
>>> 0.94 does support Hadoop 2 already, and works pretty well with it,
>>> if that is your only concern. You only need to use the right
>>> download (or, if you compile, use the -Dhadoop.profile=23 Maven
>>> option).
>>>
>>> You will need to restart the NameNode to put changes to the
>>> dfs.name.dir property into effect.
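There is no bin/hadoop --showparameters in 1.0.3, but the *-site.xml files can be checked offline to spot a property sitting in the wrong file (as happened above). A minimal sketch, assuming Python is available on the node; the dump_properties helper is hypothetical and not part of Hadoop:

```python
# Hypothetical helper: print every <property> name/value pair from
# Hadoop *-site.xml files, so a misplaced setting (e.g. dfs.name.dir
# in core-site.xml instead of hdfs-site.xml) is easy to spot.
import sys
import xml.etree.ElementTree as ET


def dump_properties(path):
    """Return {name: value} for all <property> entries in a *-site.xml."""
    props = {}
    for prop in ET.parse(path).getroot().iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            props[name] = value
    return props


if __name__ == "__main__":
    # e.g. python dump_conf.py conf/core-site.xml conf/hdfs-site.xml
    for path in sys.argv[1:]:
        print("== %s ==" % path)
        for name, value in sorted(dump_properties(path).items()):
            print("%s = %s" % (name, value))
```

Note this only shows what the files set, not the daemon's effective configuration, which also includes the compiled-in *-default.xml values.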
>>> A reasonably fast disk is needed for quicker edit log writes (a few
>>> bytes in each round), but a large or SSD-style disk is not a
>>> requisite. An external disk would work fine too (instead of an NFS
>>> mount), as long as it is reliable.
>>>
>>> You do not need to copy data manually - just ensure that your
>>> NameNode process user owns the directory, and it will auto-populate
>>> the empty directory on startup.
>>>
>>> Operationally speaking, in case 1 of the 2 disks fails, the NN web
>>> UI (and metrics as well) will indicate this (see the bottom of the
>>> NN UI page for an example of what I am talking about) and the NN
>>> will continue to run with the lone remaining disk. But it's not a
>>> good idea to let it run for too long without fixing/replacing the
>>> disk, for you will be losing out on redundancy.
>>>
>>> On Thu, Nov 22, 2012 at 11:59 PM, Jean-Marc Spaggiari wrote:
>>>> Hi Harsh,
>>>>
>>>> Again, thanks a lot for all those details.
>>>>
>>>> I read the previous link and I totally understand the HA NameNode.
>>>> I already have a ZooKeeper quorum (3 servers) that I will be able
>>>> to re-use. However, I'm running HBase 0.94.2, which is not yet
>>>> compatible (I think) with Hadoop 2.0.x. So I will have to go with a
>>>> non-HA NameNode until I can migrate to a stable 0.96 HBase version.
>>>>
>>>> Can I "simply" add one directory to dfs.name.dir and restart my
>>>> NameNode? Is it going to write all the required information into
>>>> this directory? Or do I need to copy the data of the existing one
>>>> into the new one before I restart it? Also, does it need a fast
>>>> transfer rate? Or will an external hard drive (quick to be moved to
>>>> another server if required) be enough?
>>>>
>>>> 2012/11/22, Harsh J :
>>>>> Please follow the tips provided at
>>>>> http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3F
>>>>> and
>>>>> http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F
>>>>>
>>>>> In short, if you use a non-HA NameNode setup:
>>>>>
>>>>> - Yes, the NN is a very vital persistence point in running HDFS,
>>>>> and its data should be redundantly stored for safety.
>>>>> - You should, in production, configure your NameNode's image and
>>>>> edits disk (dfs.name.dir in 1.x, or dfs.namenode.name.dir in
>>>>> 0.23+/2.x) to be a dedicated one with adequate free space for
>>>>> gradual growth, and should configure multiple disks (with one
>>>>> off-machine NFS point highly recommended for easy recovery) for
>>>>> adequate redundancy.
>>>>>
>>>>> If you instead use an HA NameNode setup (I'd highly recommend
>>>>> doing this since it is now available), the presence of > 1
>>>>> NameNodes and the journal log mount or quorum setup would
>>>>> automatically act as safeguards for the FS metadata.
>>>>>
>>>>> On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari wrote:
>>>>>> Hi Harsh,
>>>>>>
>>>>>> Thanks for pointing me to this link. I will take a close look at
>>>>>> it.
>>>>>>
>>>>>> So with 1.x and 0.23.x, what's the impact on the data if the
>>>>>> NameNode server's hard drive dies? Is there any critical data
>>>>>> stored locally? Or do I simply need to build a new NameNode,
>>>>>> start it and restart all my datanodes to find my data back?
>>>>>>
>>>>>> I can deal with my application not being available, but losing
>>>>>> data can be a bigger issue.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> JM
>>>>>>
>>>>>> 2012/11/22, Harsh J :
>>>>>>> Hey Jean,
>>>>>>>
>>>>>>> The 1.x and 0.23.x release lines both don't have NameNode HA
>>>>>>> features.
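To make the multi-disk redundancy above concrete: dfs.name.dir accepts a comma-separated list of directories, and the NameNode writes a full copy of the fsimage and edits to each. A sketch for hdfs-site.xml with illustrative paths (the NFS mount point is an example, not a recommendation of a specific layout):

```xml
<!-- hdfs-site.xml, Hadoop 1.x property name; paths are examples -->
<property>
  <name>dfs.name.dir</name>
  <!-- Each directory receives a complete fsimage/edits copy; the last
       entry is an off-machine NFS mount for easy recovery. -->
  <value>/data/1/dfs/name,/data/2/dfs/name,/mnt/remote-nfs/dfs/name</value>
</property>
```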
>>>>>>> The current 2.x releases carry HA-NN abilities, and this is
>>>>>>> documented at
>>>>>>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>>>>>>
>>>>>>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari wrote:
>>>>>>>> Replying to myself ;)
>>>>>>>>
>>>>>>>> By digging a bit more, I figured out that the 1.0 line is older
>>>>>>>> than the 0.23.4 line, and that backup nodes are in 0.23.4.
>>>>>>>> Secondary namenodes in 1.0 are now deprecated.
>>>>>>>>
>>>>>>>> I'm still a bit mixed up on the way to achieve HA for the
>>>>>>>> namenode (1.0 or 0.23.4), but I will continue to dig over the
>>>>>>>> internet.
>>>>>>>>
>>>>>>>> JM
>>>>>>>>
>>>>>>>> 2012/11/22, Jean-Marc Spaggiari :
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm reading a bit about Hadoop and I'm trying to increase the
>>>>>>>>> HA of my current cluster.
>>>>>>>>>
>>>>>>>>> Today I have 8 datanodes and one namenode.
>>>>>>>>>
>>>>>>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can
>>>>>>>>> see that a Checkpoint node might be a good idea.
>>>>>>>>>
>>>>>>>>> So I'm trying to start a checkpoint node. I looked at the
>>>>>>>>> Hadoop online doc. There is a link to describe the command
>>>>>>>>> usage ("For command usage, see namenode.") but this link is
>>>>>>>>> not working. Also, if I try hadoop-daemon.sh start namenode
>>>>>>>>> -checkpoint as described in the documentation, it's not
>>>>>>>>> starting.
>>>>>>>>>
>>>>>>>>> So I'm wondering, is there anywhere I can find up-to-date
>>>>>>>>> documentation about the checkpoint node? Otherwise I will most
>>>>>>>>> probably try the BackupNode.
>>>>>>>>>
>>>>>>>>> I'm using Hadoop 1.0.3. The options I have to start on this
>>>>>>>>> version are namenode, secondarynamenode, datanode, dfsadmin,
>>>>>>>>> mradmin, fsck and fs.
>>>>>>>>> Should I start some secondarynamenodes instead of a
>>>>>>>>> backupnode and a checkpointnode?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>
>>>>>>> --
>>>>>>> Harsh J
>>>>>
>>>>> --
>>>>> Harsh J
>>>
>>> --
>>> Harsh J
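On 1.0.3, the checkpointing role discussed in the thread is filled by the SecondaryNameNode; the CheckpointNode/BackupNode commands belong to the 0.21+/0.23 line. A configuration sketch with example values (paths and period are illustrative; in the 1.x line these fs.checkpoint.* settings live in core-site.xml):

```xml
<!-- core-site.xml on the SecondaryNameNode host; values are examples -->
<property>
  <!-- Where the SNN keeps the merged fsimage/edits checkpoint -->
  <name>fs.checkpoint.dir</name>
  <value>/data/1/dfs/namesecondary</value>
</property>
<property>
  <!-- Seconds between checkpoints; 3600 is the 1.x default -->
  <name>fs.checkpoint.period</name>
  <value>3600</value>
</property>
```

The daemon can then be started directly with bin/hadoop-daemon.sh start secondarynamenode on the chosen host, or that host can be listed in conf/masters so the start-dfs.sh script launches it.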