Subject: Re: NameNode low on available disk space
From: Mohit Vadhera
To: Harsh J
Reply-To: user@hadoop.apache.org
Date: Thu, 28 Feb 2013 16:28:13 +0530
I even created the file /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock and set permissions on it, but when I restart the Hadoop services the file is removed and I see the logs below.

Do I need to format the NN? Is the command below the right one to format it? Is there any risk of data loss while formatting? Is there any way to avoid formatting and still change the cache path?

2013-02-28 05:57:50,902 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock acquired by nodename 81133@OPERA-MAST1.ny.os.local
2013-02-28 05:57:50,904 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2013-02-28 05:57:50,904 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2013-02-28 05:57:50,904 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2013-02-28 05:57:50,905 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: NameNode is not formatted.

Command to format the NN:

sudo -u hdfs hdfs namenode -format

Thanks,

On Thu, Feb 28, 2013 at 3:47 PM, Mohit Vadhera wrote:
> After creating the directory and setting permissions I tried to restart the
> services, and I get the error "/mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock
> acquired by nodename 7275@OPERA-MAST1.ny.os.local", and the services do not
> start.
>
> A few relevant lines from the logs:
> ===================================
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util: Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util: Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,906 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of redundant storage directories!
> 2013-02-28 05:06:24,906 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of dataloss due to lack of redundant storage directories!
>
> ************************************************************/
> 2013-02-28 05:06:23,385 WARN org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-namenode.properties,hadoop-metrics2.properties
> 2013-02-28 05:06:23,556 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
> 2013-02-28 05:06:23,556 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util: Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util: Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,906 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of redundant storage directories!
> 2013-02-28 05:06:24,906 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured.
> Beware of dataloss due to lack of redundant storage directories!
> 2013-02-28 05:06:25,618 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
> 2013-02-28 05:06:25,623 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
> 2013-02-28 05:06:26,015 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
> 2013-02-28 05:06:26,015 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication         = 1
> 2013-02-28 05:06:26,015 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication             = 512
> 2013-02-28 05:06:26,015 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication             = 1
> 2013-02-28 05:06:26,015 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams      = 2
> 2013-02-28 05:06:26,016 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
> 2013-02-28 05:06:26,016 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
> 2013-02-28 05:06:26,016 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer        = false
> 2013-02-28 05:06:26,022 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner             = hdfs (auth:SIMPLE)
> 2013-02-28 05:06:26,022 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup          = hadmin
> 2013-02-28 05:06:26,022 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
> 2013-02-28 05:06:26,023 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
> 2013-02-28 05:06:26,026 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
> 2013-02-28 05:06:26,359 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
> 2013-02-28 05:06:26,361 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
> 2013-02-28 05:06:26,361 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
> 2013-02-28 05:06:26,361 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension     = 0
> 2013-02-28 05:06:26,378 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock acquired by nodename 7275@OPERA-MAST1.ny.os.local
> 2013-02-28 05:06:26,381 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
> 2013-02-28 05:06:26,381 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
> 2013-02-28 05:06:26,381 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
> 2013-02-28 05:06:26,382 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
> java.io.IOException: NameNode is not formatted.
> 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:211)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> 2013-02-28 05:06:26,385 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
> 2013-02-28 05:06:26,394 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at localtest/192.168.1.3
>
>
> On Thu, Feb 28, 2013 at 3:18 PM, Mohit Vadhera <project.linux.proj@gmail.com> wrote:
>
>> Thanks Harsh, /mnt/san1/hdfs/cache/hdfs/dfs/name is not being created.
>> If I compare with the older path, the permissions are the same on the
>> parent directories.
>> Do I need to create this directory manually and set the permissions?
>>
>> Older path:
>>
>> # ll /var/lib/hadoop-hdfs/cache/hdfs/
>> total 4
>> drwxr-xr-x. 5 hdfs hdfs 4096 Dec 27 11:34 dfs
>>
>> # ll /var/lib/hadoop-hdfs/cache/hdfs/dfs/
>> total 12
>> drwx------. 3 hdfs hdfs 4096 Dec 19 02:37 data
>> drwxr-xr-x. 3 hdfs hdfs 4096 Feb 28 02:36 name
>> drwxr-xr-x. 3 hdfs hdfs 4096 Feb 28 02:36 namesecondary
>>
>> New path:
>>
>> # ll /mnt/san1/hdfs/cache/hdfs/
>> total 4
>> drwxr-xr-x 3 hdfs hdfs 4096 Feb 28 02:08 dfs
>>
>> # ll /mnt/san1/hdfs/cache/hdfs/dfs/
>> total 4
>> drwxr-xr-x 2 hdfs hdfs 4096 Feb 28 02:36 namesecondary
>>
>> Thanks,
>>
>> On Thu, Feb 28, 2013 at 1:59 PM, Harsh J wrote:
>>
>>> Hi,
>>>
>>> The exact error is displayed in your log and should be somewhat self
>>> explanatory:
>>>
>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>> Directory /mnt/san1/hdfs/cache/hdfs/dfs/name is in an inconsistent
>>> state: storage directory does not exist or is not accessible.
>>>
>>> Please check this one's availability and permissions (the NN user should
>>> be able to access it).
>>>
>>> On Thu, Feb 28, 2013 at 1:46 PM, Mohit Vadhera
>>> wrote:
>>> > Please find below the logs for the namenode service shutting down. Can
>>> > anybody check this?
>>> >
>>> > 2013-02-28 02:07:51,752 WARN org.apache.hadoop.hdfs.server.common.Util: Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
>>> > 2013-02-28 02:07:51,754 WARN org.apache.hadoop.hdfs.server.common.Util: Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files. Please update hdfs configuration.
>>> > 2013-02-28 02:07:51,754 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage directory (dfs.namenode.name.dir) configured. Beware of dataloss due to lack of redundant storage directories!
>>> > 2013-02-28 02:07:51,754 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace edits storage directory (dfs.namenode.edits.dir) configured. Beware of dataloss due to lack of redundant storage directories!
>>> > 2013-02-28 02:07:51,884 INFO org.apache.hadoop.util.HostsFileReader: Refreshing hosts (include/exclude) list
>>> > 2013-02-28 02:07:51,890 INFO org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
>>> > 2013-02-28 02:07:51,909 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: dfs.block.access.token.enable=false
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: defaultReplication         = 1
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication             = 512
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication             = 1
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplicationStreams      = 2
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: replicationRecheckInterval = 3000
>>> > 2013-02-28 02:07:51,910 INFO org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: encryptDataTransfer        = false
>>> > 2013-02-28 02:07:51,920 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner             = hdfs (auth:SIMPLE)
>>> > 2013-02-28 02:07:51,920 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup          = hadmin
>>> > 2013-02-28 02:07:51,920 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled = true
>>> > 2013-02-28 02:07:51,920 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
>>> > 2013-02-28 02:07:51,925 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
>>> > 2013-02-28 02:07:52,462 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
>>> > 2013-02-28 02:07:52,466 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
>>> > 2013-02-28 02:07:52,467 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
>>> > 2013-02-28 02:07:52,467 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.namenode.safemode.extension     = 0
>>> > 2013-02-28 02:07:52,469 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /mnt/san1/hdfs/cache/hdfs/dfs/name does not exist.
>>> > 2013-02-28 02:07:52,471 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
>>> > 2013-02-28 02:07:52,472 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
>>> > 2013-02-28 02:07:52,473 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
>>> > 2013-02-28 02:07:52,473 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
>>> > org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory
>>> > /mnt/san1/hdfs/cache/hdfs/dfs/name is in an inconsistent state: storage
>>> > directory does not exist or is not accessible.
>>> > 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:295)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:201)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
>>> > 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
>>> > 2013-02-28 02:08:48,908 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
>>> > 2013-02-28 02:08:48,913 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>>> > /************************************************************
>>> > SHUTDOWN_MSG: Shutting down NameNode at OPERA-MAST1.ny.os.local/192.168.1.3
>>> >
>>> >
>>> > On Thu, Feb 28, 2013 at 1:27 PM, Mohit Vadhera
>>> > wrote:
>>> >>
>>> >> Hi Guys,
>>> >>
>>> >> I have space on another partition. Can I change the path for the cache
>>> >> files to that partition? I have the properties below. Would that
>>> >> resolve the issue? If I change the paths to other directories and
>>> >> restart the services, I get the error below while starting the
>>> >> namenode service. I haven't found anything in the logs so far. Can you
>>> >> please suggest something?
>>> >>
>>> >> <property>
>>> >>   <name>hadoop.tmp.dir</name>
>>> >>   <value>/var/lib/hadoop-hdfs/cache/${user.name}</value>
>>> >> </property>
>>> >> <property>
>>> >>   <name>dfs.namenode.name.dir</name>
>>> >>   <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value>
>>> >> </property>
>>> >> <property>
>>> >>   <name>dfs.namenode.checkpoint.dir</name>
>>> >>   <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/namesecondary</value>
>>> >> </property>
>>> >>
>>> >> The namenode service is failing:
>>> >>
>>> >> # for service in /etc/init.d/hadoop-hdfs-* ; do sudo $service status; done
>>> >> Hadoop datanode is running                       [  OK  ]
>>> >> Hadoop namenode is dead and pid file exists      [FAILED]
>>> >> Hadoop secondarynamenode is running              [  OK  ]
>>> >>
>>> >> Thanks,
>>> >>
>>> >> On Wed, Jan 23, 2013 at 11:15 PM, Mohit Vadhera wrote:
>>> >>>
>>> >>> On Wed, Jan 23, 2013 at 10:41 PM, Harsh J wrote:
>>> >>>>
>>> >>>> http://NNHOST:50070/conf
>>> >>>
>>> >>> Harsh, I changed the value as you said and restarted the NN service.
>>> >>> To verify, I checked the http link you gave and saw the property
>>> >>> there, but on http://NNHOST:50070 I noticed a warning (WARNING :
>>> >>> There are 4 missing blocks. Please check the logs or run fsck in
>>> >>> order to identify the missing blocks.). When I clicked on that link I
>>> >>> could see the file names. Do I need to reboot the machine to run fsck
>>> >>> on the root fs, or is there a hadoop fsck command that I can run on
>>> >>> the running cluster?
>>> >>>
>>> >>> Thanks,
>>>
>>> --
>>> Harsh J
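[The repeated "Path ... should be specified as a URI" and "Only one image storage directory" warnings in the logs above both point at dfs.namenode.name.dir. A minimal hdfs-site.xml sketch that would address both, assuming the SAN mount is the desired primary location, uses a file:// URI (which is what the URI warning asks for) and, optionally, a comma-separated second directory for redundancy; the second path here is purely illustrative:]

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <!-- file:// URIs, comma-separated; each listed directory holds a full
       copy of the namespace image, giving the redundancy the WARN asks for -->
  <value>file:///mnt/san1/hdfs/cache/hdfs/dfs/name,file:///var/lib/hadoop-hdfs/cache/hdfs/dfs/name</value>
</property>
```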
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNam= esystem.java:386)
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNod= e.java:398)
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.ja= va:432)
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.= java:608)
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.= java:589)
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNod= e.java:1140)
> =A0 =A0 =A0 =A0 at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:120= 4)
> 2013-02-28 02:08:48,908 INFO org.apache.hadoop.util.ExitUtil: Exiting = with
> status 1
> 2013-02-28 02:08:48,913 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at OPERA-MAST1.ny.os.local/192.168.1.3
>
>
> On Thu, Feb 28, 2013 at 1:27 PM, Mohit Vadhera
> <project.linux.proj@gmail.com> wrote:
>>
>> Hi Guys,
>>
>> I have space on another partition. Can I change the path for the cache
>> files to that partition? I have the properties below. Would that resolve
>> the issue? When I change the path to other directories and restart the
>> services, the namenode service fails to start, and I haven't found
>> anything in the logs so far. Can you please suggest something?
>>
>>   <property>
>>     <name>hadoop.tmp.dir</name>
>>     <value>/var/lib/hadoop-hdfs/cache/${user.name}</value>
>>   </property>
>>   <property>
>>     <name>dfs.namenode.name.dir</name>
>>     <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value>
>>   </property>
>>   <property>
>>     <name>dfs.namenode.checkpoint.dir</name>
>>     <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/namesecondary</value>
>>   </property>
>>   <property>
>>
>>
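A side note on the properties above: the Util warning quoted earlier in this thread ("Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in configuration files") suggests writing the new location in file:// URI form. A possible hdfs-site.xml fragment for the new partition (paths assumed from the logs in this thread; adjust to your layout) would be:

```xml
<!-- Sketch only: directory names assumed from the paths in this thread. -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///mnt/san1/hdfs/cache/hdfs/dfs/name</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.dir</name>
  <value>file:///mnt/san1/hdfs/cache/hdfs/dfs/namesecondary</value>
</property>
```

The directory must exist and be owned by the hdfs user before the NameNode is restarted, otherwise the "storage directory does not exist or is not accessible" error above will recur.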
>> Service namenode is failing
>>
>> # for service in /etc/init.d/hadoop-hdfs-* ; do sudo $service status; done
>> Hadoop datanode is running                                   [  OK  ]
>> Hadoop namenode is dead and pid file exists                  [FAILED]
>> Hadoop secondarynamenode is running                          [  OK  ]
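On the question of avoiding a format: since the old name directory under /var/lib/hadoop-hdfs still holds the filesystem image, one approach is to copy it to the new partition rather than reformatting. A sketch (paths assumed from this thread; stop the NameNode first and run with root privileges):

```shell
# Sketch: migrate existing NameNode metadata to the new partition
# instead of formatting. Paths are taken from this thread; adjust them.
sudo service hadoop-hdfs-namenode stop
sudo mkdir -p /mnt/san1/hdfs/cache/hdfs/dfs
# -a preserves ownership, permissions, and timestamps of the metadata
sudo cp -a /var/lib/hadoop-hdfs/cache/hdfs/dfs/name /mnt/san1/hdfs/cache/hdfs/dfs/
sudo chown -R hdfs:hdfs /mnt/san1/hdfs/cache
sudo service hadoop-hdfs-namenode start
```

Formatting (`hdfs namenode -format`) wipes the namespace, so every existing block on the datanodes becomes unreferenced; copying the intact name directory keeps the metadata and avoids that loss.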
>>
>> Thanks,
>>
>>
>>
>> On Wed, Jan 23, 2013 at 11:15 PM, Mohit Vadhera
>> <project.linux.proj@gmail.com> wrote:
>>>
>>>
>>> On Wed, Jan 23, 2013 at 10:41 PM, Harsh J <harsh@cloudera.com> wrote:
>>>> http://NNHOST:50070/conf
>>>
>>>
>>>
>>> Harsh, I changed the value as you said and restarted the NameNode. To
>>> verify, I checked the HTTP link you gave and saw the property there, but
>>> on http://NNHOST:50070 I noticed a warning (WARNING: There are 4 missing
>>> blocks. Please check the logs or run fsck in order to identify the
>>> missing blocks.). When I click on that link I can see the file names. Do
>>> I need to reboot the machine to run fsck on the root filesystem, or is
>>> there a Hadoop fsck command I can run on the running cluster?
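For reference on the fsck question: the warning refers to HDFS's own fsck, which runs against the live namespace over RPC, so no reboot and no OS-level fsck is needed. A sketch using the stock hdfs CLI (run as the hdfs superuser):

```shell
# HDFS fsck inspects the live namespace; it does not touch local disks.
sudo -u hdfs hdfs fsck / -list-corruptfileblocks    # list files with corrupt/missing blocks
sudo -u hdfs hdfs fsck / -files -blocks -locations  # detailed per-file block report
```

Unlike a filesystem fsck, this is read-only by default and safe to run while the cluster is serving traffic.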
>>>
>>> Thanks,
>>>
>>
>



--
Harsh J


