hadoop-user mailing list archives

From Mohit Vadhera <project.linux.p...@gmail.com>
Subject Re: NameNode low on available disk space
Date Thu, 28 Feb 2013 10:58:13 GMT
Even after I created the file /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock
and set its permissions, the file is removed when I restart the Hadoop
services, and I see the logs below.

Do I need to format the NN?
Is the command below the right way to format the NN?
Is there any risk of data loss while formatting?
Is there a way to avoid formatting and just change the cache path?

2013-02-28 05:57:50,902 INFO org.apache.hadoop.hdfs.server.common.Storage:
Lock on /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock acquired by nodename
81133@OPERA-MAST1.ny.os.local
2013-02-28 05:57:50,904 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode
metrics system...
2013-02-28 05:57:50,904 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
stopped.
2013-02-28 05:57:50,904 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
shutdown complete.
2013-02-28 05:57:50,905 FATAL
org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: NameNode is not formatted.

Command to format the NN.

sudo -u hdfs hdfs namenode -format
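Formatting discards whatever NameNode metadata is in the configured name directory, so it is only safe on a genuinely new, empty directory. As a hedged sketch (paths taken from the thread; run on the NameNode host with the Hadoop client installed), one way to check before formatting:

```shell
# Inspect the configured name directory first. An already-initialized
# NameNode dir contains a "current/" subdirectory holding fsimage and
# edits files; formatting such a dir would discard all HDFS metadata.
ls -l /mnt/san1/hdfs/cache/hdfs/dfs/name

# Only if the directory is new/empty (nothing to lose), format it:
sudo -u hdfs hdfs namenode -format
```

Since this directory was just created and has never held an fsimage, formatting it should not lose existing metadata, but the old path's contents are a separate question.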

Thanks,


On Thu, Feb 28, 2013 at 3:47 PM, Mohit Vadhera <project.linux.proj@gmail.com
> wrote:

> After creating the directory and setting permissions, I tried to restart
> the services, but I get the error "/mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock
> acquired by nodename 7275@OPERA-MAST1.ny.os.local" and the services do not
> start.
>
> A few lines to check from the logs below:
> ===================================
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
> Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
> configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
> Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
> configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,906 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage
> directory (dfs.namenode.name.dir) configured. Beware of dataloss due to
> lack of redundant storage directories!
> 2013-02-28 05:06:24,906 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace
> edits storage directory (dfs.namenode.edits.dir) configured. Beware of
> dataloss due to lack of redundant storage directories!
>
>
> ************************************************************/
> 2013-02-28 05:06:23,385 WARN
> org.apache.hadoop.metrics2.impl.MetricsConfig: Cannot locate configuration:
> tried hadoop-metrics2-namenode.properties,hadoop-metrics2.properties
> 2013-02-28 05:06:23,556 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
> period at 10 second(s).
> 2013-02-28 05:06:23,556 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
> started
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
> Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
> configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
> Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
> configuration files. Please update hdfs configuration.
> 2013-02-28 05:06:24,906 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage
> directory (dfs.namenode.name.dir) configured. Beware of dataloss due to
> lack of redundant storage directories!
> 2013-02-28 05:06:24,906 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace
> edits storage directory (dfs.namenode.edits.dir) configured. Beware of
> dataloss due to lack of redundant storage directories!
> 2013-02-28 05:06:25,618 INFO org.apache.hadoop.util.HostsFileReader:
> Refreshing hosts (include/exclude) list
> 2013-02-28 05:06:25,623 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
> dfs.block.invalidate.limit=1000
> 2013-02-28 05:06:26,015 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> dfs.block.access.token.enable=false
> 2013-02-28 05:06:26,015 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> defaultReplication         = 1
> 2013-02-28 05:06:26,015 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication
>             = 512
> 2013-02-28 05:06:26,015 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication
>             = 1
> 2013-02-28 05:06:26,015 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> maxReplicationStreams      = 2
> 2013-02-28 05:06:26,016 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> shouldCheckForEnoughRacks  = false
> 2013-02-28 05:06:26,016 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> replicationRecheckInterval = 3000
> 2013-02-28 05:06:26,016 INFO
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
> encryptDataTransfer        = false
> 2013-02-28 05:06:26,022 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner             =
> hdfs (auth:SIMPLE)
> 2013-02-28 05:06:26,022 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup          =
> hadmin
> 2013-02-28 05:06:26,022 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled =
> true
> 2013-02-28 05:06:26,023 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
> 2013-02-28 05:06:26,026 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
> 2013-02-28 05:06:26,359 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
> occuring more than 10 times
> 2013-02-28 05:06:26,361 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> dfs.namenode.safemode.threshold-pct = 0.9990000128746033
> 2013-02-28 05:06:26,361 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> dfs.namenode.safemode.min.datanodes = 0
> 2013-02-28 05:06:26,361 INFO
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
> dfs.namenode.safemode.extension     = 0
> 2013-02-28 05:06:26,378 INFO org.apache.hadoop.hdfs.server.common.Storage:
> Lock on /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock acquired by nodename
> 7275@OPERA-MAST1.ny.os.local
> 2013-02-28 05:06:26,381 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode
> metrics system...
> 2013-02-28 05:06:26,381 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
> stopped.
> 2013-02-28 05:06:26,381 INFO
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
> shutdown complete.
> 2013-02-28 05:06:26,382 FATAL
> org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
> java.io.IOException: NameNode is not formatted.
>         at
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:211)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
>         at
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
>         at
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
> 2013-02-28 05:06:26,385 INFO org.apache.hadoop.util.ExitUtil: Exiting with
> status 1
> 2013-02-28 05:06:26,394 INFO
> org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at localtest/192.168.1.3
>
>
>
> On Thu, Feb 28, 2013 at 3:18 PM, Mohit Vadhera <
> project.linux.proj@gmail.com> wrote:
>
>> Thanks Harsh,  /mnt/san1/hdfs/cache/hdfs/dfs/name is not being created.
>> Comparing with the older path, the permissions on the parent directories
>> are the same. Do I need to create this directory manually and set the
>> permissions?
>>
>> Older Path
>>
>> # ll /var/lib/hadoop-hdfs/cache/hdfs/
>> total 4
>> drwxr-xr-x. 5 hdfs hdfs 4096 Dec 27 11:34 dfs
>>
>> # ll /var/lib/hadoop-hdfs/cache/hdfs/dfs/
>> total 12
>> drwx------. 3 hdfs hdfs 4096 Dec 19 02:37 data
>> drwxr-xr-x. 3 hdfs hdfs 4096 Feb 28 02:36 name
>> drwxr-xr-x. 3 hdfs hdfs 4096 Feb 28 02:36 namesecondary
>>
>>
>> New Path
>>
>> # ll /mnt/san1/hdfs/cache/hdfs/
>> total 4
>> drwxr-xr-x 3 hdfs hdfs 4096 Feb 28 02:08 dfs
>>
>>
>> # ll /mnt/san1/hdfs/cache/hdfs/dfs/
>> total 4
>> drwxr-xr-x 2 hdfs hdfs 4096 Feb 28 02:36 namesecondary
>>
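If the name directory genuinely is not being created, a minimal sketch of creating it by hand (the real-host path and hdfs:hdfs ownership come from the thread; the demo below uses a local stand-in path so it can run unprivileged):

```shell
# Stand-in for /mnt/san1/hdfs/cache/hdfs/dfs/name; on the real host you
# would run these as root and point NAME_DIR at dfs.namenode.name.dir.
NAME_DIR="${NAME_DIR:-./hdfs-demo/dfs/name}"
mkdir -p "$NAME_DIR"
chmod 755 "$NAME_DIR"   # matches the drwxr-xr-x seen on the old path
# sudo chown -R hdfs:hdfs /mnt/san1/hdfs/cache/hdfs   # real-host ownership step
ls -ld "$NAME_DIR"
```

After creating it, a freshly empty name directory still needs `hdfs namenode -format` before the NameNode can start from it.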
>>
>> Thanks,
>>
>>
>>
>> On Thu, Feb 28, 2013 at 1:59 PM, Harsh J <harsh@cloudera.com> wrote:
>>
>>> Hi,
>>>
>>> The exact error is displayed on your log and should be somewhat self
>>> explanatory:
>>>
>>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>> Directory /mnt/san1/hdfs/cache/hdfs/dfs/name is in an inconsistent
>>> state: storage directory does not exist or is not accessible.
>>>
>>> Please check this directory's availability and permissions (the NN user
>>> should be able to access it).
>>>
>>> On Thu, Feb 28, 2013 at 1:46 PM, Mohit Vadhera
>>> <project.linux.proj@gmail.com> wrote:
>>> > Please find below the logs from shutting down the namenode service.
>>> > Can anybody take a look at this?
>>> >
>>> > 2013-02-28 02:07:51,752 WARN
>>> org.apache.hadoop.hdfs.server.common.Util: Path
>>> > /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
>>> > configuration files. Please update hdfs configuration.
>>> > 2013-02-28 02:07:51,754 WARN
>>> org.apache.hadoop.hdfs.server.common.Util: Path
>>> > /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
>>> > configuration files. Please update hdfs configuration.
>>> > 2013-02-28 02:07:51,754 WARN
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image
>>> storage
>>> > directory (dfs.namenode.name.dir) configured. Beware of dataloss due
>>> to lack
>>> > of redundant storage directories!
>>> > 2013-02-28 02:07:51,754 WARN
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace
>>> > edits storage directory (dfs.namenode.edits.dir) configured. Beware of
>>> > dataloss due to lack of redundant storage directories!
>>> > 2013-02-28 02:07:51,884 INFO org.apache.hadoop.util.HostsFileReader:
>>> > Refreshing hosts (include/exclude) list
>>> > 2013-02-28 02:07:51,890 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
>>> > dfs.block.invalidate.limit=1000
>>> > 2013-02-28 02:07:51,909 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> > dfs.block.access.token.enable=false
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> > defaultReplication         = 1
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> maxReplication
>>> > = 512
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> minReplication
>>> > = 1
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> > maxReplicationStreams      = 2
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> > shouldCheckForEnoughRacks  = false
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> > replicationRecheckInterval = 3000
>>> > 2013-02-28 02:07:51,910 INFO
>>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>>> > encryptDataTransfer        = false
>>> > 2013-02-28 02:07:51,920 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner
>>>   =
>>> > hdfs (auth:SIMPLE)
>>> > 2013-02-28 02:07:51,920 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup
>>>    =
>>> > hadmin
>>> > 2013-02-28 02:07:51,920 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> isPermissionEnabled =
>>> > true
>>> > 2013-02-28 02:07:51,920 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
>>> > 2013-02-28 02:07:51,925 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled:
>>> true
>>> > 2013-02-28 02:07:52,462 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>>> occuring
>>> > more than 10 times
>>> > 2013-02-28 02:07:52,466 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> > dfs.namenode.safemode.threshold-pct = 0.9990000128746033
>>> > 2013-02-28 02:07:52,467 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> > dfs.namenode.safemode.min.datanodes = 0
>>> > 2013-02-28 02:07:52,467 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>>> > dfs.namenode.safemode.extension     = 0
>>> > 2013-02-28 02:07:52,469 INFO
>>> org.apache.hadoop.hdfs.server.common.Storage:
>>> > Storage directory /mnt/san1/hdfs/cache/hdfs/dfs/name does not exist.
>>> > 2013-02-28 02:07:52,471 INFO
>>> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode
>>> metrics
>>> > system...
>>> > 2013-02-28 02:07:52,472 INFO
>>> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics
>>> system
>>> > stopped.
>>> > 2013-02-28 02:07:52,473 INFO
>>> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics
>>> system
>>> > shutdown complete.
>>> > 2013-02-28 02:07:52,473 FATAL
>>> > org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode
>>> join
>>> > org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>>> Directory
>>> > /mnt/san1/hdfs/cache/hdfs/dfs/name is in an inconsistent state: storage
>>> > directory does not exist or is not accessible.
>>> >        at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:295)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:201)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
>>> >         at
>>> >
>>> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
>>> > 2013-02-28 02:08:48,908 INFO org.apache.hadoop.util.ExitUtil: Exiting
>>> with
>>> > status 1
>>> > 2013-02-28 02:08:48,913 INFO
>>> > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>>> > /************************************************************
>>> > SHUTDOWN_MSG: Shutting down NameNode at OPERA-MAST1.ny.os.local/
>>> 192.168.1.3
>>> >
>>> >
>>> > On Thu, Feb 28, 2013 at 1:27 PM, Mohit Vadhera
>>> > <project.linux.proj@gmail.com> wrote:
>>> >>
>>> >> Hi Guys,
>>> >>
>>> >> I have space on another partition. Can I change the path for the cache
>>> >> files to that partition? I have the properties below. Would that
>>> >> resolve the issue? When I change the paths to other directories and
>>> >> restart the services, I get the error below while starting the namenode
>>> >> service. I haven't found anything in the logs so far. Can you please
>>> >> suggest something?
>>> >>
>>> >>   <property>
>>> >>      <name>hadoop.tmp.dir</name>
>>> >>      <value>/var/lib/hadoop-hdfs/cache/${user.name}</value>
>>> >>   </property>
>>> >>   <property>
>>> >>      <name>dfs.namenode.name.dir</name>
>>> >>      <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value>
>>> >>   </property>
>>> >>   <property>
>>> >>      <name>dfs.namenode.checkpoint.dir</name>
>>> >>
>>> >> <value>/var/lib/hadoop-hdfs/cache/${user.name
>>> }/dfs/namesecondary</value>
>>> >>   </property>
>>> >>   <property>
>>> >>
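The repeated WARN in the logs ("Path ... should be specified as a URI in configuration files") suggests the relocated directory should be written in `file://` URI form. A hedged sketch of what the updated property might look like, assuming the new mount path from the thread:

```xml
<property>
   <name>dfs.namenode.name.dir</name>
   <value>file:///mnt/san1/hdfs/cache/hdfs/dfs/name</value>
</property>
```

The same URI form would apply to dfs.namenode.checkpoint.dir and any other local-path properties that trigger the warning.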
>>> >>
>>> >> Service namenode is failing
>>> >>
>>> >> # for service in /etc/init.d/hadoop-hdfs-* ; do sudo $service status;
>>> done
>>> >> Hadoop datanode is running                                 [  OK  ]
>>> >> Hadoop namenode is dead and pid file exists                [FAILED]
>>> >> Hadoop secondarynamenode is running                        [  OK  ]
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >>
>>> >> On Wed, Jan 23, 2013 at 11:15 PM, Mohit Vadhera
>>> >> <project.linux.proj@gmail.com> wrote:
>>> >>>
>>> >>>
>>> >>> On Wed, Jan 23, 2013 at 10:41 PM, Harsh J <harsh@cloudera.com>
>>> wrote:
>>> >>>>
>>> >>>> http://NNHOST:50070/conf
>>> >>>
>>> >>>
>>> >>>
>>> >>> Harsh, I changed the value as suggested and restarted the NameNode
>>> >>> service. To verify, I checked the HTTP link you gave and saw the
>>> >>> property there, but on http://NNHOST:50070 I noticed a warning
>>> >>> (WARNING : There are 4 missing blocks. Please check the logs or run
>>> >>> fsck in order to identify the missing blocks.). When I clicked that
>>> >>> link, I could see the file names. Do I need to reboot the machine to
>>> >>> run fsck on the root fs, or is there a hadoop fsck command that I can
>>> >>> run on the running cluster?
>>> >>>
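HDFS fsck runs against the live filesystem through the NameNode, so no reboot is needed and it does not touch the OS root filesystem. A hedged sketch (run from any node with the Hadoop client configured; `/path/in/hdfs` is a placeholder):

```shell
# Check the whole HDFS namespace; -list-corruptFileBlocks prints files
# whose blocks have no surviving replicas (the "missing blocks" warned about).
sudo -u hdfs hdfs fsck / -list-corruptFileBlocks

# More detail for a specific path:
sudo -u hdfs hdfs fsck /path/in/hdfs -files -blocks -locations
```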
>>> >>> Thanks,
>>> >>>
>>> >>
>>> >
>>>
>>>
>>>
>>> --
>>> Harsh J
>>>
>>
>>
>
