hadoop-user mailing list archives

From Mohit Vadhera <project.linux.p...@gmail.com>
Subject Re: NameNode low on available disk space
Date Thu, 28 Feb 2013 10:17:08 GMT
After creating the directory and setting permissions, I tried to restart the
services, but I get the error "/mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock
acquired by nodename 7275@OPERA-MAST1.ny.os.local" and the services do not
start.
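The in_use.lock message usually means another NameNode process already holds
that storage directory, or a previous process died without releasing the
lock. A quick check, using the PID and path from the error above (adjust to
your own output):

    # Is the process named in the lock still alive?
    ps -p 7275 -o pid,user,cmd

    # If not, the lock is stale and can be removed before retrying
    # the restart:
    rm /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock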

A few lines from the logs below need checking.
===================================
2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
configuration files. Please update hdfs configuration.
2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
configuration files. Please update hdfs configuration.
2013-02-28 05:06:24,906 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage
directory (dfs.namenode.name.dir) configured. Beware of dataloss due to
lack of redundant storage directories!
2013-02-28 05:06:24,906 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace
edits storage directory (dfs.namenode.edits.dir) configured. Beware of
dataloss due to lack of redundant storage directories!
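As an aside, the repeated Util warning goes away once the directory is
written as a file:// URI in hdfs-site.xml, e.g. (a sketch using the same
path):

      <property>
         <name>dfs.namenode.name.dir</name>
         <value>file:///mnt/san1/hdfs/cache/hdfs/dfs/name</value>
      </property>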


************************************************************/
2013-02-28 05:06:23,385 WARN org.apache.hadoop.metrics2.impl.MetricsConfig:
Cannot locate configuration: tried
hadoop-metrics2-namenode.properties,hadoop-metrics2.properties
2013-02-28 05:06:23,556 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot
period at 10 second(s).
2013-02-28 05:06:23,556 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
started
2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
configuration files. Please update hdfs configuration.
2013-02-28 05:06:24,905 WARN org.apache.hadoop.hdfs.server.common.Util:
Path /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
configuration files. Please update hdfs configuration.
2013-02-28 05:06:24,906 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image storage
directory (dfs.namenode.name.dir) configured. Beware of dataloss due to
lack of redundant storage directories!
2013-02-28 05:06:24,906 WARN
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace
edits storage directory (dfs.namenode.edits.dir) configured. Beware of
dataloss due to lack of redundant storage directories!
2013-02-28 05:06:25,618 INFO org.apache.hadoop.util.HostsFileReader:
Refreshing hosts (include/exclude) list
2013-02-28 05:06:25,623 INFO
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
dfs.block.invalidate.limit=1000
2013-02-28 05:06:26,015 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
dfs.block.access.token.enable=false
2013-02-28 05:06:26,015 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
defaultReplication         = 1
2013-02-28 05:06:26,015 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: maxReplication
            = 512
2013-02-28 05:06:26,015 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: minReplication
            = 1
2013-02-28 05:06:26,015 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
maxReplicationStreams      = 2
2013-02-28 05:06:26,016 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
shouldCheckForEnoughRacks  = false
2013-02-28 05:06:26,016 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
replicationRecheckInterval = 3000
2013-02-28 05:06:26,016 INFO
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
encryptDataTransfer        = false
2013-02-28 05:06:26,022 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner             =
hdfs (auth:SIMPLE)
2013-02-28 05:06:26,022 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup          =
hadmin
2013-02-28 05:06:26,022 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled =
true
2013-02-28 05:06:26,023 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
2013-02-28 05:06:26,026 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled: true
2013-02-28 05:06:26,359 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
occuring more than 10 times
2013-02-28 05:06:26,361 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2013-02-28 05:06:26,361 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.min.datanodes = 0
2013-02-28 05:06:26,361 INFO
org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
dfs.namenode.safemode.extension     = 0
2013-02-28 05:06:26,378 INFO org.apache.hadoop.hdfs.server.common.Storage:
Lock on /mnt/san1/hdfs/cache/hdfs/dfs/name/in_use.lock acquired by nodename
7275@OPERA-MAST1.ny.os.local
2013-02-28 05:06:26,381 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode
metrics system...
2013-02-28 05:06:26,381 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
stopped.
2013-02-28 05:06:26,381 INFO
org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system
shutdown complete.
2013-02-28 05:06:26,382 FATAL
org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
java.io.IOException: NameNode is not formatted.
        at
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:211)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
2013-02-28 05:06:26,385 INFO org.apache.hadoop.util.ExitUtil: Exiting with
status 1
2013-02-28 05:06:26,394 INFO
org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localtest/192.168.1.3
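The fatal error here is "NameNode is not formatted": the new name directory
exists but holds no fsimage yet. If the goal is to keep the existing HDFS
data, copy the old name directory instead of formatting. A sketch, assuming
the paths from this thread and that all HDFS services are stopped:

    # Option 1: preserve the existing namespace by copying the old
    # name directory into the new location:
    cp -a /var/lib/hadoop-hdfs/cache/hdfs/dfs/name /mnt/san1/hdfs/cache/hdfs/dfs/
    chown -R hdfs:hdfs /mnt/san1/hdfs/cache/hdfs/dfs/name

    # Option 2: for a cluster with no data worth keeping, format the
    # empty directory (this erases the namespace):
    sudo -u hdfs hdfs namenode -format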



On Thu, Feb 28, 2013 at 3:18 PM, Mohit Vadhera <project.linux.proj@gmail.com> wrote:

> Thanks Harsh, /mnt/san1/hdfs/cache/hdfs/dfs/name is not being created.
> If I compare with the older path, the permissions on the parent
> directories are the same.
> Do I need to create this directory manually and set the permissions?
>
> Older Path
>
> # ll /var/lib/hadoop-hdfs/cache/hdfs/
> total 4
> drwxr-xr-x. 5 hdfs hdfs 4096 Dec 27 11:34 dfs
>
> # ll /var/lib/hadoop-hdfs/cache/hdfs/dfs/
> total 12
> drwx------. 3 hdfs hdfs 4096 Dec 19 02:37 data
> drwxr-xr-x. 3 hdfs hdfs 4096 Feb 28 02:36 name
> drwxr-xr-x. 3 hdfs hdfs 4096 Feb 28 02:36 namesecondary
>
>
> New Path
>
> # ll /mnt/san1/hdfs/cache/hdfs/
> total 4
> drwxr-xr-x 3 hdfs hdfs 4096 Feb 28 02:08 dfs
>
>
> # ll /mnt/san1/hdfs/cache/hdfs/dfs/
> total 4
> drwxr-xr-x 2 hdfs hdfs 4096 Feb 28 02:36 namesecondary
>
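Comparing the two listings, the new path is missing the name directory
entirely. One way to create it by hand, mirroring the ownership and mode of
the old working path shown above:

    mkdir -p /mnt/san1/hdfs/cache/hdfs/dfs/name
    chown -R hdfs:hdfs /mnt/san1/hdfs/cache/hdfs
    chmod 755 /mnt/san1/hdfs/cache/hdfs/dfs/name

An empty directory is still unformatted, though -- see the copy/format
sketch earlier in the thread.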
>
> Thanks,
>
>
>
> On Thu, Feb 28, 2013 at 1:59 PM, Harsh J <harsh@cloudera.com> wrote:
>
>> Hi,
>>
>> The exact error is displayed in your log and should be somewhat
>> self-explanatory:
>>
>> org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>> Directory /mnt/san1/hdfs/cache/hdfs/dfs/name is in an inconsistent
>> state: storage directory does not exist or is not accessible.
>>
>> Please check this directory's availability and permissions (the NN user
>> should be able to access it).
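A quick way to confirm that, for example:

    ls -ld /mnt/san1/hdfs/cache/hdfs/dfs/name
    sudo -u hdfs test -w /mnt/san1/hdfs/cache/hdfs/dfs/name && echo writable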
>>
>> On Thu, Feb 28, 2013 at 1:46 PM, Mohit Vadhera
>> <project.linux.proj@gmail.com> wrote:
>> > Please find below the logs from the namenode service shutting down. Can
>> > anybody check this?
>> >
>> > 2013-02-28 02:07:51,752 WARN org.apache.hadoop.hdfs.server.common.Util:
>> Path
>> > /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
>> > configuration files. Please update hdfs configuration.
>> > 2013-02-28 02:07:51,754 WARN org.apache.hadoop.hdfs.server.common.Util:
>> Path
>> > /mnt/san1/hdfs/cache/hdfs/dfs/name should be specified as a URI in
>> > configuration files. Please update hdfs configuration.
>> > 2013-02-28 02:07:51,754 WARN
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one image
>> storage
>> > directory (dfs.namenode.name.dir) configured. Beware of dataloss due to
>> lack
>> > of redundant storage directories!
>> > 2013-02-28 02:07:51,754 WARN
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Only one namespace
>> > edits storage directory (dfs.namenode.edits.dir) configured. Beware of
>> > dataloss due to lack of redundant storage directories!
>> > 2013-02-28 02:07:51,884 INFO org.apache.hadoop.util.HostsFileReader:
>> > Refreshing hosts (include/exclude) list
>> > 2013-02-28 02:07:51,890 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager:
>> > dfs.block.invalidate.limit=1000
>> > 2013-02-28 02:07:51,909 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> > dfs.block.access.token.enable=false
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> > defaultReplication         = 1
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> maxReplication
>> > = 512
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> minReplication
>> > = 1
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> > maxReplicationStreams      = 2
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> > shouldCheckForEnoughRacks  = false
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> > replicationRecheckInterval = 3000
>> > 2013-02-28 02:07:51,910 INFO
>> > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager:
>> > encryptDataTransfer        = false
>> > 2013-02-28 02:07:51,920 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner
>>   =
>> > hdfs (auth:SIMPLE)
>> > 2013-02-28 02:07:51,920 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup
>>  =
>> > hadmin
>> > 2013-02-28 02:07:51,920 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> isPermissionEnabled =
>> > true
>> > 2013-02-28 02:07:51,920 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: HA Enabled: false
>> > 2013-02-28 02:07:51,925 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Append Enabled:
>> true
>> > 2013-02-28 02:07:52,462 INFO
>> > org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names
>> occuring
>> > more than 10 times
>> > 2013-02-28 02:07:52,466 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> > dfs.namenode.safemode.threshold-pct = 0.9990000128746033
>> > 2013-02-28 02:07:52,467 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> > dfs.namenode.safemode.min.datanodes = 0
>> > 2013-02-28 02:07:52,467 INFO
>> > org.apache.hadoop.hdfs.server.namenode.FSNamesystem:
>> > dfs.namenode.safemode.extension     = 0
>> > 2013-02-28 02:07:52,469 INFO
>> org.apache.hadoop.hdfs.server.common.Storage:
>> > Storage directory /mnt/san1/hdfs/cache/hdfs/dfs/name does not exist.
>> > 2013-02-28 02:07:52,471 INFO
>> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode
>> metrics
>> > system...
>> > 2013-02-28 02:07:52,472 INFO
>> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics
>> system
>> > stopped.
>> > 2013-02-28 02:07:52,473 INFO
>> > org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics
>> system
>> > shutdown complete.
>> > 2013-02-28 02:07:52,473 FATAL
>> > org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode
>> join
>> > org.apache.hadoop.hdfs.server.common.InconsistentFSStateException:
>> Directory
>> > /mnt/san1/hdfs/cache/hdfs/dfs/name is in an inconsistent state: storage
>> > directory does not exist or is not accessible.
>> >        at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:295)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:201)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:534)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:424)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:386)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:398)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:432)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
>> >         at
>> >
>> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
>> >         at
>> > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
>> > 2013-02-28 02:08:48,908 INFO org.apache.hadoop.util.ExitUtil: Exiting
>> with
>> > status 1
>> > 2013-02-28 02:08:48,913 INFO
>> > org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
>> > /************************************************************
>> > SHUTDOWN_MSG: Shutting down NameNode at OPERA-MAST1.ny.os.local/192.168.1.3
>> >
>> >
>> > On Thu, Feb 28, 2013 at 1:27 PM, Mohit Vadhera
>> > <project.linux.proj@gmail.com> wrote:
>> >>
>> >> Hi Guys,
>> >>
>> >> I have space on another partition. Can I change the path for the cache
>> >> files to that partition? I have the properties below. Would that resolve
>> >> the issue? When I change the paths to other directories and restart the
>> >> services, I get the error below while starting the namenode service. I
>> >> haven't found anything in the logs so far. Can you please suggest
>> >> something?
>> >>
>> >>   <property>
>> >>      <name>hadoop.tmp.dir</name>
>> >>      <value>/var/lib/hadoop-hdfs/cache/${user.name}</value>
>> >>   </property>
>> >>   <property>
>> >>      <name>dfs.namenode.name.dir</name>
>> >>      <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/name</value>
>> >>   </property>
>> >>   <property>
>> >>      <name>dfs.namenode.checkpoint.dir</name>
>> >>      <value>/var/lib/hadoop-hdfs/cache/${user.name}/dfs/namesecondary</value>
>> >>   </property>
>> >>   <property>
>> >>
>> >>
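For reference, pointing these at the new partition would look like the
following (a sketch based on the values above; only the base path changes):

      <property>
         <name>hadoop.tmp.dir</name>
         <value>/mnt/san1/hdfs/cache/${user.name}</value>
      </property>
      <property>
         <name>dfs.namenode.name.dir</name>
         <value>/mnt/san1/hdfs/cache/${user.name}/dfs/name</value>
      </property>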
>> >> Service namenode is failing
>> >>
>> >> # for service in /etc/init.d/hadoop-hdfs-* ; do sudo $service status; done
>> >> Hadoop datanode is running                                 [  OK  ]
>> >> Hadoop namenode is dead and pid file exists                [FAILED]
>> >> Hadoop secondarynamenode is running                        [  OK  ]
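"namenode is dead and pid file exists" by itself says little; the actual
reason is normally in the NameNode log. With stock packaging the log usually
lives somewhere like the path below (an assumption -- it varies by
distribution):

    tail -n 100 /var/log/hadoop-hdfs/hadoop-hdfs-namenode-$(hostname).log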
>> >>
>> >> Thanks,
>> >>
>> >>
>> >>
>> >> On Wed, Jan 23, 2013 at 11:15 PM, Mohit Vadhera
>> >> <project.linux.proj@gmail.com> wrote:
>> >>>
>> >>>
>> >>> On Wed, Jan 23, 2013 at 10:41 PM, Harsh J <harsh@cloudera.com> wrote:
>> >>>>
>> >>>> http://NNHOST:50070/conf
>> >>>
>> >>>
>> >>>
>> >>> Harsh, I changed the value as you said and restarted the NN service. To
>> >>> verify, I checked the http link you gave and saw the property there, but
>> >>> on http://NNHOST:50070 I noticed a warning (WARNING: There are 4 missing
>> >>> blocks. Please check the logs or run fsck in order to identify the
>> >>> missing blocks.). When I click on this link I can see the file names. Do
>> >>> I need to reboot the machine to run fsck on the root fs, or is there a
>> >>> hadoop fsck command that I can run on the running hadoop?
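No reboot is needed: hadoop fsck inspects HDFS over RPC against the running
NameNode, not the local root filesystem. For example:

    # report overall health, then list files with missing/corrupt blocks
    sudo -u hdfs hdfs fsck /
    sudo -u hdfs hdfs fsck / -list-corruptfileblocks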
>> >>>
>> >>> Thanks,
>> >>>
>> >>
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
