hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: HDFS Namenode Format Question.
Date Sat, 15 Sep 2012 02:08:46 GMT
Jason,

So far you've made sure your HDFS data is stored in a persistent
location where it won't get wiped. That much is sufficient for going
ahead with running HBase.

For the rest of the files that are going to /tmp, you will need to
tweak the "hadoop.tmp.dir" config so they land elsewhere, and also
add a -Djava.io.tmpdir=$HOME/tmp to HADOOP_OPTS in hadoop-env.sh to
move the JVM's temporary-file requests off to a new path under your
$HOME.
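For example, a core-site.xml entry along these lines would redirect the bulk of those files (the exact path under $HOME is illustrative; any persistent directory you own works):

```xml
<!-- conf/core-site.xml: illustrative path; pick any persistent directory you own -->
<property>
   <name>hadoop.tmp.dir</name>
   <value>/Users/jasonhuang/tmp/hadoop</value>
</property>
```

and the matching hadoop-env.sh tweak is a line such as
`export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=$HOME/tmp"`.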

However, doing this is not absolutely necessary. For HDFS, all that
really matters is the name and data directories, which you have
already moved to a persistent zone.
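To tie this back to the earlier fix, on 1.0.3 the hdfs-site.xml entries would look something like the following, using the 1.x property names together with the ${user.home} substitution:

```xml
<!-- hdfs-site.xml for Hadoop 1.x releases;
     2.x renames these to dfs.namenode.name.dir / dfs.datanode.data.dir -->
<property>
   <name>dfs.name.dir</name>
   <value>${user.home}/hdfs/name</value>
</property>
<property>
   <name>dfs.data.dir</name>
   <value>${user.home}/hdfs/data</value>
</property>
```

Note that after changing dfs.name.dir you'd need to re-run `bin/hadoop namenode -format` so the image is written to the new location.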

On Sat, Sep 15, 2012 at 12:33 AM, Jason Huang <jason.huang@icare.com> wrote:
> Thanks.
>
> This makes sense - checking hdfs-default.xml, I found the
> corresponding properties named dfs.name.dir and dfs.data.dir.
>
> Now I am no longer formatting the default tmp folders taken from
> hdfs-default.xml.
>
> However, after formatting the name node, hadoop automatically created
> another folder:
> /tmp/hsperfdata_jasonhuang
>
> Does anyone know what that directory is for?
>
> And after I started hadoop (running ./start-all.sh), another folder
> /tmp/hadoop-jasonhuang was created, together with a few files:
> /tmp/hadoop-jasonhuang-datanode.pid
> /tmp/hadoop-jasonhuang-jobtracker.pid
> /tmp/hadoop-jasonhuang-namenode.pid
> /tmp/hadoop-jasonhuang-secondarynamenode.pid
> /tmp/hadoop-jasonhuang-tasktracker.pid
>
> Are those files generated at the correct location?
>
> I've looked at the logs for both name node and master node and there
> seemed to be no error. However, I am not sure if these files are
> generated at the correct place or not. I am installing HBase on top of
> this and want to make sure Hadoop is working correctly before going
> further.
>
> thanks!
>
> Jason
>
> On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <harsh@cloudera.com> wrote:
>> If you are using 1.0.3, then the config names are wrong. You need
>> dfs.name.dir and dfs.data.dir instead. Those configs you have are for
>> 2.x based releases.
>>
>> Also, I'd make that look like ${user.home}/hdfs/name, etc. for a slightly more
>> portable/templatey config :)
>>
>> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <jason.huang@icare.com> wrote:
>>> Hello,
>>>
>>> I am trying to set up Hadoop 1.0.3 in my Macbook Pro in a
>>> pseudo-distributed mode.
>>>
>>> After download / install / setup config files I ran the following
>>> namenode format command as suggested in the user guide:
>>>
>>> $bin/hadoop namenode -format
>>>
>>> Here is the output:
>>> ************************************************************/
>>> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
>>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>>> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
>>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
>>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>>> accessTokenLifetime=0 min(s)
>>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
>>> more than 10 times
>>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
>>> 0 seconds.
>>> 12/09/14 10:46:42 INFO common.Storage: Storage directory
>>> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>>> /************************************************************
>>>
>>> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>>>
>>> However, in my config file I've assigned a different directory (see
>>> hdfs-site.xml below):
>>> <configuration>
>>>   <property>
>>>      <name>dfs.replication</name>
>>>      <value>1</value>
>>>   </property>
>>>   <property>
>>>      <name>dfs.namenode.name.dir</name>
>>>      <value>/Users/jasonhuang/hdfs/name</value>
>>>   </property>
>>>   <property>
>>>      <name>dfs.datanode.data.dir</name>
>>>      <value>/Users/jasonhuang/hdfs/data</value>
>>>   </property>
>>> </configuration>
>>>
>>> Does anyone know why the hdfs-site.xml might not be respected?
>>>
>>> Also, after formatting the name node, I did a search for the fsimage
>>> file in my local file directories (from root dir) and here is what I
>>> found:
>>> $ sudo find / -name fsimage
>>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>>
>>> I don't understand why the name node format picked (and created) these
>>> two directories...
>>>
>>> Any thoughts?
>>>
>>> Thanks!
>>>
>>> Jason
>>
>>
>>
>> --
>> Harsh J



-- 
Harsh J
