hadoop-user mailing list archives

From Jason Huang <jason.hu...@icare.com>
Subject Re: HDFS Namenode Format Question.
Date Sat, 15 Sep 2012 17:58:13 GMT
Thanks for the clear explanation, Harsh!

Jason

On Fri, Sep 14, 2012 at 10:08 PM, Harsh J <harsh@cloudera.com> wrote:
> Jason,
>
> So far you've made sure your HDFS data is placed somewhere safe, so it
> doesn't get wiped. This much is sufficient for going ahead with
> running HBase.
>
> For the rest of the files that are going to /tmp, you will need to
> change the "hadoop.tmp.dir" config so it points elsewhere, and also
> change HADOOP_OPTS in hadoop-env.sh to include
> -Djava.io.tmpdir=$HOME/tmp, which moves the temporary file requests to
> a new path under your $HOME.
>
> However, doing this is not absolutely necessary. For HDFS, all that
> really matters is the name and data directories, which you have
> already moved to a persistent zone.
>
> On Sat, Sep 15, 2012 at 12:33 AM, Jason Huang <jason.huang@icare.com> wrote:
>> Thanks.
>>
>> This makes sense - I checked hdfs-default.xml and found the same
>> properties, named dfs.name.dir and dfs.data.dir.
>>
>> Now I am no longer formatting the default tmp folders taken from
>> hdfs-default.xml.
>>
>> However, after formatting the name node, hadoop automatically created
>> another folder:
>> /tmp/hsperfdata_jasonhuang
>>
>> Does anyone know what that directory is for?
>>
>> And after I started hadoop (running ./start-all.sh), another folder
>> /tmp/hadoop-jasonhuang was created, together with a few files:
>> /tmp/hadoop-jasonhuang-datanode.pid
>> /tmp/hadoop-jasonhuang-jobtracker.pid
>> /tmp/hadoop-jasonhuang-namenode.pid
>> /tmp/hadoop-jasonhuang-secondarynamenode.pid
>> /tmp/hadoop-jasonhuang-tasktracker.pid
>>
>> Are those files generated at the correct location?
>>
>> I've looked at the logs for both name node and master node and there
>> seemed to be no error. However, I am not sure if these files are
>> generated at the correct place or not. I am installing HBase on top of
>> this and want to make sure Hadoop is working correctly before going
>> further.
>>
>> thanks!
>>
>> Jason
>>
>> On Fri, Sep 14, 2012 at 1:36 PM, Harsh J <harsh@cloudera.com> wrote:
>>> If you are using 1.0.3, then the config names are wrong. You need
>>> dfs.name.dir and dfs.data.dir instead. Those configs you have are for
>>> 2.x based releases.
>>>
>>> Also, I'd make that look like ${user.home}/hdfs/name, etc. for a slightly more
>>> portable/templatey config :)
>>>
>>> On Fri, Sep 14, 2012 at 8:31 PM, Jason Huang <jason.huang@icare.com> wrote:
>>>> Hello,
>>>>
>>>> I am trying to set up Hadoop 1.0.3 on my MacBook Pro in
>>>> pseudo-distributed mode.
>>>>
>>>> After downloading, installing, and setting up the config files, I ran
>>>> the following namenode format command as suggested in the user guide:
>>>>
>>>> $ bin/hadoop namenode -format
>>>>
>>>> Here is the output:
>>>> ************************************************************/
>>>> 12/09/14 10:46:42 INFO util.GSet: VM type       = 32-bit
>>>> 12/09/14 10:46:42 INFO util.GSet: 2% max memory = 39.6925 MB
>>>> 12/09/14 10:46:42 INFO util.GSet: capacity      = 2^23 = 8388608 entries
>>>> 12/09/14 10:46:42 INFO util.GSet: recommended=8388608, actual=8388608
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: fsOwner=jasonhuang
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: supergroup=supergroup
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: isPermissionEnabled=true
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
>>>> 12/09/14 10:46:42 INFO namenode.FSNamesystem:
>>>> isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s),
>>>> accessTokenLifetime=0 min(s)
>>>> 12/09/14 10:46:42 INFO namenode.NameNode: Caching file names occuring
>>>> more than 10 times
>>>> 12/09/14 10:46:42 INFO common.Storage: Image file of size 116 saved in
>>>> 0 seconds.
>>>> 12/09/14 10:46:42 INFO common.Storage: Storage directory
>>>> /tmp/hadoop-jasonhuang/dfs/name has been successfully formatted.
>>>> 12/09/14 10:46:42 INFO namenode.NameNode: SHUTDOWN_MSG:
>>>> /************************************************************
>>>>
>>>> It appears that the storage directory is /tmp/hadoop-jasonhuang/dfs/name
>>>>
>>>> However, in my config file I've assigned a different directory (see
>>>> hdfs-site.xml below):
>>>> <configuration>
>>>>   <property>
>>>>      <name>dfs.replication</name>
>>>>      <value>1</value>
>>>>   </property>
>>>>   <property>
>>>>      <name>dfs.namenode.name.dir</name>
>>>>      <value>/Users/jasonhuang/hdfs/name</value>
>>>>   </property>
>>>>   <property>
>>>>      <name>dfs.datanode.data.dir</name>
>>>>      <value>/Users/jasonhuang/hdfs/data</value>
>>>>   </property>
>>>> </configuration>
>>>>
>>>> Does anyone know why the hdfs-site.xml might not be respected?
>>>>
>>>> Also, after formatting the name node, I did a search for the fsimage
>>>> file in my local file directories (from root dir) and here is what I
>>>> found:
>>>> $ sudo find / -name fsimage
>>>> /private/tmp/hadoop-jasonhuang/dfs/name/current/fsimage
>>>> /private/tmp/hadoop-jasonhuang/dfs/name/image/fsimage
>>>>
>>>> I don't understand why the name node format picked (and created) these
>>>> two directories...
>>>>
>>>> Any thoughts?
>>>>
>>>> Thanks!
>>>>
>>>> Jason
>>>
>>>
>>>
>>> --
>>> Harsh J
>
>
>
> --
> Harsh J
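
The settings discussed in this thread can be sketched as a small shell script. This is a sketch under stated assumptions, not a tested setup: it writes a Hadoop 1.x style hdfs-site.xml (using the dfs.name.dir and dfs.data.dir names and the ${user.home} form Harsh suggests) plus a fragment holding the HADOOP_OPTS line for hadoop-env.sh into a scratch directory. CONF_OUT and the file name hadoop-env-additions.sh are hypothetical names chosen for this example, and the commented-out HADOOP_PID_DIR line is not something the thread mentions.

```shell
#!/bin/sh
# Sketch only: generates config fragments in a scratch directory for review.
# CONF_OUT is a hypothetical variable for this example, not a Hadoop setting.
CONF_OUT="${CONF_OUT:-./conf-sketch}"
mkdir -p "$CONF_OUT"

# Hadoop 1.x property names (dfs.name.dir / dfs.data.dir), per the thread.
# The quoted heredoc delimiter keeps ${user.home} literal, so it is Hadoop,
# not the shell, that expands it when the configuration is loaded.
cat > "$CONF_OUT/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>${user.home}/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>${user.home}/hdfs/data</value>
  </property>
</configuration>
EOF

# Lines intended to be appended to hadoop-env.sh. $HOME stays literal here
# and is expanded later, when hadoop-env.sh itself is sourced.
cat > "$CONF_OUT/hadoop-env-additions.sh" <<'EOF'
# Send JVM temporary file requests to a directory under $HOME (from the thread).
export HADOOP_OPTS="$HADOOP_OPTS -Djava.io.tmpdir=$HOME/tmp"
# Assumption, not from the thread: Hadoop 1.x daemon scripts write *.pid
# files to HADOOP_PID_DIR, which defaults to /tmp when unset.
# export HADOOP_PID_DIR="$HOME/hadoop-pids"
EOF

echo "wrote $CONF_OUT/hdfs-site.xml and $CONF_OUT/hadoop-env-additions.sh"
```

After reviewing the generated files, the XML could be copied into conf/hdfs-site.xml and the fragment appended to conf/hadoop-env.sh. As Harsh notes, only the name and data directories really matter for HDFS; the temporary-directory change is optional.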
