hadoop-common-user mailing list archives

From Peter W. <pe...@marketingbrokers.com>
Subject Hadoop on Mac OSX
Date Wed, 30 May 2007 19:15:41 GMT
Hi,

A few steps are needed to get Hadoop working on
Mac OS X Tiger 10.4.x as a single node. Here they are:

a. use terminal (not the gui)
b. get sudo or the root account working on your machine.
c. pick an unprivileged user hadoop will run as
d. download hadoop, place in the filesystem (/opt for example)
e. set the environment variable HADOOP_CONF_DIR (point to your installation)
f. make sure JAVA_HOME is set
g. try and put hadoop's core jar (hadoop-0.12.3-core.jar) and
lib (/opt/hadoop-0.12.3/lib) in your CLASSPATH
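
Steps e through g might look roughly like this in a shell profile. This is only a sketch: the /opt/hadoop-0.12.3 install path follows the example in step d, the conf subdirectory is my reading of "point to your installation", and HADOOP_HOME is just a helper name used here. Note that it replaces any CLASSPATH you already have.

```shell
# Assumed install location (step d gives /opt only as an example).
HADOOP_HOME=/opt/hadoop-0.12.3

# Step e: assumed to mean the conf directory inside the installation.
export HADOOP_CONF_DIR="$HADOOP_HOME/conf"

# Step f: same value that goes into hadoop-env.sh in step j.
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Commands

# Step g: the core jar plus every jar found under lib.
CLASSPATH="$HADOOP_HOME/hadoop-0.12.3-core.jar"
for jar in "$HADOOP_HOME"/lib/*.jar; do
    [ -e "$jar" ] || continue   # skip the unmatched glob if lib is missing
    CLASSPATH="$CLASSPATH:$jar"
done
export CLASSPATH
```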

h. finish configuration steps below, run the sample. if you get Apache
commons logging errors, skip hadoop CLASSPATH settings in step g.

instead (as a kludge), use sudo or root to copy the jar files to
/Library/Java/Extensions:

(sudo or su first here)
cd /Library/Java/Extensions
cp /opt/hadoop-0.12.3/hadoop-0.12.3-core.jar .
cp /opt/hadoop-0.12.3/lib/*.jar .

chown these hadoop .jar files to the user described in step c.
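
One way to do the chown without touching other jars that may already live in /Library/Java/Extensions is to walk the names from the hadoop install. A rough sketch, run as root or under sudo; the user name "hadoop" and the helper function hadoop_jar_names are made up for illustration:

```shell
# Print the base names of the jars copied from a hadoop install directory.
hadoop_jar_names() {
    for jar in "$1"/hadoop-*-core.jar "$1"/lib/*.jar; do
        [ -e "$jar" ] || continue   # skip globs that matched nothing
        basename "$jar"
    done
}

HADOOP_HOME=/opt/hadoop-0.12.3      # assumed install location
HADOOP_USER=hadoop                  # hypothetical user from step c
EXT_DIR=/Library/Java/Extensions

# chown only the copies that came from hadoop.
hadoop_jar_names "$HADOOP_HOME" | while IFS= read -r name; do
    chown "$HADOOP_USER" "$EXT_DIR/$name"
done
```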

i. move on and update a few scripts:

in /opt/hadoop-0.12.3/bin/hadoop make these changes:

JAVA=$JAVA_HOME/java
#JAVA=$JAVA_HOME/bin/java

and

JAVA_PLATFORM='MacOSX/PPC'
#  JAVA_PLATFORM=`CLASSPATH=${CLASSPATH} ${JAVA} org.apache.hadoop.util.PlatformName`

(this is courtesy of the nutch mailing list archives. you might also want to
change the above to JAVA_PLATFORM='MacOSX/Intel' if appropriate)

in /opt/hadoop-0.12.3/bin/rcc make this change:

JAVA=$JAVA_HOME/java
#JAVA=$JAVA_HOME/bin/java
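
If you are not sure which JAVA_PLATFORM value applies, or whether the JAVA= change fits your JAVA_HOME, a quick check along these lines may help. It is a sketch only: it assumes uname -p prints "powerpc" on PPC Macs and "i386" on Intel Macs, and mac_java_platform is a helper name invented here.

```shell
# Map a processor string to the JAVA_PLATFORM values quoted in step i.
mac_java_platform() {
    case "$1" in
        powerpc*) echo 'MacOSX/PPC' ;;
        i386*)    echo 'MacOSX/Intel' ;;
        *)        return 1 ;;
    esac
}

if JAVA_PLATFORM=$(mac_java_platform "$(uname -p)"); then
    echo "JAVA_PLATFORM='$JAVA_PLATFORM'"
else
    echo "unrecognized processor: set JAVA_PLATFORM by hand" >&2
fi

# JAVA=$JAVA_HOME/java only works if java sits directly under JAVA_HOME,
# as it does when JAVA_HOME ends in .../Commands.
if [ ! -x "${JAVA_HOME:-}/java" ]; then
    echo "no java binary directly under JAVA_HOME" >&2
fi
```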

j. now, it's time to set up the hadoop-env.sh and hadoop-site.xml files in conf/.

in hadoop-env.sh:
export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.5.0/Commands
uncomment sections for heap,logs,slave and pid

in hadoop-site.xml: (values copied from Phantom's config)

<configuration>
         <property>
                 <name>fs.default.name</name>
                 <value>yourhost.yourdomain.com:9000</value>
         </property>

         <property>
                 <name>dfs.name.dir</name>
                 <value>/tmp/hadoop</value>
         </property>

         <property>
                 <name>mapred.job.tracker</name>
                 <value>yourhost.yourdomain.com:50029</value>
         </property>

         <property>
                 <name>mapred.job.tracker.info.port</name>
                 <value>50030</value>
         </property>

         <property>
                 <name>mapred.min.split.size</name>
                 <value>65536</value>
         </property>

         <property>
                 <name>dfs.replication</name>
                 <value>1</value>
         </property>
</configuration>

k. per hadoop docs, run this:
mkdir -p /tmp/hadoop-username/dfs/name
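
The "username" part of that path presumably stands for the account hadoop runs as. Assuming so, and assuming you are logged in as that user, the name can be filled in automatically (NAME_DIR is just a variable name picked for this sketch):

```shell
# Build the name directory path from the current login name.
NAME_DIR="/tmp/hadoop-$(id -un)/dfs/name"
mkdir -p "$NAME_DIR"
echo "created $NAME_DIR"
```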

l. format the namenode:
bin/hadoop namenode -format
Formatted /tmp/hadoop-manager/dfs/name	(expected response)

m. try it out :
cd /opt/hadoop-0.12.3; mkdir indir; echo test > indir/test.txt
bin/hadoop jar hadoop-0.12.3-examples.jar wordcount indir outdir

07/05/30 10:31:02 INFO mapred.InputFormatBase: Total input paths to process : 1
07/05/30 10:31:02 INFO mapred.JobClient: Running job: job_qwbefe
07/05/30 10:31:02 INFO mapred.LocalJobRunner: file:/opt/hadoop-0.12.3/indir/test.txt:0+5
07/05/30 10:31:02 INFO mapred.LocalJobRunner: reduce > reduce
07/05/30 10:31:03 INFO mapred.JobClient: Job complete: job_qwbefe
07/05/30 10:31:03 INFO mapred.JobClient: Counters: 8
07/05/30 10:31:03 INFO mapred.JobClient:   org.apache.hadoop.examples.WordCount$Counter
07/05/30 10:31:03 INFO mapred.JobClient:     WORDS=1
07/05/30 10:31:03 INFO mapred.JobClient:     VALUES=2
07/05/30 10:31:03 INFO mapred.JobClient:   Map-Reduce Framework
07/05/30 10:31:03 INFO mapred.JobClient:     Map input records=2
07/05/30 10:31:03 INFO mapred.JobClient:     Map output records=1
07/05/30 10:31:03 INFO mapred.JobClient:     Map input bytes=5
07/05/30 10:31:03 INFO mapred.JobClient:     Map output bytes=9
07/05/30 10:31:03 INFO mapred.JobClient:     Reduce input records=1
07/05/30 10:31:03 INFO mapred.JobClient:     Reduce output records=1

(expected response)
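
To sanity-check what the job wrote into outdir, you can compute the same counts with plain shell tools and compare by eye. This is a rough stand-in, not the hadoop example itself: it splits on whitespace and prints word, tab, count. local_wordcount is a name made up for this sketch.

```shell
# Count whitespace-separated words in the given files, one "word<TAB>count" per line.
local_wordcount() {
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) printf "%s\t%d\n", w, count[w] }' "$@" | sort
}

# Run it against the sample input from step m, if present.
if [ -d indir ]; then
    local_wordcount indir/*
fi
```

For the single-line test.txt from step m this should print one line for the word "test" with a count of 1.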

Good Luck,

Peter W.
