hadoop-common-user mailing list archives

From Kumar Kandasami <kumaravel.kandas...@gmail.com>
Subject Re: Configuring Multiple Data Nodes on Pseudo-distributed mode ?
Date Sat, 11 Jun 2011 06:44:45 GMT
Thank you Harsh.

Perfect, worked as expected. :)

Kumar    _/|\_
www.saisk.com
kumar@saisk.com
"making a profound difference with knowledge and creativity..."


On Sat, Jun 11, 2011 at 12:48 AM, Harsh J <harsh@cloudera.com> wrote:

> Kumar,
>
> Your config seems alright. That post described it for the 0.21/trunk
> scripts, I believe. On a 0.20.x-based release like CDH3, you can also
> simply use hadoop-daemon.sh to do it; you just have to mess with the
> PID files a bit.
>
> Here's how I do it on my Mac to start 3 DNs:
>
> $ ls conf*
> conf conf.1 conf.2
> $ hadoop-daemon.sh start datanode # Default
> $ rm pids/hadoop-harsh-datanode.pid
> $ hadoop-daemon.sh --config conf.1 start datanode # conf.1 DN
> $ rm pids/hadoop-harsh-datanode.pid
> $ hadoop-daemon.sh --config conf.2 start datanode # conf.2 DN
>
> To kill any one DN, run jps (or ps) to find the instance you want,
> then kill the PID it displays.
>
> On Sat, Jun 11, 2011 at 5:34 AM, Kumar Kandasami
> <kumaravel.kandasami@gmail.com> wrote:
> > Thank you Harsh.
> >
> > I have been following the documentation in that mailing-list thread, and
> > have an issue starting the second datanode (because of a port conflict).
> >
> > - First, I don't see bin/hdfs in the directory (I am on a Mac and
> > installed Hadoop using the CDH3 tarball).
> > - I am using the following command instead of the one mentioned in
> > step #3 of that thread.
> >
> > ./bin/hadoop-daemon.sh --config ../conf2 start datanode
> >
> > Error: datanode running as process 5981. Stop it first.
> >
> > - Port configuration in the hdfs-site.xml below.
> >
> > Data Node #1: Conf file
> >
> > <property>
> >    <name>dfs.replication</name>
> >    <value>1</value>
> >  </property>
> >  <property>
> >     <name>dfs.permissions</name>
> >     <value>false</value>
> >  </property>
> >
> >  <property>
> >     <!-- specify this so that running 'hadoop namenode -format'
> >          formats the right dir -->
> >     <name>dfs.name.dir</name>
> >     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
> >  </property>
> >
> >  <property>
> >     <name>dfs.data.dir</name>
> >     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/data</value>
> >  </property>
> >  <property>
> >    <name>dfs.datanode.address</name>
> >    <value>0.0.0.0:50010</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.ipc.address</name>
> >    <value>0.0.0.0:50020</value>
> >    <description>
> >      The datanode ipc server address and port.
> >      If the port is 0 then the server will start on a free port.
> >    </description>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.http.address</name>
> >    <value>0.0.0.0:50075</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.https.address</name>
> >    <value>0.0.0.0:50475</value>
> >  </property>
> > </configuration>
> >
> > Data Node #2: Conf (2) file
> >
> > <property>
> >    <name>dfs.replication</name>
> >    <value>1</value>
> >  </property>
> >  <property>
> >     <name>dfs.permissions</name>
> >     <value>false</value>
> >  </property>
> >
> >  <property>
> >     <!-- specify this so that running 'hadoop namenode -format'
> >          formats the right dir -->
> >     <name>dfs.name.dir</name>
> >     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/name</value>
> >  </property>
> >
> >  <property>
> >     <name>dfs.data.dir</name>
> >     <value>/var/lib/hadoop-0.20/cache/hadoop/dfs/data2</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.address</name>
> >    <value>0.0.0.0:50012</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.ipc.address</name>
> >    <value>0.0.0.0:50022</value>
> >    <description>
> >      The datanode ipc server address and port.
> >      If the port is 0 then the server will start on a free port.
> >    </description>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.http.address</name>
> >    <value>0.0.0.0:50077</value>
> >  </property>
> >
> >  <property>
> >    <name>dfs.datanode.https.address</name>
> >    <value>0.0.0.0:50477</value>
> >  </property>
> > </configuration>
> >
> >
> >
> > Kumar    _/|\_
> > www.saisk.com
> > kumar@saisk.com
> > "making a profound difference with knowledge and creativity..."
> >
> >
> > On Fri, Jun 10, 2011 at 12:20 AM, Harsh J <harsh@cloudera.com> wrote:
> >
> >> Try using search-hadoop.com, it's pretty kick-ass.
> >>
> >> Here's what you're seeking (Matt's reply in particular):
> >>
> >>
> http://search-hadoop.com/m/sApJY1zWgQV/multiple+datanodes&subj=Multiple+DataNodes+on+a+single+machine
> >>
> >> On Fri, Jun 10, 2011 at 9:04 AM, Kumar Kandasami
> >> <kumaravel.kandasami@gmail.com> wrote:
> >> > Environment: Mac 10.6.x.  Hadoop version: hadoop-0.20.2-cdh3u0
> >> >
> >> > Is there any good reference/link that covers configuring additional
> >> > datanodes on a single machine (in pseudo-distributed mode)?
> >> >
> >> >
> >> > Thanks for the support.
> >> >
> >> >
> >> > Kumar    _/|\_
> >> > www.saisk.com
> >> > kumar@saisk.com
> >> > "making a profound difference with knowledge and creativity..."
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >>
> >
>
>
>
> --
> Harsh J
>
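Harsh's recipe above can be condensed into a small shell sketch. This is a
dry-run that only prints the commands rather than executing them; it assumes
hadoop-daemon.sh honors the HADOOP_PID_DIR environment variable (as the 0.20.x
scripts do), which sidesteps deleting the shared PID file between starts. The
conf-dir and PID-dir names here are illustrative.

```shell
#!/bin/sh
# Dry-run sketch: print one start command per datanode instance, giving
# each instance its own conf dir and its own PID dir so hadoop-daemon.sh
# never sees a stale PID file left by a sibling instance.
start_dn() {
  conf="$1"
  # Each conf dir must also use distinct dfs.datanode.* ports, as in the
  # hdfs-site.xml excerpts above (50010/50012, 50020/50022, ...).
  echo "HADOOP_PID_DIR=/tmp/pids-$conf hadoop-daemon.sh --config $conf start datanode"
}

for c in conf conf.1 conf.2; do
  start_dn "$c"
done
```

To stop a particular instance, run `stop datanode` with the same --config and
HADOOP_PID_DIR, or find the PID with jps and kill it as described above.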
