hbase-user mailing list archives

From Jian Lu <...@local.com>
Subject RE: FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding to .......
Date Tue, 19 Oct 2010 02:32:40 GMT
Thanks Dave!  I got the three-server cluster working now.  The master server acts as NameNode +
SecondaryNameNode + JobTracker. I still need to figure out how to move the JobTracker to
a different server, since it carries a heavy load from MapReduce jobs.

Learning Hadoop and HBase is like torturing myself.

Jack. 

-----Original Message-----
From: Buttler, David [mailto:buttler1@llnl.gov] 
Sent: Monday, October 18, 2010 4:08 PM
To: user@hbase.apache.org
Subject: RE: FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding
to .......

Sorry, I was using abbreviations to type more quickly.
SNN is the secondary name node.

My script was for example purposes only and I wouldn't suggest running that on a cluster larger
than your example unless you like typing a lot. Also, it neglected to start the hbase master.

For a 20 node cluster, you may be going beyond what you want a single zookeeper node to do.
 You may want to go up to a 3 or 5 node zk ensemble.  5 is a nice number as it allows you
to take one node down for maintenance and still be robust to a single failure.  It all depends
on how robust the cluster needs to be.
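
The sizing advice above follows from majority-quorum arithmetic: an ensemble stays available as long as more than half of its servers are up. A minimal sketch of that arithmetic, assuming nothing beyond the majority rule (the function names are made up for illustration):

```shell
#!/bin/sh
# Majority-quorum arithmetic for a ZooKeeper ensemble of n servers.
# Function names are hypothetical helpers, not part of any ZooKeeper tool.

# Smallest number of servers that forms a majority of n.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of servers that can be down while a majority is still up.
tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 5; do
    echo "ensemble=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 5 servers the tolerated count is 2, which is the "one down for maintenance plus one unexpected failure" case; with 3 servers it is 1.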

To get your feet wet,  I would suggest putting all of the master processes on a single node,
and use the rest of the nodes for the slave processes -- tt (tasktracker), dn (datanode),
rs (region server)
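
As a rough illustration of that starter layout, the sketch below prints a plan with every master process on the first host and the slave processes on the rest. The host names and the function name are hypothetical, and the abbreviations are the ones used in this thread.

```shell
#!/bin/sh
# Print a starter layout: all master processes on the first host,
# slave processes (tt, dn, rs) on every remaining host.
# plan_layout is a hypothetical helper; it only prints, it starts nothing.
plan_layout() {
    master=$1
    shift
    echo "$master: nn, snn, jt, hbase master, zk"
    for host in "$@"; do
        echo "$host: tt, dn, rs"
    done
}

# Hypothetical host names, for illustration only.
plan_layout node01 node02 node03 node04
```

The same call with 20 host names would give one master line followed by 19 slave lines.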

When you get that working, then you can start making choices about capacity vs robustness,
i.e., having dedicated nodes for different master processes like the namenode, secondary name
node, hbase master, and zookeeper ensemble.

You should also figure out a way of testing to see what the performance of your system is
in each configuration, otherwise you won't be making informed decisions.

The yahoo benchmark sounds popular, but I haven't set it up myself so I don't know how much
work it is and what bottlenecks it discovers.

The binding problem means that you have two processes that are trying to access the same port.
Use netstat to see who is using the ports now.  Run jps to see what java processes are currently
running.  One reason I suggested writing a quick and dirty script to start each process manually
is that then you know exactly what processes (and where) you have started. This should help
narrow down the problem.
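
A small sketch of the netstat check, assuming the Linux net-tools netstat, where the local address is the 4th column of `netstat -tlnp` output. The helper name is made up. It may also be worth confirming that the process is being started on the host named in the config, since "Cannot assign requested address" generally means the address is not configured on the machine doing the bind.

```shell
#!/bin/sh
# Read `netstat -tlnp` style output on stdin and print the lines whose
# local address (4th column) ends in the given port.
# listeners_on_port is a hypothetical helper name.
listeners_on_port() {
    awk -v port="$1" '$4 ~ (":" port "$")'
}

# Typical use on the affected host (54311 is the port from the error
# reported in this thread):
#   netstat -tlnp | listeners_on_port 54311
#   jps -l       # lists running Java processes with their main class
#   hostname     # confirm which machine you are actually on
```

An empty result from the first command means nothing is listening on that port on this machine.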

Setting hbase up and getting it to work is a great learning process.  It helps to have some
background in distributed systems, but if you read all of the documentation available very
carefully, you should be able to learn almost everything you need.


Dave

-----Original Message-----
From: Jian Lu [mailto:jlu@local.com] 
Sent: Monday, October 18, 2010 3:55 PM
To: user@hbase.apache.org
Subject: RE: FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding
to .......

Hi Dave,

Thanks a lot for the advice! What is "SNN"?  Would two SNNs cause the "binding problem"?

I am trying to follow these instructions:  http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_(Multi-Node_Cluster)

"Typically one machine in the cluster is designated as the NameNode and another machine the
as JobTracker, exclusively. These are the masters. The rest of the machines in the cluster
act as both DataNode and TaskTracker. These are the slaves".


This company is ordering 20 new servers for a Hadoop/HBase app.  With 20 servers, should I use
a single master server as NameNode + JobTracker + zk and the remaining 19 servers as slaves?

Can you give a few more details on the layout?  Do the scripts below start HBase as well?

Thanks a lot!

Jack.



-----Original Message-----
From: Buttler, David [mailto:buttler1@llnl.gov] 
Sent: Monday, October 18, 2010 3:27 PM
To: user@hbase.apache.org
Subject: RE: FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding
to .......

This is not an hbase problem...

The masters file tells hadoop where to start the secondary name node. I think it will try to
start a SNN on both 1a and 1b if you put those hosts in the file.


For such a small cluster, I would suggest a different layout:
1a: nn, jt, master, zk
1b: tt, dn, snn, regionserver
1c: tt, dn, regionserver


You might consider making your own script to start your cluster if you want to distribute
things in ways that are not normal.
Eg:
ssh 1a hadoop-daemon.sh start namenode
ssh 1a hadoop-daemon.sh start jobtracker
ssh 1a hbase-daemon.sh start zookeeper

ssh 1b hadoop-daemon.sh start tasktracker
ssh 1b hadoop-daemon.sh start datanode
ssh 1b hadoop-daemon.sh start secondarynamenode
ssh 1b hbase-daemon.sh start regionserver

ssh 1c hadoop-daemon.sh start tasktracker
ssh 1c hadoop-daemon.sh start datanode
ssh 1c hbase-daemon.sh start regionserver


You are not going to get a lot of parallelism here.  If your machines are particularly beefy
I would also consider putting a dn and a regionserver on 1a.
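
The per-host commands above can also be driven from a small table of hosts and daemons. The sketch below is one possible shape, not a tested cluster script: it assumes hadoop-daemon.sh and hbase-daemon.sh are on the PATH of each remote host (as the commands above do), adds the hbase master on 1a, and only prints the ssh commands unless DRY_RUN=0 is set. The function names and the DRY_RUN variable are made up for illustration.

```shell
#!/bin/sh
# Dry-run by default: prints the ssh commands instead of running them.
# Set DRY_RUN=0 to execute them for real.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# start_on HOST SCRIPT DAEMON...  starts each daemon on HOST via SCRIPT.
start_on() {
    host=$1
    script=$2
    shift 2
    for daemon in "$@"; do
        run ssh "$host" "$script" start "$daemon"
    done
}

# Layout from this message: masters on 1a, slaves on 1b and 1c.
# Hadoop daemons go first so HDFS is up before HBase starts.
start_cluster() {
    start_on 1a hadoop-daemon.sh namenode jobtracker
    start_on 1b hadoop-daemon.sh datanode tasktracker secondarynamenode
    start_on 1c hadoop-daemon.sh datanode tasktracker
    start_on 1a hbase-daemon.sh zookeeper master
    start_on 1b hbase-daemon.sh regionserver
    start_on 1c hbase-daemon.sh regionserver
}

start_cluster
```

Running it as-is prints eleven ssh lines, so the plan can be reviewed before anything is started.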


-----Original Message-----
From: Jian Lu [mailto:jlu@local.com] 
Sent: Monday, October 18, 2010 2:37 PM
To: user@hbase.apache.org
Cc: Jian Lu
Subject: FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException: Problem binding
to .......

Hi HBasers,

Please help with my hadoop-0.21.0 cluster setup!!!!!!

This is my first time setting up a test cluster using three Linux servers.  I ran
./bin/start-all.sh, and all daemons started correctly except for the JobTracker on Caiss01b.
Below is the error:


2010-10-18 14:09:43,881 FATAL org.apache.hadoop.mapred.JobTracker: java.net.BindException:
Problem binding to caiss01b/172.16.2.225:54311 : Cannot assign requested address
        at org.apache.hadoop.ipc.Server.bind(Server.java:218)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:289)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:1443)
	  ....................


Below are my three servers:

Caiss01a (master, namenode)
Caiss01b (SecondaryNameNode, job tracker)
Caiss01c (slave, DataNode, TaskTracker)

I have set up passwordless ssh between the servers, and I have made sure port 54311 is not
used by other processes.  I have this property in mapred-site.xml:

  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://caiss01b:54311</value>
  </property>

Here is my masters file on Caiss01a:

caiss01a
caiss01b

Here is my slaves file on Caiss01a:

caiss01c


Please help!!!!!!!!!!!!!


Jack.





