hadoop-common-user mailing list archives

From Mithila Nagendra <mnage...@asu.edu>
Subject Re: Map-Reduce Slow Down
Date Thu, 16 Apr 2009 20:28:47 GMT
Jason: the kickstart script - was it something you wrote or is it run when
the system turns on?
Mithila

On Thu, Apr 16, 2009 at 1:06 AM, Mithila Nagendra <mnagendr@asu.edu> wrote:

> Thanks Jason! Will check that out.
> Mithila
>
>
> On Thu, Apr 16, 2009 at 5:23 AM, jason hadoop <jason.hadoop@gmail.com> wrote:
>
>> Double check that there is no firewall in place.
>> At one point a bunch of new machines were kickstarted and placed in a
>> cluster and they all failed with something similar.
>> It turned out the kickstart script enabled the firewall with a rule
>> that blocked ports in the 50k range.
>> It took us a while to even think to check, since that was not part of
>> our normal machine configuration.
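>>
>> A rough sketch of that check, assuming an iptables-based firewall (the
>> exact commands vary by distro):
>>
>>   # list active firewall rules; look for DROP/REJECT entries that
>>   # would cover the 50000+ ports Hadoop uses
>>   /sbin/iptables -L -n
>>   # quick reachability test from a slave to the namenode's IPC port
>>   telnet node18 54310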
>>
>> On Wed, Apr 15, 2009 at 11:04 AM, Mithila Nagendra <mnagendr@asu.edu>
>> wrote:
>>
>> > Hi Aaron
>> > I will look into that thanks!
>> >
>> > I spoke to the admin who oversees the cluster. He said that the gateway
>> > comes into the picture only when one of the nodes communicates with a
>> > node outside of the cluster. But in my case the communication is carried
>> > out between the nodes, which all belong to the same cluster.
>> >
>> > Mithila
>> >
>> > On Wed, Apr 15, 2009 at 8:59 PM, Aaron Kimball <aaron@cloudera.com> wrote:
>> >
>> > > Hi,
>> > >
>> > > I wrote a blog post a while back about connecting nodes via a gateway. See
>> > > http://www.cloudera.com/blog/2008/12/03/securing-a-hadoop-cluster-through-a-gateway/
>> > >
>> > > This assumes that the client is outside the gateway and all
>> > > datanodes/namenode are inside, but the same principles apply. You'll
>> > > just need to set up ssh tunnels from every datanode to the namenode.
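>> > >
>> > > A minimal sketch of such a tunnel, run on each datanode (the hostname,
>> > > port, and user here are just the ones from this thread):
>> > >
>> > >   # forward the datanode's local port 54310 to port 54310 on node18
>> > >   ssh -f -N -L 54310:localhost:54310 mithila@node18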
>> > >
>> > > - Aaron
>> > >
>> > >
>> > > On Wed, Apr 15, 2009 at 10:19 AM, Ravi Phulari <rphulari@yahoo-inc.com> wrote:
>> > >
>> > >> Looks like your NameNode is down.
>> > >> Verify that the hadoop processes are running (jps should show you all
>> > >> running Java processes).
>> > >> If your hadoop processes are running, try restarting them.
>> > >> I guess this problem is due to your fsimage not being correct.
>> > >> You might have to format your namenode.
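>> > >> A minimal sketch of those steps (paths assume the stock 0.18 layout,
>> > >> and note that formatting erases the HDFS metadata):
>> > >>
>> > >>   jps                          # should list NameNode on the master
>> > >>   bin/stop-all.sh              # then restart all daemons
>> > >>   bin/start-all.sh
>> > >>   bin/hadoop namenode -format  # only as a last resort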
>> > >> Hope this helps.
>> > >>
>> > >> Thanks,
>> > >> --
>> > >> Ravi
>> > >>
>> > >>
>> > >> On 4/15/09 10:15 AM, "Mithila Nagendra" <mnagendr@asu.edu> wrote:
>> > >>
>> > >> The log file runs into thousands of lines, with the same message being
>> > >> displayed every time.
>> > >>
>> > >> On Wed, Apr 15, 2009 at 8:10 PM, Mithila Nagendra <mnagendr@asu.edu>
>> > >> wrote:
>> > >>
>> > >> > The log file hadoop-mithila-datanode-node19.log.2009-04-14 has the
>> > >> > following in it:
>> > >> >
>> > >> > 2009-04-14 10:08:11,499 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
>> > >> > /************************************************************
>> > >> > STARTUP_MSG: Starting DataNode
>> > >> > STARTUP_MSG:   host = node19/127.0.0.1
>> > >> > STARTUP_MSG:   args = []
>> > >> > STARTUP_MSG:   version = 0.18.3
>> > >> > STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.18 -r 736250; compiled by 'ndaley' on Thu Jan 22 23:12:08 UTC 2009
>> > >> > ************************************************************/
>> > >> > 2009-04-14 10:08:12,915 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
>> > >> > 2009-04-14 10:08:13,925 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
>> > >> > 2009-04-14 10:08:14,935 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
>> > >> > 2009-04-14 10:08:15,945 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 3 time(s).
>> > >> > 2009-04-14 10:08:16,955 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 4 time(s).
>> > >> > 2009-04-14 10:08:17,965 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 5 time(s).
>> > >> > 2009-04-14 10:08:18,975 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 6 time(s).
>> > >> > 2009-04-14 10:08:19,985 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 7 time(s).
>> > >> > 2009-04-14 10:08:20,995 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 8 time(s).
>> > >> > 2009-04-14 10:08:22,005 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 9 time(s).
>> > >> > 2009-04-14 10:08:22,008 INFO org.apache.hadoop.ipc.RPC: Server at node18/192.168.0.18:54310 not available yet, Zzzzz...
>> > >> > 2009-04-14 10:08:24,025 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
>> > >> > 2009-04-14 10:08:25,035 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
>> > >> > 2009-04-14 10:08:26,045 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
>> > >> > 2009-04-14 10:08:27,055 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 3 time(s).
>> > >> > 2009-04-14 10:08:28,065 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 4 time(s).
>> > >> > 2009-04-14 10:08:29,075 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 5 time(s).
>> > >> > 2009-04-14 10:08:30,085 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 6 time(s).
>> > >> > 2009-04-14 10:08:31,095 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 7 time(s).
>> > >> > 2009-04-14 10:08:32,105 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 8 time(s).
>> > >> > 2009-04-14 10:08:33,115 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 9 time(s).
>> > >> > 2009-04-14 10:08:33,116 INFO org.apache.hadoop.ipc.RPC: Server at node18/192.168.0.18:54310 not available yet, Zzzzz...
>> > >> > 2009-04-14 10:08:35,135 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
>> > >> > 2009-04-14 10:08:36,145 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
>> > >> > 2009-04-14 10:08:37,155 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
>> > >> >
>> > >> >
>> > >> > Hmmm, I still can't figure it out...
>> > >> >
>> > >> > Mithila
>> > >> >
>> > >> >
>> > >> > On Tue, Apr 14, 2009 at 10:22 PM, Mithila Nagendra <mnagendr@asu.edu> wrote:
>> > >> >
>> > >> >> Also, would the way the port is accessed change if all these nodes
>> > >> >> are connected through a gateway? I mean in the hadoop-site.xml file?
>> > >> >> The Ubuntu systems we worked with earlier didn't have a gateway.
>> > >> >> Mithila
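>> > >> >>
>> > >> >> For reference, a minimal sketch of the relevant hadoop-site.xml
>> > >> >> entries on an 0.18 cluster. The hostname and DFS port are the ones
>> > >> >> from this thread; 54311 for the job tracker is just a common choice.
>> > >> >> A gateway shouldn't change these as long as the slaves can reach
>> > >> >> node18 directly:
>> > >> >>
>> > >> >>   <property>
>> > >> >>     <name>fs.default.name</name>
>> > >> >>     <value>hdfs://node18:54310</value>
>> > >> >>   </property>
>> > >> >>   <property>
>> > >> >>     <name>mapred.job.tracker</name>
>> > >> >>     <value>node18:54311</value>
>> > >> >>   </property>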
>> > >> >>
>> > >> >> On Tue, Apr 14, 2009 at 9:48 PM, Mithila Nagendra <mnagendr@asu.edu> wrote:
>> > >> >>
>> > >> >>> Aaron: Which log file do I look into - there are a lot of them.
>> > >> >>> Here's what the error looks like:
>> > >> >>> [mithila@node19:~]$ cd hadoop
>> > >> >>> [mithila@node19:~/hadoop]$ bin/hadoop dfs -ls
>> > >> >>> 09/04/14 10:09:29 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 0 time(s).
>> > >> >>> 09/04/14 10:09:30 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 1 time(s).
>> > >> >>> 09/04/14 10:09:31 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 2 time(s).
>> > >> >>> 09/04/14 10:09:32 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 3 time(s).
>> > >> >>> 09/04/14 10:09:33 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 4 time(s).
>> > >> >>> 09/04/14 10:09:34 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 5 time(s).
>> > >> >>> 09/04/14 10:09:35 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 6 time(s).
>> > >> >>> 09/04/14 10:09:36 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 7 time(s).
>> > >> >>> 09/04/14 10:09:37 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 8 time(s).
>> > >> >>> 09/04/14 10:09:38 INFO ipc.Client: Retrying connect to server: node18/192.168.0.18:54310. Already tried 9 time(s).
>> > >> >>> Bad connection to FS. command aborted.
>> > >> >>>
>> > >> >>> Node19 is a slave and Node18 is the master.
>> > >> >>>
>> > >> >>> Mithila
>> > >> >>>
>> > >> >>>
>> > >> >>>
>> > >> >>> On Tue, Apr 14, 2009 at 8:53 PM, Aaron Kimball <aaron@cloudera.com> wrote:
>> > >> >>>
>> > >> >>>> Are there any error messages in the log files on those nodes?
>> > >> >>>> - Aaron
>> > >> >>>>
>> > >> >>>> On Tue, Apr 14, 2009 at 9:03 AM, Mithila Nagendra <mnagendr@asu.edu> wrote:
>> > >> >>>>
>> > >> >>>> > I've drawn a blank here! Can't figure out what's wrong with the
>> > >> >>>> > ports. I can ssh between the nodes but can't access the DFS from
>> > >> >>>> > the slaves - it says "Bad connection to DFS". Master seems to be
>> > >> >>>> > fine.
>> > >> >>>> > Mithila
>> > >> >>>> >
>> > >> >>>> > On Tue, Apr 14, 2009 at 4:28 AM, Mithila Nagendra <mnagendr@asu.edu> wrote:
>> > >> >>>> >
>> > >> >>>> > > Yes I can..
>> > >> >>>> > >
>> > >> >>>> > >
>> > >> >>>> > > On Mon, Apr 13, 2009 at 5:12 PM, Jim Twensky <jim.twensky@gmail.com> wrote:
>> > >> >>>> > >
>> > >> >>>> > >> Can you ssh between the nodes?
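>> > >> >>>> > >> For instance, something like "ssh node19 hostname" from the
>> > >> >>>> > >> master, for each slave, should complete without a password
>> > >> >>>> > >> prompt.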
>> > >> >>>> > >>
>> > >> >>>> > >> -jim
>> > >> >>>> > >>
>> > >> >>>> > >> On Mon, Apr 13, 2009 at 6:49 PM, Mithila Nagendra <mnagendr@asu.edu> wrote:
>> > >> >>>> > >>
>> > >> >>>> > >> > Thanks Aaron.
>> > >> >>>> > >> > Jim: The three clusters I set up had Ubuntu running on them,
>> > >> >>>> > >> > and the dfs was accessed at port 54310. The new cluster which
>> > >> >>>> > >> > I've set up has Red Hat Linux release 7.2 (Enigma) running on
>> > >> >>>> > >> > it. Now when I try to access the dfs from one of the slaves I
>> > >> >>>> > >> > get the following response: dfs cannot be accessed. When I
>> > >> >>>> > >> > access the DFS through the master there's no problem. So I
>> > >> >>>> > >> > feel there's a problem with the port. Any ideas? I did check
>> > >> >>>> > >> > the list of slaves, it looks fine to me.
>> > >> >>>> > >> > Mithila
>> > >> >>>> > >> >
>> > >> >>>> > >> >
>> > >> >>>> > >> >
>> > >> >>>> > >> >
>> > >> >>>> > >> > On Mon, Apr 13, 2009 at 2:58 PM, Jim Twensky <jim.twensky@gmail.com> wrote:
>> > >> >>>> > >> >
>> > >> >>>> > >> > > Mithila,
>> > >> >>>> > >> > >
>> > >> >>>> > >> > > You said all the slaves were being utilized in the 3 node
>> > >> >>>> > >> > > cluster. Which application did you run to test that, and
>> > >> >>>> > >> > > what was your input size? If you tried the word count
>> > >> >>>> > >> > > application on a 516 MB input file on both cluster setups,
>> > >> >>>> > >> > > then some of your nodes in the 15 node cluster may not be
>> > >> >>>> > >> > > running at all. Generally, one map task is assigned to each
>> > >> >>>> > >> > > input split, and if you are running your cluster with the
>> > >> >>>> > >> > > defaults, the splits are 64 MB each. I got confused when you
>> > >> >>>> > >> > > said the Namenode seemed to do all the work. Can you check
>> > >> >>>> > >> > > conf/slaves and make sure you put the names of all task
>> > >> >>>> > >> > > trackers there? I also suggest comparing both clusters with
>> > >> >>>> > >> > > a larger input size, say at least 5 GB, to really see a
>> > >> >>>> > >> > > difference.
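>> > >> >>>> > >> > >
>> > >> >>>> > >> > > A minimal sketch of conf/slaves - one task tracker hostname
>> > >> >>>> > >> > > per line (node20/node21 are just placeholders here):
>> > >> >>>> > >> > >
>> > >> >>>> > >> > >   node19
>> > >> >>>> > >> > >   node20
>> > >> >>>> > >> > >   node21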
>> > >> >>>> > >> > >
>> > >> >>>> > >> > > Jim
>> > >> >>>> > >> > >
>> > >> >>>> > >> > > On Mon, Apr 13, 2009 at 4:17 PM, Aaron Kimball <aaron@cloudera.com> wrote:
>> > >> >>>> > >> > >
>> > >> >>>> > >> > > > In hadoop-*-examples.jar, use "randomwriter" to generate
>> > >> >>>> > >> > > > the data and "sort" to sort it.
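>> > >> >>>> > >> > > > Roughly like this (assuming the 0.18.3 build mentioned
>> > >> >>>> > >> > > > elsewhere in this thread; the output directory names are
>> > >> >>>> > >> > > > arbitrary):
>> > >> >>>> > >> > > >
>> > >> >>>> > >> > > >   bin/hadoop jar hadoop-0.18.3-examples.jar randomwriter rand-data
>> > >> >>>> > >> > > >   bin/hadoop jar hadoop-0.18.3-examples.jar sort rand-data rand-sorted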
>> > >> >>>> > >> > > > - Aaron
>> > >> >>>> > >> > > >
>> > >> >>>> > >> > > > On Sun, Apr 12, 2009 at 9:33 PM, Pankil Doshi <forpankil@gmail.com> wrote:
>> > >> >>>> > >> > > >
>> > >> >>>> > >> > > > > Your data is too small, I guess, for 15 nodes. So the
>> > >> >>>> > >> > > > > overhead time of these nodes might be making your total
>> > >> >>>> > >> > > > > MR jobs more time consuming.
>> > >> >>>> > >> > > > > I guess you will have to try with a larger set of data.
>> > >> >>>> > >> > > > >
>> > >> >>>> > >> > > > > Pankil
>> > >> >>>> > >> > > > > On Sun, Apr 12, 2009 at 6:54 PM, Mithila Nagendra <mnagendr@asu.edu> wrote:
>> > >> >>>> > >> > > > >
>> > >> >>>> > >> > > > > > Aaron
>> > >> >>>> > >> > > > > >
>> > >> >>>> > >> > > > > > That could be the issue, my data is just 516MB -
>> > >> >>>> > >> > > > > > wouldn't this see a bit of speed up? Could you guide
>> > >> >>>> > >> > > > > > me to the example? I'll run my cluster on it and see
>> > >> >>>> > >> > > > > > what I get. Also, for my program I had a Java timer
>> > >> >>>> > >> > > > > > running to record the time taken to complete
>> > >> >>>> > >> > > > > > execution. Does Hadoop have an inbuilt timer?
>> > >> >>>> > >> > > > > >
>> > >> >>>> > >> > > > > > Mithila
>> > >> >>>> > >> > > > > >
>> > >> >>>> > >> > > > > > On Mon, Apr 13, 2009 at 1:13 AM, Aaron Kimball <aaron@cloudera.com> wrote:
>> > >> >>>> > >> > > > > >
>> > >> >>>> > >> > > > > > > Virtually none of the examples that ship with Hadoop are
>> > >> >>>> > >> > > > > > > designed to showcase its speed. Hadoop's speedup comes
>> > >> >>>> > >> > > > > > > from its ability to process very large volumes of data
>> > >> >>>> > >> > > > > > > (starting around, say, tens of GB per job, and going up in
>> > >> >>>> > >> > > > > > > orders of magnitude from there). So if you are timing the
>> > >> >>>> > >> > > > > > > pi calculator (or something like that), its results won't
>> > >> >>>> > >> > > > > > > necessarily be very consistent. If a job doesn't have
>> > >> >>>> > >> > > > > > > enough fragments of data to allocate one to each node,
>> > >> >>>> > >> > > > > > > some of the nodes will also just go unused.
>> > >> >>>> > >> > > > > > >
>> > >> >>>> > >> > > > > > > The best example for you to run is to use randomwriter to
>> > >> >>>> > >> > > > > > > fill up your cluster with several GB of random data and
>> > >> >>>> > >> > > > > > > then run the sort program. If that doesn't scale up
>> > >> >>>> > >> > > > > > > performance from 3 nodes to 15, then you've definitely
>> > >> >>>> > >> > > > > > > got something strange going on.
>> > >> >>>> > >> > > > > > > - Aaron
>> > >> >>>> > >> > > > > > >
>> > >> >>>> > >> > > > > > >
>> > >> >>>> > >> > > > > > > On Sun,
Apr 12, 2009 at 8:39 AM, Mithila
>> Nagendra
>> > <
>> > >> >>>> > >> > > mnagendr@asu.edu>
>> > >> >>>> > >> > > > > > > wrote:
>> > >> >>>> > >> > > > > > >
>> > >> >>>> > >> > > > > > > > Hey all
>> > >> >>>> > >> > > > > > > > I recently set up a three node hadoop cluster and ran
>> > >> >>>> > >> > > > > > > > an example on it. It was pretty fast, and all three
>> > >> >>>> > >> > > > > > > > nodes were being used (I checked the log files to make
>> > >> >>>> > >> > > > > > > > sure that the slaves were utilized).
>> > >> >>>> > >> > > > > > > >
>> > >> >>>> > >> > > > > > > > Now I've set up another cluster consisting of 15 nodes.
>> > >> >>>> > >> > > > > > > > I ran the same example, but instead of speeding up, the
>> > >> >>>> > >> > > > > > > > map-reduce task seems to take forever! The slaves are
>> > >> >>>> > >> > > > > > > > not being used for some reason. This second cluster has
>> > >> >>>> > >> > > > > > > > lower per-node processing power, but should that make
>> > >> >>>> > >> > > > > > > > any difference? How can I ensure that the data is being
>> > >> >>>> > >> > > > > > > > mapped to all the nodes? Presently, the only node that
>> > >> >>>> > >> > > > > > > > seems to be doing all the work is the Master node.
>> > >> >>>> > >> > > > > > > >
>> > >> >>>> > >> > > > > > > > Do 15 nodes in a cluster increase the network cost? What
>> > >> >>>> > >> > > > > > > > can I do to set up the cluster to function more
>> > >> >>>> > >> > > > > > > > efficiently?
>> > >> >>>> > >> > > > > > > >
>> > >> >>>> > >> > > > > > > > Thanks!
>> > >> >>>> > >> > > > > > > > Mithila Nagendra
>> > >> >>>> > >> > > > > > > > Arizona State University
>> > >> >>>> > >> > > > > > > >
>> > >> >>>> > >> > > > > > >
>> > >> >>>> > >> > > > > >
>> > >> >>>> > >> > > > >
>> > >> >>>> > >> > > >
>> > >> >>>> > >> > >
>> > >> >>>> > >> >
>> > >> >>>> > >>
>> > >> >>>> > >
>> > >> >>>> > >
>> > >> >>>> >
>> > >> >>>>
>> > >> >>>
>> > >> >>>
>> > >> >>
>> > >> >
>> > >>
>> > >>
>> > >> Ravi
>> > >> --
>> > >>
>> > >>
>> > >
>> >
>>
>>
>>
>> --
>> Alpha Chapters of my book on Hadoop are available
>> http://www.apress.com/book/view/9781430219422
>>
>
>
