hadoop-user mailing list archives

From Kai Voigt...@123.org>
Subject Re: What's the basic idea of pseudo-distributed Hadoop ?
Date Fri, 14 Sep 2012 06:08:28 GMT

Am 14.09.2012 um 08:03 schrieb Jason Yang <lin.yang.jason@gmail.com>:

> I have a question about how the pseudo-distributed Hadoop cluster works:
> As many map tasks are submitted to the pseudo-distributed Hadoop cluster, does Hadoop
> run each mapper in sequence? Or does it run these mappers in different threads or
> something that could be parallel?

Pseudo-distributed mode is a one-node cluster. You have a namenode, a jobtracker, a single
datanode, and a single tasktracker, all running as separate daemons on the same machine. You
can verify this with the "jps" command.
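
As a rough sketch of that check: jps prints one "<pid> <main class>" line per running JVM, so
a small script can look for the four daemons named above. The function names here are
hypothetical helpers, not part of Hadoop, and the start scripts usually launch a
SecondaryNameNode as well, which this sketch does not require.

```python
import subprocess

# Daemons named in the message; a SecondaryNameNode is usually present too (not required here).
EXPECTED = {"NameNode", "DataNode", "JobTracker", "TaskTracker"}


def missing_daemons(jps_output):
    """Return the expected daemons that do not appear in the given jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # each useful line is "<pid> <main class>"
            running.add(parts[1])
    return EXPECTED - running


def check_local_cluster():
    """Run jps on this machine (assumes jps is on the PATH) and report missing daemons."""
    out = subprocess.run(["jps"], capture_output=True, text=True, check=True).stdout
    return missing_daemons(out)
```

Calling check_local_cluster() on the pseudo-distributed machine should return an empty set
when all four daemons are up.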

By default, a tasktracker can run up to two map tasks and two reduce tasks in parallel
(mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum both
default to 2), so you will actually see some concurrency on your one machine.
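
To illustrate the slot idea, here is a toy simulation, not Hadoop code: a pool with two
workers stands in for the two map slots, and six dummy tasks are pushed through it. The
names (MAP_SLOTS, run_map_tasks) are made up for this sketch, and threads are used only for
convenience; a real tasktracker launches each task in its own child JVM.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAP_SLOTS = 2  # mirrors the default of mapred.tasktracker.map.tasks.maximum


def run_map_tasks(num_tasks, slots=MAP_SLOTS, task_seconds=0.05):
    """Run num_tasks dummy tasks on `slots` workers; return (results, peak concurrency)."""
    lock = threading.Lock()
    state = {"running": 0, "peak": 0}

    def map_task(task_id):
        with lock:
            state["running"] += 1
            state["peak"] = max(state["peak"], state["running"])
        time.sleep(task_seconds)  # stand-in for real map work
        with lock:
            state["running"] -= 1
        return task_id

    # The pool size caps how many tasks run at once, like the slot limit does.
    with ThreadPoolExecutor(max_workers=slots) as pool:
        results = list(pool.map(map_task, range(num_tasks)))
    return results, state["peak"]


results, peak = run_map_tasks(6)
print(f"{len(results)} tasks finished, at most {peak} ran at once")
```

The remaining tasks wait until a slot frees up, which is roughly what happens when a job has
more map tasks than the tasktracker has map slots.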


Kai Voigt
