Subject: Re: What's the basic idea of pseudo-distributed Hadoop ?
From: Kai Voigt <k@123.org>
Date: Fri, 14 Sep 2012 08:08:28 +0200
To: user@hadoop.apache.org

Hello.

On 14.09.2012 at 08:03, Jason Yang <lin.yang.jason@gmail.com> wrote:

> I have a question about how the pseudo-distributed Hadoop cluster works:
>
> When many map tasks are submitted to the pseudo-distributed Hadoop
> cluster, does Hadoop run each mapper in sequence, or does it run
> the mappers in different threads or otherwise in parallel?

Pseudo-distributed mode is a one-node cluster. You have a namenode, a
jobtracker, a single datanode, and a single tasktracker running, each as
its own Java process on the same machine. You can verify this with the
"jps" command.
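
If you want to script that check, here is a minimal Python sketch. It assumes jps prints one "<pid> <main class>" pair per line and that the daemons show up under their Hadoop 1.x class names; the function name and the sample text are made up for illustration and are not output from a real cluster.

```python
# Daemons named in the message above, as jps would list their main classes
# (Hadoop 1.x class names; this set is an assumption for the sketch).
EXPECTED = {"NameNode", "DataNode", "JobTracker", "TaskTracker"}

def missing_daemons(jps_output):
    """Return the expected daemons that do not appear in the jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        if len(parts) >= 2:  # jps prints "<pid> <main class>"
            running.add(parts[1])
    return EXPECTED - running

# Made-up sample for illustration only; the PIDs are arbitrary.
SAMPLE = """4821 NameNode
4933 DataNode
5110 JobTracker
5236 TaskTracker
5301 Jps"""

print(sorted(missing_daemons(SAMPLE)))
```

On a real machine you would feed it the text captured from running jps instead of SAMPLE; an empty result means all four daemons are up.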

By default, a tasktracker can run up to two map tasks and up to two
reduce tasks in parallel (mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum), so you will actually see some
concurrency on your one machine.
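
To get a feel for what those two settings mean, here is a rough Python sketch. It reads the slot counts from a mapred-site.xml style snippet and estimates how many rounds a batch of map tasks would need. The XML text, the helper names, and the idea that all tasks take equally long are assumptions for illustration; the real scheduler hands out a slot as soon as one frees up.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical mapred-site.xml content. The property names come from the
# message above; the values are the defaults it mentions.
MAPRED_SITE = """<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>"""

def read_slots(xml_text, name, default=2):
    """Return the integer value of property `name`, or `default` if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return int(prop.findtext("value"))
    return default

def waves(num_tasks, slots):
    """Rounds needed if at most `slots` equally long tasks run at once."""
    return math.ceil(num_tasks / slots)

map_slots = read_slots(MAPRED_SITE, "mapred.tasktracker.map.tasks.maximum")
print(waves(10, map_slots))  # 10 map tasks on 2 slots -> 5
```

So with the default of two map slots, ten mappers run roughly two at a time rather than strictly one after another.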

Kai

-- 
Kai Voigt
k@123.org