hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@maprtech.com>
Subject Re: Quick question
Date Mon, 21 Feb 2011 06:22:05 GMT
This is the most important thing that you have said. The map function
is called once per unit of input, but the mapper object persists across
many units of input.
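
To make the distinction concrete, here is a toy model in plain Python. It
is not Hadoop code and uses none of Hadoop's API: the names ToyMapper and
run_map_phase are made up for illustration, and the assumption that one
mapper object is created per piece of input is a simplification.

```python
class ToyMapper:
    """Hypothetical stand-in for a mapper object (not Hadoop's Mapper class)."""

    def __init__(self):
        # State kept here survives across map() calls on the same object.
        self.calls = 0

    def map(self, key, value):
        # Called once per unit of input (here, one line of text).
        self.calls += 1
        return [(word, 1) for word in value.split()]


def run_map_phase(splits):
    """Create one mapper object per piece of input, call map() once per record."""
    mappers = []
    output = []
    for split in splits:
        mapper = ToyMapper()  # one object per piece of input
        for offset, line in enumerate(split):
            output.extend(mapper.map(offset, line))  # one call per record
        mappers.append(mapper)
    return mappers, output


# Two pieces of input holding three records in total.
splits = [["a b", "c"], ["d e f"]]
mappers, output = run_map_phase(splits)
```

Here there are two mapper objects but three calls to map, and each
object's call counter carries over from one record to the next.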

You have a little control over how many mapper objects there are, how
many machines they are created on, and how many pieces your input is
broken into. That control is limited, however, unless you build your
own input format. The standard input formats are optimized for very
large inputs and may not give you the flexibility that you want for
your experiments. That is unfortunate for the purpose of learning
about Hadoop, but Hadoop is designed mostly for dealing with very
large data, and being easy to understand is usually not a design goal.
Where easy coincides with powerful, easy is good, but powerful isn't
always easy.
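
As a rough sketch of why the control is limited, the following Python
mimics how, as far as I understand it, the stock file-based input formats
choose a split size: the block size clamped between a configured minimum
and maximum. The function names, the parameter names, and the 1.1 slack
factor are written from memory as assumptions, so check them against the
version you are running.

```python
MB = 1024 * 1024


def compute_split_size(block_size, min_size, max_size):
    # Assumed rule: clamp the block size between the configured bounds.
    return max(min_size, min(max_size, block_size))


def count_splits(file_len, split_size, slop=1.1):
    # Assumed behavior: cut full-size splits while more than `slop` splits'
    # worth of data remains, then put whatever is left into one last split.
    n = 0
    remaining = file_len
    while remaining / split_size > slop:
        n += 1
        remaining -= split_size
    if remaining != 0:
        n += 1
    return n


# A 1 GB file with 64 MB blocks and effectively no min/max limits.
default_size = compute_split_size(64 * MB, 1, 2**63 - 1)
big_file_splits = count_splits(1024 * MB, default_size)

# The same file with the maximum split size capped at 16 MB.
capped_size = compute_split_size(64 * MB, 1, 16 * MB)
capped_splits = count_splits(1024 * MB, capped_size)

# A small experimental file of 10 KB.
small_file_splits = count_splits(10 * 1024, default_size)
```

Under these assumptions the 1 GB file yields 16 pieces by default and 64
with the cap, while the 10 KB file yields a single piece and hence a
single mapper object, which is why small experiments tend not to show
much parallelism.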

On Sunday, February 20, 2011, maha <maha@umail.ucsb.edu> wrote:
> So first question: is there a difference between Mappers and maps?
