hadoop-common-user mailing list archives

From Kevin Peterson <kpeter...@biz360.com>
Subject Re: Code re-use?
Date Wed, 09 Sep 2009 19:57:15 GMT
On Tue, Sep 8, 2009 at 1:16 PM, Mark Kerzner <markkerzner@gmail.com> wrote:

> Hi,
> I have some code that's common between the main class, mapper, and reducer.
> Can I put it only in the main class and use it from mapper and reducer?
>
> A similar question about static variables in the main - are they available
> from mapper and reducer?
>
>
Code yes, data no.

Your mapper and reducer will have the full jar file that contains the job
(unless you are doing something very strange). You can include any code
you need to share, just as you would in any other Java app.
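
As a rough sketch of what sharing code looks like, here is a helper class that both sides call. It is plain Java with stand-ins for the real Mapper and Reducer classes so that it compiles without Hadoop on the classpath; SharedKeys, SketchMapper, and SketchReducer are made-up names for illustration, not anything from Hadoop.

```java
import java.util.Locale;

// Hypothetical shared helper. It ships in the same jar as the job, so the
// main class, the mapper, and the reducer can all call it.
final class SharedKeys {
    private SharedKeys() {}

    // Normalizes a raw token into the key form used throughout the job.
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }
}

// Stand-in for a real Mapper: uses the shared helper to build its output key.
final class SketchMapper {
    static String mapKey(String token) {
        return SharedKeys.normalize(token);
    }
}

// Stand-in for a real Reducer: uses the same helper when formatting output.
final class SketchReducer {
    static String describe(String key, int count) {
        return SharedKeys.normalize(key) + "\t" + count;
    }
}
```

The helper holds no state, only logic, which is why it behaves the same no matter which JVM ends up loading it.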

You can't pass data in static variables, though. The main class only runs
on the machine you submit the job from. When the mappers and reducers start
up, they run in separate JVMs, usually not even on the same physical node,
so anything main stored in a static is not visible to them. If you need to
distribute a large amount of data, you can use the DistributedCache. If you
just need to pass some settings, you could accomplish it by setting child
opts (options passed to the JVMs for the mappers and reducers) in the
config. If you need some sort of coordination more complicated than this,
you should look into ZooKeeper.
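
To make the child opts route concrete, here is a minimal sketch in plain Java. The idea is that the submitting side appends a -D flag to the child JVM options string (in Hadoop of this vintage that string lives under mapred.child.java.opts in the job config), and the task side reads it back as an ordinary system property. ChildOptsSketch and the property name myjob.threshold are hypothetical, and the sketch assumes the value contains no whitespace.

```java
// Hypothetical helper showing both ends of passing a setting via child opts.
final class ChildOptsSketch {
    private ChildOptsSketch() {}

    // Made-up property name used only for this illustration.
    static final String THRESHOLD_KEY = "myjob.threshold";

    // Submitting side: append "-Dkey=value" to the existing child JVM options.
    // The result is what you would store in the job config.
    static String withSetting(String existingOpts, String key, String value) {
        String flag = "-D" + key + "=" + value;
        if (existingOpts == null || existingOpts.isEmpty()) {
            return flag;
        }
        return existingOpts + " " + flag;
    }

    // Task side: the -D flag shows up as a system property in the child JVM.
    // Falls back to defaultValue when the property is absent or not a number.
    static int readThreshold(int defaultValue) {
        return Integer.getInteger(THRESHOLD_KEY, defaultValue);
    }
}
```

For example, withSetting("-Xmx200m", ChildOptsSketch.THRESHOLD_KEY, "5") yields "-Xmx200m -Dmyjob.threshold=5", and a mapper started with those options would get 5 back from readThreshold.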
