hadoop-common-user mailing list archives

From Mark Kerzner <markkerz...@gmail.com>
Subject Re: Code re-use?
Date Wed, 09 Sep 2009 20:15:18 GMT
Thank you, Kevin, for the detailed explanation. I went ahead and shared both.
Because I test on my own machine, it worked :) but that was obviously a fluke,
and I need to change my code so that it runs on the cluster.
Sincerely,
Mark

On Wed, Sep 9, 2009 at 2:57 PM, Kevin Peterson <kpeterson@biz360.com> wrote:

> On Tue, Sep 8, 2009 at 1:16 PM, Mark Kerzner <markkerzner@gmail.com> wrote:
>
> > Hi,
> > I have some code that's common between the main class, mapper, and
> > reducer. Can I put it only in the main class and use it from mapper and
> > reducer?
> >
> > A similar question about static variables in the main class: are they
> > available from mapper and reducer?
> >
>
> Code yes, data no.
>
> Your mapper and reducer will have the full jar file that contains the job
> (unless you are doing something very strange). You can include any code
> you need to share, just as you would in any other Java app.
>
> You can't pass data in static variables, though. The main class only runs
> on the machine you submit the job from. When the mappers and reducers start
> up, they start in separate JVMs, possibly not even on the same physical
> node. If you need to distribute a large amount of data, you can use the
> distributed cache. If you just need to pass some settings, you could
> accomplish it by setting child opts (options passed to the JVMs for the
> mappers and reducers) in the config. If you need some sort of coordination
> more complicated than this, you should look into ZooKeeper.
>
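
To make the "code yes, data no" point concrete, here is a minimal sketch in
plain Java, with no Hadoop classes involved. It assumes the child opts route
Kevin describes: in Hadoop releases of that period the task JVM options were
set through the mapred.child.java.opts property, so a value such as
-Dmyjob.threshold=25 would reach every mapper and reducer JVM as a system
property. The class name JobSettings, the key myjob.threshold, and the default
of 10 are all hypothetical, chosen only for illustration.

```java
import java.util.Properties;

// Hypothetical helper packaged in the job jar, so the driver, the mapper,
// and the reducer can all call it. The code is shared; the value is not
// held in a static field set by main, because main runs in another JVM.
final class JobSettings {
    // Hypothetical property name used only in this sketch.
    static final String THRESHOLD_KEY = "myjob.threshold";
    static final int DEFAULT_THRESHOLD = 10;

    private JobSettings() {}

    // Reads the setting from the given properties, falling back to the
    // default when the key is missing or is not a valid integer.
    static int threshold(Properties props) {
        String raw = props.getProperty(THRESHOLD_KEY);
        if (raw == null) {
            return DEFAULT_THRESHOLD;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_THRESHOLD;
        }
    }

    // Reads from this JVM's own system properties, i.e. whatever -D flags
    // this particular JVM was started with.
    static int threshold() {
        return threshold(System.getProperties());
    }

    public static void main(String[] args) {
        System.out.println("threshold = " + threshold());
    }
}
```

Running it locally with java -Dmyjob.threshold=25 JobSettings picks up the
flag; a task JVM started without that flag would see the default instead,
which is why a value assigned to a static variable in main never shows up in
the mappers and reducers on a cluster.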
