hadoop-common-user mailing list archives

From Tarandeep Singh <tarand...@gmail.com>
Subject Re: Sharing object between mappers on same node (reuse.jvm ?)
Date Thu, 04 Jun 2009 18:16:24 GMT
Thanks, Kevin, for the clarification. I ran a couple of tests as well, and the
system behaved exactly as you said.

So now the question is, how can I achieve what I want to do: share an
object (a Lucene IndexWriter instance) between mappers running on the same
node. I thought of running the IndexWriter separately, outside of Hadoop, and
using RMI, sockets, etc. to communicate with it, but I am hoping there is a
simpler way than this. Any thoughts?

Also, what if I modify the default behaviour of Hadoop to run all mappers on a
node in one JVM? (Not sure if that is possible in the first place,
just a thought.)
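
To make the behaviour Kevin described concrete, here is a minimal sketch in
plain Java, with no Hadoop or Lucene dependency. FakeIndexWriter, getWriter
and runTask are hypothetical names made up for illustration; FakeIndexWriter
only stands in for a real IndexWriter. The point is that a static field is
shared by tasks that run one after another in the same JVM (which is what
mapred.job.reuse.jvm.num.tasks = -1 allows), while mappers running
concurrently on the node each have their own JVM and therefore their own copy
of the static.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a lazily created object that is shared within one JVM. */
class SharedWriterSketch {

    /** Hypothetical stand-in for Lucene's IndexWriter; it just collects documents. */
    static final class FakeIndexWriter {
        private final List<String> docs = new ArrayList<>();

        synchronized void addDocument(String doc) {
            docs.add(doc);
        }

        synchronized int numDocs() {
            return docs.size();
        }
    }

    // One instance per JVM (strictly, per class loader), NOT one per node.
    private static FakeIndexWriter writer;

    /** Returns the JVM-wide writer, creating it on first use. */
    static synchronized FakeIndexWriter getWriter() {
        if (writer == null) {
            writer = new FakeIndexWriter();
        }
        return writer;
    }

    /** Hypothetical stand-in for the work done by one map task. */
    static void runTask(String taskId, int records) {
        FakeIndexWriter w = getWriter();
        for (int i = 0; i < records; i++) {
            w.addDocument(taskId + "-" + i);
        }
    }

    public static void main(String[] args) {
        // Two tasks run sequentially in this JVM, as they would with JVM reuse on.
        runTask("task_0", 3);
        runTask("task_1", 2);
        // Both tasks wrote to the same instance.
        System.out.println(getWriter().numDocs()); // prints 5
    }
}
```

A second JVM started for another task slot on the same node would run
getWriter() again and build its own instance, so this pattern alone does not
give a single writer per node.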


On Thu, Jun 4, 2009 at 12:49 AM, Kevin Peterson <kpeterson@biz360.com> wrote:

> On Wed, Jun 3, 2009 at 10:59 AM, Tarandeep Singh <tarandeep@gmail.com> wrote:
> > I want to share an object (Lucene IndexWriter instance) between mappers
> > running on the same node of 1 job (not across multiple jobs). Please
> > correct me if I am wrong -
> >
> > If I set -1 for the property mapred.job.reuse.jvm.num.tasks, then all
> > mappers of one job will be executed in the same JVM, and in that case, if I
> > create a static Lucene IndexWriter instance in my mapper class, all mappers
> > running on the same node will be able to use it.
> >
> Not quite. JVM reuse controls whether the JVM will be terminated after a
> single mapper run and a new one created for the next. It doesn't influence
> how many JVMs are created -- you will still get one JVM per mapper or
> reducer.
> I think there is, or was, or maybe a patch enables, what you are asking for,
