hadoop-common-user mailing list archives

From Alan Ho <karlu...@yahoo.ca>
Subject Re: Sharing an object across mappers
Date Fri, 03 Oct 2008 04:49:27 GMT
It really depends on what type of data you are sharing, how you look it up, whether the
data is read-write, and whether you care about consistency. If you don't care about consistency,
I suggest putting the data into a BDB store (for key-value lookup) or a Lucene index
and copying it to all the nodes. That way all data access is in-process, with no GC problems,
and lookups will be very fast. BDB and Lucene both have easy replication strategies.
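
As a rough illustration of the "copy the data to every node and read it in-process" idea, here is a
minimal Java sketch. It is not BDB or Lucene code: a plain in-memory map loaded from a node-local,
tab-separated file stands in for the store, so it lives on the Java heap and does not give you the
GC advantage of an embedded store. The class name LocalLookup, the file format, and the method
names are all assumptions made up for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a read-only lookup table loaded from a node-local file.
// Stand-in for an embedded BDB or Lucene store; not part of any Hadoop API.
final class LocalLookup {
    // Shared by every task that runs inside this JVM; loaded at most once.
    private static volatile Map<String, String> table;

    // Parse "key<TAB>value" lines into an unmodifiable map.
    static Map<String, String> load(Path file) throws IOException {
        Map<String, String> m = new HashMap<>();
        try (BufferedReader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = r.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab < 0) {
                    continue; // skip lines without a tab separator
                }
                m.put(line.substring(0, tab), line.substring(tab + 1));
            }
        }
        return Collections.unmodifiableMap(m);
    }

    // Lazily load the table the first time it is needed in this JVM.
    // Later calls reuse the cached table and ignore the file argument.
    static Map<String, String> get(Path file) throws IOException {
        Map<String, String> t = table;
        if (t == null) {
            synchronized (LocalLookup.class) {
                if (table == null) {
                    table = load(file);
                }
                t = table;
            }
        }
        return t;
    }

    public static void main(String[] args) throws IOException {
        // Write a tiny sample file in place of the copy shipped to each node.
        Path file = Files.createTempFile("lookup", ".tsv");
        Files.write(file, Arrays.asList("apple\tfruit", "carrot\tvegetable"),
                StandardCharsets.UTF_8);
        System.out.println(get(file).get("apple"));
    }
}
```

In a real job, the call to get() would sit in the mapper's setup code and the file would be the
local copy on each node. Since the data is read-only and consistency does not matter here, each
JVM can hold its own copy without any coordination.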

If the data is read-write and you need consistency, you should probably forget about MapReduce and
just run everything on big iron.

Alan Ho

----- Original Message ----
From: Devajyoti Sarkar <dsarkar@q-kk.com>
To: core-user@hadoop.apache.org
Sent: Thursday, October 2, 2008 8:41:04 PM
Subject: Sharing an object across mappers

I think each mapper/reducer runs in its own JVM which makes it impossible to
share objects. I need to share a large object so that I can access it at
memory speeds across all the mappers. Is it possible to have all the mappers
run in the same VM? Or is there a way to do this across VMs at high speed? I
guess RMI and other such methods will be just too slow.

