hadoop-common-user mailing list archives

From Amogh Vasekar <am...@yahoo-inc.com>
Subject RE: about hadoop jvm allocation in job excution
Date Wed, 16 Sep 2009 05:53:14 GMT
Funnily enough, I was looking at this just yesterday.
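
For the question quoted below, one existing mechanism worth checking is task JVM reuse, available in Hadoop 0.19 and later. Whether this is what the reply above was referring to is an assumption. A minimal sketch of the job setting, assuming a 0.19/0.20-era cluster:

```xml
<!-- Sketch only: set per job, or in the cluster's site configuration file for a default. -->
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- 1 (the default) starts a fresh JVM for every task;
       -1 lets one JVM run any number of tasks of the same job. -->
  <value>-1</value>
</property>
```

The same setting can be made in code with JobConf.setNumTasksToExecutePerJvm(-1). Reuse applies only to tasks of the same job, which run one after another in the JVM, so it cuts task startup cost but does not keep classes resident across jobs.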


-----Original Message-----
From: Zhimin [mailto:wangzm@cs.umb.edu] 
Sent: Tuesday, September 15, 2009 10:53 PM
To: core-user@hadoop.apache.org
Subject: about hadoop jvm allocation in job excution

We have a project that needs to support similarity queries against items
from a very large data set. One approach we have tried is to use HBase as
the data repository and Hadoop as the query execution engine. We adopted
Hadoop because MapReduce is a very good model of our underlying task and
the programming was straightforward. However, we found that Hadoop always
allocates a new JVM for each individual task on a node. This is
inefficient for us because in our case the whole Hadoop platform is
dedicated to a few relatively stable, parameterized queries, and security and
strict isolation of different tasks are not our main concern. To save the
task setup time, I wonder whether there is an existing mechanism in Hadoop, or
an extension of Hadoop in another open source project, that would let us
keep our classes resident in a JVM on the job node, with task nodes waiting for

View this message in context: http://www.nabble.com/about-hadoop-jvm-allocation-in-job-excution-tp25458201p25458201.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
