hadoop-hive-dev mailing list archives

From Joydeep Sen Sarma <jssa...@facebook.com>
Subject RE: how jdbm is used in map join
Date Thu, 12 Aug 2010 07:11:49 GMT
I believe each mapper makes its own copy, since each one reads in the data to be loaded into
the dbm itself.

This needs to be optimized at some point (ideally we should be putting the dbm in the
distributed cache).
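
To illustrate the per-mapper copy, here is a rough sketch in plain Java. It is not Hive's code: the class and method names (MapJoinSketch, loadSmallTable, runMapper) are made up for illustration, rows are simple "key,value" strings, and an in-memory HashMap stands in for the JDBM file. The point is only that every mapper rebuilds the small table on its own before probing it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Hive's implementation.
public class MapJoinSketch {

    // Build a private lookup table from the small (replicated) table.
    // A HashMap stands in for the JDBM file here (assumption).
    static Map<String, String> loadSmallTable(List<String> smallRows) {
        Map<String, String> table = new HashMap<>();
        for (String row : smallRows) {
            String[] kv = row.split(",", 2);
            table.put(kv[0], kv[1]);
        }
        return table;
    }

    // One simulated map task: it loads its own copy of the small table,
    // then joins each row of its split of the big table against that copy.
    static List<String> runMapper(List<String> smallRows, List<String> bigSplit) {
        Map<String, String> localCopy = loadSmallTable(smallRows); // per-mapper copy
        List<String> joined = new ArrayList<>();
        for (String row : bigSplit) {
            String[] kv = row.split(",", 2);
            String match = localCopy.get(kv[0]);
            if (match != null) {
                joined.add(kv[0] + "," + kv[1] + "," + match);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        List<String> small = Arrays.asList("1,apple", "2,banana");
        // Two mappers, possibly on the same node, each reload the small table.
        System.out.println(runMapper(small, Arrays.asList("1,x", "3,y")));
        System.out.println(runMapper(small, Arrays.asList("2,z")));
    }
}
```

With the distributed-cache approach mentioned above, the table would presumably be built once and shipped to each node, so loadSmallTable would not be repeated per task.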
________________________________________
From: Gang Luo [lgpublic@yahoo.com.cn]
Sent: Tuesday, August 10, 2010 3:04 PM
To: hive-dev@hadoop.apache.org
Subject: how jdbm is used in map join

Hi all,
Hive uses JDBM for the replicated table in map join. When multiple map tasks are
running on the same node, will multiple copies of the JDBM file be generated,
or will all the map tasks share the same copy? If it is the latter, which mapper
generates the file, and how do the other mappers synchronize with it?

Thanks,
-Gang




