hadoop-common-user mailing list archives

From Sean Shanny <ssha...@tripadvisor.com>
Subject Does anyone have a working example for using MapFiles on the DistributedCache?
Date Fri, 26 Dec 2008 22:20:30 GMT
To all,

Version:  hadoop-

I have created a MapFile.

What I don't seem to be able to do is correctly place the MapFile in
the DistributedCache and then make use of it in a map method.

I need the following info please:

1. How and where to place the MapFile directory so that it is visible
   to the Hadoop job.
2. How to add the files to the DistributedCache.
3. How to create a MapFile.Reader from files in the DistributedCache.

I can get this to work with a local file on a single-node system,
outside of the DistributedCache, but for the life of me I cannot get it
to work within the DistributedCache.

We are trying to load key/value mappings for a Data Warehouse ETL
process.  The mapper will take an input record, look up the keys based
on the record's values, and emit the resulting key-only record.
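
To make the intended lookup concrete, here is a minimal sketch in plain
Java. It is not Hadoop code: an in-memory TreeMap stands in for the
MapFile.Reader, and the class name, the tab-separated record layout, the
sample values, and the -1 sentinel for unmapped values are all assumptions
made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeyLookupSketch {

    // Hypothetical sentinel emitted when a value has no mapping.
    static final long UNKNOWN_KEY = -1L;

    // Replace every tab-separated field value with its key and
    // return the key-only record, again tab-separated.
    static String toKeyOnlyRecord(Map<String, Long> lookup, String inputRecord) {
        String[] fields = inputRecord.split("\t", -1);
        List<String> keys = new ArrayList<>();
        for (String field : fields) {
            Long key = lookup.get(field);
            keys.add(String.valueOf(key != null ? key : UNKNOWN_KEY));
        }
        return String.join("\t", keys);
    }

    public static void main(String[] args) {
        // Stand-in for the value-to-key mappings stored in the MapFile.
        Map<String, Long> lookup = new TreeMap<>();
        lookup.put("red", 101L);
        lookup.put("large", 7L);

        // The last field has no mapping and gets the sentinel key.
        System.out.println(toKeyOnlyRecord(lookup, "red\tlarge\tunmapped"));
    }
}
```

In the real job the map method would do the same per-record work, with
the lookup served by a MapFile.Reader opened once per task rather than
by an in-memory map.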

Happy to answer any questions to help me make this work.


