hadoop-mapreduce-user mailing list archives

From Srinivas Chamarthi <srinivas.chamar...@gmail.com>
Subject DistributedCache
Date Fri, 12 Dec 2014 02:07:53 GMT
Hi,

I want to cache temporary map/reduce output files so that I can compare
the results of the same task produced on two different nodes, as an
integrity check.

I am simulating this use case with speculative execution, by rescheduling
the first task as soon as it has started running.

Now I want to compare the output files from the speculative attempt and
the prior attempt so that I can calculate a credit score for each node.
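
A minimal sketch of the comparison step, assuming the output files of both
attempts can be read as local paths and that a byte-for-byte match is the
integrity criterion. The class and method names (AttemptOutputComparer,
sha256Hex, sameContent) are hypothetical and not part of Hadoop; how the
result would feed into a per-node credit score is left open.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not a Hadoop class: compares two attempt output
// files by their SHA-256 digests.
final class AttemptOutputComparer {

    // Streams the file through SHA-256 and returns the digest as lowercase hex.
    static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // True when both files have the same digest, i.e. the two attempts
    // produced identical bytes (assumed integrity criterion).
    static boolean sameContent(Path first, Path second) throws IOException {
        return sha256Hex(first).equals(sha256Hex(second));
    }
}
```

Comparing digests rather than whole files means only a short hex string per
attempt would need to be exchanged between nodes, which may matter if the
outputs are large.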

I want to use DistributedCache to cache the local file system files
during the CommitPending stage in TaskImpl. But DistributedCache is
deprecated. Is there any other way I can do this?

I think I could use HDFS to save the temporary output files so that other
nodes can see them. But is there an in-memory solution I could use?

Any pointers are greatly appreciated.

thx & rgds,
srinivas chamarthi
