hadoop-common-user mailing list archives

From Shirley Cohen <shirl...@cis.upenn.edu>
Subject incremental re-execution
Date Thu, 17 Apr 2008 00:26:08 GMT
Dear Hadoop Users,

I'm writing to find out what you think about being able to
incrementally re-execute a MapReduce job. My understanding is that
the current framework doesn't support this, and I'd like to know
whether, in your opinion, having this capability would help speed
up development and debugging.

My specific questions are:

1) Do you have to re-run a job often enough that it would be valuable  
to incrementally re-run it?

2) Would it be helpful to save the output from a whole bunch of  
mappers and then try to detect whether this output can be re-used  
when a new job is launched?

3) Would it be helpful to be able to use the output from a map job on  
many reducers?
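
To make question 2 a little more concrete, here is a rough sketch of one way saved map output might be detected as re-usable. It is only an illustration under stated assumptions, not anything the Hadoop framework provides: the names `fingerprint`, `MapOutputCache`, `run_map` and `word_count_mapper` are all hypothetical, the cache lives in memory, and the mapper version string is assumed to be changed by hand whenever the map code changes.

```python
import hashlib


def fingerprint(mapper_version: str, split: bytes) -> str:
    """Hypothetical cache key: hash of the mapper version plus the split contents."""
    h = hashlib.sha256()
    h.update(mapper_version.encode("utf-8"))
    h.update(b"\0")  # separator so version and data cannot run together
    h.update(split)
    return h.hexdigest()


class MapOutputCache:
    """Toy in-memory store of map outputs, keyed by fingerprint (an assumption)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def run_map(self, mapper, mapper_version, split):
        key = fingerprint(mapper_version, split)
        if key in self._store:
            # Same mapper version and same input: re-use the saved output.
            self.hits += 1
            return self._store[key]
        # Otherwise run the mapper and save its output for later jobs.
        self.misses += 1
        output = list(mapper(split))
        self._store[key] = output
        return output


def word_count_mapper(split: bytes):
    """Example mapper: emit (word, 1) for every whitespace-separated word."""
    for word in split.decode("utf-8").split():
        yield (word, 1)


if __name__ == "__main__":
    cache = MapOutputCache()
    splits = [b"a b a", b"c d"]
    for s in splits:
        cache.run_map(word_count_mapper, "v1", s)
    # Second "job": only the second split has changed.
    splits[1] = b"c d e"
    for s in splits:
        cache.run_map(word_count_mapper, "v1", s)
    print("hits:", cache.hits, "misses:", cache.misses)
```

In this sketch only the changed split is re-mapped on the second run; whether something like this would pay off on real jobs is exactly what I am hoping to learn from your answers.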

Please let me know what your thoughts are and what specific  
applications you are working on.

Much appreciation,

