hadoop-common-user mailing list archives

From Gang Luo <lgpub...@yahoo.com.cn>
Subject Re: access to part file in reducer
Date Sat, 17 Jul 2010 19:44:34 GMT
I think that when close() is called, part of the data is still in the output
buffer, so the file is incomplete. It doesn't make sense to access the file at
that point unless you can wait until everything is done.

If you want to manipulate the data, why not defer the output? Don't emit
anything in map(); just buffer the results, do whatever you want with them in
close(), and then output them all. Of course, you have to ensure you have
enough memory.
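
To make the idea concrete, here is a rough standalone sketch of the
buffer-then-emit pattern. It does not depend on Hadoop: the Collector
interface and the BufferingReducer class are hypothetical stand-ins that only
mimic the shape of reduce()/close(), and the sort is just an example of a
manipulation. It assumes close() is not handed a collector, so the reference
has to be saved earlier.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class DeferredOutputSketch {

    /** Hypothetical stand-in for an output collector; not the Hadoop interface. */
    interface Collector {
        void collect(String key, int value);
    }

    /** Buffers one result per key in reduce() and emits everything in close(). */
    static class BufferingReducer {
        private final List<Map.Entry<String, Integer>> buffer = new ArrayList<>();
        private Collector out; // saved so that close() can still emit

        void reduce(String key, Iterator<Integer> values, Collector collector) {
            out = collector;
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            // Buffer the result instead of emitting it right away.
            buffer.add(new AbstractMap.SimpleEntry<>(key, sum));
        }

        void close() {
            if (out == null) {
                return; // reduce() was never called, so there is nothing to emit
            }
            // Example manipulation over the complete result set:
            // order the records by descending sum.
            buffer.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
            for (Map.Entry<String, Integer> e : buffer) {
                out.collect(e.getKey(), e.getValue());
            }
            buffer.clear();
        }
    }

    /** Drives the sketch with two keys and returns what was emitted. */
    static List<String> demo() {
        List<String> emitted = new ArrayList<>();
        Collector collector = (k, v) -> emitted.add(k + "\t" + v);
        BufferingReducer reducer = new BufferingReducer();
        reducer.reduce("a", Arrays.asList(1, 1).iterator(), collector);
        reducer.reduce("b", Arrays.asList(2, 3).iterator(), collector);
        reducer.close(); // all output happens here
        return emitted;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Everything stays in the buffer until close(), so this only works when one
task's results fit in memory.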


----- Original Message ----
From: abc xyz <fabc_xyz111@yahoo.com>
To: common-user@hadoop.apache.org
Sent: 2010/7/17 (Saturday) 12:19:47 PM
Subject: access to part file in reducer


Is it possible to access the part file generated by a reducer from the
reducer's close function, manipulate it, and then write it back?
If yes, how can I determine the name of the part file, and how can I access it?


