hadoop-mapreduce-user mailing list archives

From Ulul <had...@ulul.org>
Subject Re: cleanup() in hadoop results in aggregation of whole file/not
Date Sun, 01 Mar 2015 12:41:41 GMT
Hi

I may have misunderstood your question, but this looks to me like a
typical job for a reducer. Emit a "local" min and max under two distinct
keys from each mapper, and the reducer can then easily compute the global
min and max.
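
To make the data flow concrete, here is a minimal sketch in plain Java
that only simulates the pattern; it deliberately uses no Hadoop classes.
The class name MinMaxLabelSketch, the key names "min" and "max", and the
whitespace delimiter are assumptions made for illustration. In a real job
the local pair would presumably be written from the mapper's cleanup()
with context.write(), and the framework would do the grouping by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MinMaxLabelSketch {
    // Hypothetical key names; any two distinct keys would do.
    static final String MIN_KEY = "min";
    static final String MAX_KEY = "max";

    /** Simulates one map task: scan one split, then emit its local min and max. */
    static void mapSplit(List<String> lines, String delimiterRegex,
                         Map<String, List<Long>> shuffle) {
        if (lines.isEmpty()) {
            return; // an empty split has nothing to emit
        }
        long localMin = Long.MAX_VALUE;
        long localMax = Long.MIN_VALUE;
        for (String line : lines) {
            String[] fields = line.trim().split(delimiterRegex);
            // The last column is the class label.
            long label = Long.parseLong(fields[fields.length - 1]);
            localMin = Math.min(localMin, label);
            localMax = Math.max(localMax, label);
        }
        // Equivalent of writing two records at the end of the task.
        shuffle.computeIfAbsent(MIN_KEY, k -> new ArrayList<>()).add(localMin);
        shuffle.computeIfAbsent(MAX_KEY, k -> new ArrayList<>()).add(localMax);
    }

    /** Simulates the reducer: fold all local values that share one key. */
    static long reduce(String key, List<Long> values) {
        long result = values.get(0);
        for (long v : values) {
            result = key.equals(MIN_KEY) ? Math.min(result, v) : Math.max(result, v);
        }
        return result;
    }

    /** Runs every "map task", groups the output by key, then reduces each key. */
    static Map<String, Long> run(List<List<String>> splits, String delimiterRegex) {
        Map<String, List<Long>> shuffle = new TreeMap<>();
        for (List<String> split : splits) {
            mapSplit(split, delimiterRegex, shuffle);
        }
        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<String, List<Long>> e : shuffle.entrySet()) {
            out.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // The four sample rows from the question, divided into two splits.
        List<List<String>> splits = Arrays.asList(
            Arrays.asList(
                "7.4 0.29 0.5 1.8 0.042 35 127 0.9937 3.45 0.5 10.2 7 1",
                "10 0.41 0.45 6.2 0.071 6 14 0.99702 3.21 0.49 11.8 7 -1"),
            Arrays.asList(
                "7.8 0.26 0.27 1.9 0.051 52 195 0.9928 3.23 0.5 10.9 6 1",
                "6.9 0.32 0.3 1.8 0.036 28 117 0.99269 3.24 0.48 11 6 1"));
        Map<String, Long> result = run(splits, "\\s+");
        System.out.println("min=" + result.get(MIN_KEY) + " max=" + result.get(MAX_KEY));
        // prints "min=-1 max=1"
    }
}
```

Each simulated map task only ever sees its own split, so its min and max
are local; the global values appear only once the values for each key have
been brought together on the reduce side.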

Ulul
On 28/02/2015 14:10, Shahab Yunus wrote:
> As far as I understand, cleanup is called once per task, i.e. once
> per map task in your case. To get an overall count or measure, you
> need to aggregate it yourself after the job is done.
>
> One way to do that is to use counters and then merge them 
> programmatically at the end of the job.
>
> Regards,
> Shahab
>
> On Saturday, February 28, 2015, unmesha sreeveni
> <unmeshabiju@gmail.com> wrote:
>
>
>     I have an input file whose last column is the class label:
>     7.4 0.29 0.5 1.8 0.042 35 127 0.9937 3.45 0.5 10.2 7 1
>     10 0.41 0.45 6.2 0.071 6 14 0.99702 3.21 0.49 11.8 7 -1
>     7.8 0.26 0.27 1.9 0.051 52 195 0.9928 3.23 0.5 10.9 6 1
>     6.9 0.32 0.3 1.8 0.036 28 117 0.99269 3.24 0.48 11 6 1
>     ...................
>     I am trying to get the unique class labels of the whole file.
>     To do that, I am using the code below.
>
>     public class MyMapper extends Mapper<LongWritable, Text, IntWritable, FourvalueWritable> {
>         Set<String> uniqueLabel = new HashSet();
>
>         public void map(LongWritable key, Text value, Context context) {
>             // Last column of input is the class label.
>             Vector<String> cls = CustomParam.customLabel(line, delimiter, classindex);
>             uniqueLabel.add(cls.get(0));
>         }
>
>         public void cleanup(Context context) throws IOException {
>             // find min and max label
>             context.getCounter(UpdateCost.MINLABEL).setValue(Long.valueOf(minLabel));
>             context.getCounter(UpdateCost.MAXLABEL).setValue(Long.valueOf(maxLabel));
>         }
>     }
>     Cleanup is executed only once.
>
>     Is the set declared by "Set uniqueLabel = new HashSet();" updated
>     on each map call? I hope it is.
>     I also hope I can then get the unique labels of the whole file in
>     cleanup. Please correct me if I am wrong.
>
>     Thanks in advance.
>
>

