From Bhaskar Ghosh <bjgin...@yahoo.co.in>
Subject Re: How to create a composite value object for output from Map method
Date Wed, 22 Sep 2010 12:02:20 GMT
Chris / All,

Any idea why this is erroring out?

From: Bhaskar Ghosh <bjgindia@yahoo.co.in>
To: mapreduce-user@hadoop.apache.org
Sent: Tue, 21 September, 2010 12:08:21 AM
Subject: Re: How to create a composite value object for output from Map method

Hello All,

Thanks Chris for your suggestion and time.

I tried as you said. Now it throws a NullPointerException at runtime:

>at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:183)
>at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:66)
>at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
>at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
>at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
>at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
>at org.apache.hadoop.mapred.Child.main(Child.java:170)

I tested only the map method, which writes the MyCompositeValueWritable
object as the value. I have commented out the reducer class, but the error
still occurs inside MyCompositeValueWritable.readFields().

>//commented out
>//    job.setCombinerClass(IntSumReducer.class);
>//     job.setReducerClass(IntSumReducer.class);

My map method looks like this:

>public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
>      StringTokenizer itr = new StringTokenizer(value.toString());
>      while (itr.hasMoreTokens()) {
>        word.set(itr.nextToken());
>        MyCompositeValueWritable compositeValue = new MyCompositeValueWritable();
>        compositeValue.addToList(localname);
>        compositeValue.setValue(1);
>        context.write(word, compositeValue);
>      }
>}

My readFields and write methods' source is below:

>public void readFields(DataInput in) throws IOException {
>    ObjectWritable list = new ObjectWritable(String[].class, listOfString.toArray(new String[]{""}));
>    value = in.readInt();

>public void write(DataOutput out) throws IOException {
>    ObjectWritable list = new ObjectWritable(String[].class, listOfString.toArray(new String[]{""}));

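
For comparison, a value holding a list of strings and an integer can also be serialized field by field, without ObjectWritable. The following is only a minimal sketch under stated assumptions, not the code from the attachment: the class name CompositeValue and the roundTrip helper are hypothetical, and the class leaves out "implements org.apache.hadoop.io.Writable" so that it compiles with the JDK alone. The write and readFields signatures match those of the Writable interface.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for MyCompositeValueWritable. In a Hadoop job it would
// declare "implements org.apache.hadoop.io.Writable" (omitted here on purpose).
class CompositeValue {
    private final List<String> listOfString = new ArrayList<String>(); // never null
    private int value;

    void addToList(String s) { listOfString.add(s); }
    void setValue(int v) { value = v; }
    List<String> getList() { return listOfString; }
    int getValue() { return value; }

    // Same signature as Writable.write: element count, each element, then the int.
    public void write(DataOutput out) throws IOException {
        out.writeInt(listOfString.size());
        for (String s : listOfString) {
            out.writeUTF(s);
        }
        out.writeInt(value);
    }

    // Same signature as Writable.readFields: read fields in the order written.
    public void readFields(DataInput in) throws IOException {
        listOfString.clear(); // an instance may be reused for several records
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            listOfString.add(in.readUTF());
        }
        value = in.readInt();
    }

    // Illustration helper: serialize to bytes, then read into a fresh instance.
    static CompositeValue roundTrip(CompositeValue original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            CompositeValue copy = new CompositeValue();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        CompositeValue v = new CompositeValue();
        v.addToList("localhost");
        v.setValue(1);
        CompositeValue copy = roundTrip(v);
        System.out.println(copy.getList() + " " + copy.getValue());
    }
}
```

In a real job the class would implement Writable and keep a no-argument constructor, since the framework creates value instances reflectively before calling readFields. Note that writeUTF rejects strings whose encoded form exceeds 65535 bytes; Hadoop's Text.writeString and Text.readString are an alternative for longer strings.
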
Has anybody faced a similar issue? I would appreciate any help.

Bhaskar Ghosh
Hyderabad, India


"Ignorance is Bliss... Knowledge never brings Peace!!!"

From: "Christopher.Shain@sungard.com" <Christopher.Shain@sungard.com>
To: mapreduce-user@hadoop.apache.org
Sent: Sun, 19 September, 2010 9:12:20 PM
Subject: RE: How to create a composite value object for output from Map method

I think your first approach at serializing is correct, except for the use of 
ObjectWritable.  From the docs, ObjectWritable only handles Strings, Arrays, and 
primitives.  You are trying to use it to serialize your ArrayList.  Try 
converting the ArrayList to an array of Strings first.
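
The conversion suggested above can be sketched as follows. ListToArray and toStringArray are hypothetical names used only for illustration. The sketch also shows a side effect of seeding toArray with new String[]{""}: for an empty list the returned array has length 1 and holds a null, because toArray sets the slot after the last list element to null when the supplied array is larger than the list.

```java
import java.util.ArrayList;
import java.util.List;

class ListToArray {
    // Convert with a zero-length seed array so the result always has exactly
    // list.size() elements and never a padding null.
    static String[] toStringArray(List<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<String>();

        // Seed array is longer than the list: index 0 is overwritten with null.
        String[] seeded = empty.toArray(new String[]{""});
        System.out.println(seeded.length + " " + seeded[0]);

        // Zero-length seed: the result is an empty array.
        System.out.println(toStringArray(empty).length);
    }
}
```

A zero-length seed keeps the array the same size as the list, so no null element ends up in the data handed to a serializer.
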

As for the second problem, I’d have a look at Cascading.

Hope these help…

From: Bhaskar Ghosh [mailto:bjgindia@yahoo.co.in]
Sent: Sunday, September 19, 2010 6:22 AM
To: mapreduce-user@hadoop.apache.org
Subject: How to create a composite value object for output from Map method
Hi All,

What would be the right approach to solve these problems?

1. I need to output an object as the value from my map method. The object's
   class should have two members: an ArrayList<String> and an integer.

   I tried the following two approaches, but neither works:

   * I wrote a class MyCompositeValueWritable that implements Writable.
   * Inside the overridden readFields and write methods, I try to read/write
     using the ObjectWritable class.
   * [see attached file MyWordCount_ObjVal1_2.java]

   * The custom class is a plain class 'MyCompositeValue' that does not
     implement or inherit anything.
   * The Map and Reduce methods try to output the
     <key, value=<object of MyCompositeValue>> pair using the ObjectWritable
     class.
   * [see attached file Case2.txt]

   Am I going wrong somewhere? I would appreciate any help.

2. I have another problem, in which I need two types of mappers and reducers,
   and I want to execute them in this order:

   Mapper1 -> Reducer1 -> Mapper2 -> Reducer2

   * Is it possible through the ChainMapper and/or ChainReducer classes? If
     yes, then how? Can anybody provide a working starter example, or point
     me to a good URL for the same?
   * Currently, I am using a workaround: the first Mapper-Reducer pair writes
     to HDFS. Then the second Mapper-Reducer pair picks up that output from
     HDFS and writes the further processed output to another HDFS directory.
   * An example would be really, really helpful.

Bhaskar Ghosh

"Ignorance is Bliss... Knowledge never brings Peace!!!"

