hadoop-common-user mailing list archives

From "Jones, Nick" <nick.jo...@amd.com>
Subject RE: Using set/list data types for intermediate keys
Date Mon, 01 Feb 2010 03:24:48 GMT
Jørn,
I found it fairly quick and simple to implement WritableComparable in a
dedicated class for the intermediate dataset.  I needed two keys for every
value to make sure each reducer got the right data.  The class just held two
longs internally and implemented the methods WritableComparable requires
(write, readFields and compareTo).
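
A minimal sketch of what such a two-long key class might look like is
below.  It is not the code from the original job: the name LongPair and
its field names are hypothetical.  So that it compiles without Hadoop on
the classpath, it implements plain Comparable; in a real job the class
would instead be declared as implementing
org.apache.hadoop.io.WritableComparable<LongPair>, which requires the
same write, readFields and compareTo methods.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical composite key made of two longs. In a Hadoop job, declare it as
// "implements org.apache.hadoop.io.WritableComparable<LongPair>" instead of
// Comparable; the method signatures below stay the same.
final class LongPair implements Comparable<LongPair> {
    private long first;
    private long second;

    // Hadoop instantiates key classes reflectively, so keep a no-arg constructor.
    LongPair() {}

    LongPair(long first, long second) {
        this.first = first;
        this.second = second;
    }

    long getFirst() { return first; }
    long getSecond() { return second; }

    // Serialize both fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeLong(first);
        out.writeLong(second);
    }

    // Deserialize in exactly the order used by write().
    public void readFields(DataInput in) throws IOException {
        first = in.readLong();
        second = in.readLong();
    }

    // Sort by the first long, then by the second.
    @Override
    public int compareTo(LongPair other) {
        int c = Long.compare(first, other.first);
        return c != 0 ? c : Long.compare(second, other.second);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof LongPair)) return false;
        LongPair p = (LongPair) o;
        return first == p.first && second == p.second;
    }

    // Equal keys must hash alike: the default HashPartitioner picks the
    // reducer from hashCode().
    @Override
    public int hashCode() {
        return 31 * Long.hashCode(first) + Long.hashCode(second);
    }
}
```

Overriding hashCode and equals alongside compareTo matters here, since
the default partitioner routes a key to a reducer by its hash code, and
two keys that compare as equal should land on the same reducer.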

It might also be worthwhile looking into the cloud9 library for ideas or a
ready-made implementation: http://www.umiacs.umd.edu/~jimmylin/cloud9/docs/index.html

Nick Jones

-----Original Message-----
From: Jørn Schou-Rode [mailto:jsr@malamute.dk] 
Sent: Sunday, January 31, 2010 3:23 PM
To: common-user@hadoop.apache.org
Subject: Using set/list data types for intermediate keys

What are the options for using sets/lists as keys in the output from the
mapper?

My initial idea was to use ArrayWritable as the key type, but that is not
allowed, as the class does not implement WritableComparable. Do I need
to define a custom class, or is there some other set-like class in the
Hadoop libraries that can act as a key?

Thanks in advance.

/Jørn

