hadoop-common-user mailing list archives

From "y l" <unopten...@gmx.com>
Subject MultiFilterRecordReader
Date Wed, 18 Aug 2010 17:10:19 GMT

This is my first email to the list, and I'm fairly new to Hadoop, so I'm hoping to find some help
with a new task I have at work.
I need to join two sets of files: one is a set of CSV files and the other is a set
of sequence files.

I was told that MultiFilterRecordReader could help me do the join, but I haven't been able
to find a good example of where and how to use that class.
I did find a good example using CompositeInputFormat here: http://www.congiu.com/node/5
But it assumes that the input is sorted, and I can't guarantee that, at least not for the CSV
files.
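
To illustrate why the sorted-input assumption matters, here is a minimal plain-Java sketch of a
merge-style join. It does not use any Hadoop API. The class name MergeJoinSketch, the method
mergeJoin, and the sample records are all hypothetical, and it assumes each key appears at most
once per input. It only shows the general idea that a join which walks both inputs in key order
can silently drop matches when one input is unsorted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeJoinSketch {

    // Each record is a two-element array: {key, value}.
    // Assumption: both lists are sorted by key and keys are unique per list.
    public static List<String> mergeJoin(List<String[]> left, List<String[]> right) {
        List<String> joined = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int cmp = left.get(i)[0].compareTo(right.get(j)[0]);
            if (cmp == 0) {
                // Keys match: emit key, left value, right value.
                joined.add(left.get(i)[0] + "," + left.get(i)[1] + "," + right.get(j)[1]);
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left key is smaller, advance the left side
            } else {
                j++; // right key is smaller, advance the right side
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the two inputs.
        List<String[]> seqRecords = Arrays.asList(
                new String[]{"a", "1"}, new String[]{"b", "2"}, new String[]{"c", "3"});
        List<String[]> sortedCsv = Arrays.asList(
                new String[]{"a", "x"}, new String[]{"c", "z"});
        List<String[]> unsortedCsv = Arrays.asList(
                new String[]{"c", "z"}, new String[]{"a", "x"});

        System.out.println(mergeJoin(sortedCsv, seqRecords));   // [a,x,1, c,z,3]
        System.out.println(mergeJoin(unsortedCsv, seqRecords)); // [c,z,3] (the "a" match is lost)
    }
}
```

With the unsorted CSV records, the walk skips past "a" on the other side before it ever sees the
CSV record for "a", so that match is never emitted.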

Does anyone know what I need to do with MultiFilterRecordReader? Should I extend it in the mapper?
I'm a little confused... Please let me know if you have any pointers.

