hadoop-common-user mailing list archives

From Adeel Qureshi <adeelmahm...@gmail.com>
Subject Writing Reducer output to database
Date Thu, 03 Feb 2011 22:45:41 GMT
I recently started a thread to ask questions about custom Writable
implementations, which is basically similar to this, but that was more about
understanding the concept. Here I want to ask about my actual problem and get
help with that.

I want to:
- read text data line by line in my mapper,
- create an instance of a custom Writable class that holds some information
parsed out of the line,
- pass that custom Writable, along with its count, to the reducer, and
- have the reducer simply insert every single entry into a database.
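The middle steps are roughly what I have in mind with this minimal sketch. In a real job the class would declare `implements org.apache.hadoop.io.Writable`; the two methods below are exactly the pair that interface requires, and the field names and tab-separated line format are made up for illustration:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Sketch of a record parsed out of one input line. In a real Hadoop job
// this would declare "implements org.apache.hadoop.io.Writable".
class LogRecord {
    private String user;  // hypothetical field parsed from the line
    private int count;    // hypothetical field parsed from the line

    LogRecord() { }       // Writable types need a no-arg constructor

    LogRecord(String line) {
        // assume tab-separated "user<TAB>count" lines
        String[] parts = line.split("\t");
        this.user = parts[0];
        this.count = Integer.parseInt(parts[1]);
    }

    // Serialize the fields in a fixed order...
    public void write(DataOutput out) throws IOException {
        out.writeUTF(user);
        out.writeInt(count);
    }

    // ...and deserialize them back in exactly the same order.
    public void readFields(DataInput in) throws IOException {
        user = in.readUTF();
        count = in.readInt();
    }

    String getUser() { return user; }
    int getCount()   { return count; }
}
```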

I am just trying to understand how to accomplish this. Here is what I think I
need to do, based on my limited understanding of all this custom Writable
stuff:
1. Create a custom Writable class that can hold my parsed records. In my
mapper, create a new instance of it from the text line and emit the created
instance.
2. Accept this custom Writable in the reducer.
3. Set the reducer output format to DBOutputFormat.
    I tried doing that, and it seems I am supposed to use the JobConf class,
which is deprecated; with the new Configuration approach, where you use the
Job object to set the input/output formats, DBOutputFormat doesn't seem to
work. Doesn't DBOutputFormat work with the new Hadoop API?

4. Now in the reducer I am confused about what to do. I guess I need to
convert my custom Writable object into another custom DBWritable object that
will then be written to the database. Any hints on how to accomplish this?
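For step 4, what I am imagining is a second class shaped like this. The real one would declare `implements org.apache.hadoop.mapreduce.lib.db.DBWritable`, and its write(PreparedStatement) is what DBOutputFormat calls to fill in one row of the generated INSERT; the table and column layout here are made up:

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch of the database-side record. In a real job this class would
// declare "implements org.apache.hadoop.mapreduce.lib.db.DBWritable";
// the two methods below are the ones that interface requires.
class LogRecordDB {
    private String user;
    private int count;

    LogRecordDB() { }

    LogRecordDB(String user, int count) {
        this.user = user;
        this.count = count;
    }

    // Called by DBOutputFormat: bind this record's fields to the "?"
    // parameters of the generated INSERT statement, in column order.
    public void write(PreparedStatement stmt) throws SQLException {
        stmt.setString(1, user);
        stmt.setInt(2, count);
    }

    // Only needed when reading from a database with DBInputFormat.
    public void readFields(ResultSet rs) throws SQLException {
        user = rs.getString(1);
        count = rs.getInt(2);
    }
}
```

So the reducer would build one LogRecordDB per incoming record and emit it as the output key (with NullWritable as the value), if I understand DBOutputFormat correctly.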

Sorry if the questions aren't very clear. I am just really confused about
this stuff, and it doesn't help that there is practically no useful
information available anywhere on this Writable and DBWritable business.
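For completeness, this is the driver setup I was attempting with the new API. I believe the new-API DBOutputFormat lives in org.apache.hadoop.mapreduce.lib.db (the old-API one is in org.apache.hadoop.mapred.lib.db), so whether this compiles may depend on the Hadoop version; all connection details, class names, and table/column names below are placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Driver sketch: put the JDBC connection details on the Configuration,
// then point the job's output at a database table via DBOutputFormat.
Configuration conf = new Configuration();
DBConfiguration.configureDB(conf,
    "com.mysql.jdbc.Driver",            // placeholder JDBC driver class
    "jdbc:mysql://dbhost:3306/mydb",    // placeholder connection URL
    "dbuser", "dbpass");                // placeholder credentials

Job job = new Job(conf, "lines-to-db");
job.setJarByClass(MyDriver.class);      // hypothetical driver class
job.setMapperClass(MyMapper.class);     // hypothetical mapper
job.setReducerClass(MyReducer.class);   // hypothetical reducer

job.setInputFormatClass(TextInputFormat.class);
FileInputFormat.addInputPath(job, new Path("/input/lines"));

job.setOutputFormatClass(DBOutputFormat.class);
// The generated INSERT gets one "?" per listed column, bound by the
// DBWritable's write(PreparedStatement); table/columns are placeholders.
DBOutputFormat.setOutput(job, "log_table", "user", "count");

System.exit(job.waitForCompletion(true) ? 0 : 1);
```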

