hadoop-common-user mailing list archives

From Guillaume Polaert <gpola...@cyres.fr>
Subject RE: Database insertion by Hadoop
Date Wed, 20 Feb 2013 15:24:39 GMT
Hello Masoud,

Did you look at sqoop (http://sqoop.apache.org)? Maybe it can help you.
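For concreteness, a Sqoop export invocation might look roughly like the following. This is a sketch only; the JDBC URL, credentials, table name, and HDFS directory are all placeholders, not details from this thread.

```shell
# Hypothetical example: push data already in HDFS into a relational table.
# All connection details and paths below are placeholders.
sqoop export \
  --connect jdbc:mysql://dbhost:3306/experiments \
  --username dbuser -P \
  --table results_001 \
  --export-dir /user/masoud/results_001 \
  --batch
```

Sqoop spawns a map-only MapReduce job under the hood, so it parallelizes the inserts across the cluster without any custom mapper code.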

I think there is also a specific FileInputFormat designed for databases. I don't remember
the name. If you use it, you just need to write a mapper.
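The classes in question are likely Hadoop's DBOutputFormat (for writing to a database) and DBInputFormat (its read-side counterpart). A minimal map-only sketch follows; the table name, column names, JDBC details, and the assumption of tab-separated input files are all placeholders, not from the original thread.

```java
// Hedged sketch: a map-only Hadoop job that writes each parsed record
// into a database table via DBOutputFormat. Connection details, the
// table/column names, and the input format are assumptions.
import java.io.IOException;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.db.DBConfiguration;
import org.apache.hadoop.mapreduce.lib.db.DBOutputFormat;
import org.apache.hadoop.mapreduce.lib.db.DBWritable;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class DbInsertJob {

    // One row of a hypothetical target table: id INT, payload VARCHAR.
    public static class RecordRow implements DBWritable {
        int id;
        String payload;

        public void write(PreparedStatement stmt) throws SQLException {
            stmt.setInt(1, id);
            stmt.setString(2, payload);
        }

        public void readFields(ResultSet rs) throws SQLException {
            id = rs.getInt(1);
            payload = rs.getString(2);
        }
    }

    // Map-only: parse each input line and emit it as a database row.
    public static class InsertMapper
            extends Mapper<LongWritable, Text, RecordRow, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split("\t"); // assumed TSV input
            RecordRow row = new RecordRow();
            row.id = Integer.parseInt(fields[0]);
            row.payload = fields[1];
            context.write(row, NullWritable.get());
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder driver, URL, and credentials.
        DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
                "jdbc:mysql://dbhost:3306/experiments", "dbuser", "dbpass");

        Job job = Job.getInstance(conf, "db-insert");
        job.setJarByClass(DbInsertJob.class);
        job.setMapperClass(InsertMapper.class);
        job.setNumReduceTasks(0); // map-only, as suggested in the thread

        job.setInputFormatClass(TextInputFormat.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));

        job.setOutputFormatClass(DBOutputFormat.class);
        // Placeholder table and column names.
        DBOutputFormat.setOutput(job, "results_001", "id", "payload");

        job.setOutputKeyClass(RecordRow.class);
        job.setOutputValueClass(NullWritable.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

One caveat with this approach: every mapper opens its own connection, so a 32-node cluster can easily overwhelm a single database server; throttling the number of concurrent map tasks may be necessary.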

Guillaume Polaert | Cyrès Conseil 

-----Original Message-----
From: Masoud [mailto:masoud@agape.hanyang.ac.kr]
Sent: Monday, 18 February 2013 12:20
To: common-user@hadoop.apache.org
Subject: Database insertion by Hadoop

Dear All,

We are going to run an experiment for a scientific paper, and we must insert data into our database
for later analysis: almost 300 tables, each with 2,000,000 records.
As you know, this takes a lot of time on a single machine, so we are going to use our
Hadoop cluster (32 machines) and divide the 300 insertion tasks among them. I need some hints
to make this go faster:
1- As far as I know, we don't need a Reducer; a Mapper alone is enough.
2- So we just need to implement a Mapper class with the required code.
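The division described above (300 table-insert tasks over 32 machines) can be sketched as a simple round-robin assignment; the method name and structure here are illustrative only, since in practice Hadoop's scheduler distributes the map tasks itself.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskSplit {
    // Round-robin assignment of nTasks table-insert jobs to nWorkers machines.
    static List<List<Integer>> assign(int nTasks, int nWorkers) {
        List<List<Integer>> buckets = new ArrayList<>();
        for (int w = 0; w < nWorkers; w++) {
            buckets.add(new ArrayList<>());
        }
        for (int t = 0; t < nTasks; t++) {
            buckets.get(t % nWorkers).add(t);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<List<Integer>> plan = assign(300, 32);
        // 300 = 32*9 + 12, so 12 workers handle 10 tables and 20 handle 9.
        System.out.println(plan.get(0).size());  // 10
        System.out.println(plan.get(31).size()); // 9
    }
}
```

So each machine ends up with at most 10 tables, roughly a 30x reduction in wall-clock time versus a single machine, assuming the database server itself can keep up with 32 concurrent writers.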

Please let me know if there is anything else I should consider.

Best Regards
