hadoop-common-user mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: Import data from mysql
Date Sun, 09 Jan 2011 02:57:34 GMT
Hi Brian,

You can check HIHO at https://github.com/sonalgoyal/hiho which can help you
load data from any JDBC database to the Hadoop file system. If your table
has a date or id field, or any indicator for modified/newly added rows, you
can import only the altered rows every day. Please let me know if you need
any help.
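
The incremental import described above can be sketched independently of any tool. The example below is not HIHO's API. It is a minimal illustration that assumes the table carries a modification-date column, and it uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own. The table name `items`, the column `modified_at`, and the function `fetch_changed_rows` are all hypothetical.

```python
import sqlite3


def fetch_changed_rows(conn, table, marker_column, last_seen):
    """Return rows whose marker column is greater than last_seen, oldest first."""
    # Table and column names cannot be bound as SQL parameters, so they are
    # interpolated here; only pass trusted identifiers.
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {marker_column} > ? ORDER BY {marker_column}"
    )
    return conn.execute(query, (last_seen,)).fetchall()


def demo():
    # In-memory SQLite database standing in for the real MySQL table (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, modified_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [
            (1, "a", "2011-01-07"),
            (2, "b", "2011-01-08"),
            (3, "c", "2011-01-09"),
        ],
    )
    # Watermark saved by the previous daily run (hypothetical value).
    last_seen = "2011-01-07"
    rows = fetch_changed_rows(conn, "items", "modified_at", last_seen)
    # The newest marker value becomes the watermark for the next run.
    new_watermark = rows[-1][2] if rows else last_seen
    conn.close()
    return rows, new_watermark
```

Each daily run would store `new_watermark` somewhere durable and pass it back in as `last_seen` the next day, so only new or modified rows are fetched.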

Thanks and Regards,
Connect Hadoop with databases, Salesforce, FTP servers and others
<https://github.com/sonalgoyal/hiho>
Nube Technologies <http://www.nubetech.co>


On Sun, Jan 9, 2011 at 5:03 AM, Brian McSweeney wrote:

> Hi folks,
> I'm a TOTAL newbie on hadoop. I have an existing webapp that has a growing
> number of rows in a mysql database that I have to compare against one
> another once a day from a batch job. This is an exponential problem as
> every row must be compared against every other row. I was thinking of
> parallelizing this computation via hadoop. As such, I was thinking that
> perhaps the first thing to look at is how to bring info from a database to
> a hadoop job and vice versa. I have seen the following relevant info
> https://issues.apache.org/jira/browse/HADOOP-2536
> and also
> http://architects.dzone.com/articles/tools-moving-sql-database
> any advice on what approach to use?
> cheers,
> Brian
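
For the all-pairs comparison described in the quoted message, n rows give n(n-1)/2 unordered pairs, so the work grows quadratically with the row count. One way to split it into independent tasks is to cut the rows into blocks and hand each pair of blocks to a separate worker. The sketch below shows only that partitioning in plain Python. It is not Hadoop code, and the names `pair_count`, `block_tasks`, and `pairs_in_task` are hypothetical.

```python
from itertools import combinations


def pair_count(n):
    """Number of unordered pairs among n rows."""
    return n * (n - 1) // 2


def block_tasks(n_rows, block_size):
    """Yield pairs of row-index ranges that together cover every row pair once."""
    blocks = [
        (start, min(start + block_size, n_rows))
        for start in range(0, n_rows, block_size)
    ]
    for i, first in enumerate(blocks):
        # Pair each block with itself and with every later block.
        for second in blocks[i:]:
            yield first, second


def pairs_in_task(first, second):
    """List the row-index pairs one task is responsible for comparing."""
    if first == second:
        # Within a single block, compare each row with the rows after it.
        return list(combinations(range(first[0], first[1]), 2))
    # Across two different blocks, compare every row of one with every row of the other.
    return [(x, y) for x in range(*first) for y in range(*second)]
```

Each value yielded by `block_tasks` is an independent unit of work, so the tasks could be spread across parallel workers, each loading only the two blocks of rows it needs.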
