hadoop-mapreduce-user mailing list archives

From Sonal Goyal <sonalgoy...@gmail.com>
Subject Re: How to split DBInputFormat?
Date Mon, 03 Jan 2011 17:53:40 GMT
Hi Joan,

To get data from the database, you can try the open source framework HIHO
at https://github.com/sonalgoyal/hiho/

If you provide the details of your database and the table to import as
configuration values, the splits will be computed automatically for you.
Please feel free to write to me directly if you see any issues.
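
For background, here is a minimal sketch of how a table can be divided into
row-range splits so that no single task has to read every record. It is not
HIHO's code: the class SplitSketch, the method computeSplits, the table name
my_table and the id column are all hypothetical names used only for
illustration. As far as I know, the stock DBInputFormat takes a broadly
similar approach, using a row count and the configured number of map tasks,
but please check the source of your Hadoop version to confirm.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop or HIHO.
class SplitSketch {

    // Divide rows [0, rowCount) into at most numSplits contiguous ranges.
    // Each element is {start, length}; the last range absorbs the remainder.
    static List<long[]> computeSplits(long rowCount, int numSplits) {
        List<long[]> splits = new ArrayList<>();
        if (rowCount <= 0 || numSplits <= 0) {
            return splits; // nothing to read
        }
        // Never create more splits than there are rows.
        long chunks = Math.min(numSplits, rowCount);
        long chunkSize = rowCount / chunks;
        for (long i = 0; i < chunks; i++) {
            long start = i * chunkSize;
            long length = (i == chunks - 1) ? rowCount - start : chunkSize;
            splits.add(new long[] {start, length});
        }
        return splits;
    }

    public static void main(String[] args) {
        // Assumed figures: 10 million rows read by 8 map tasks.
        for (long[] s : computeSplits(10_000_000L, 8)) {
            // my_table and id are placeholder names.
            System.out.println("SELECT * FROM my_table ORDER BY id LIMIT "
                    + s[1] + " OFFSET " + s[0]);
        }
    }
}
```

Each map task then reads only its own range, so the number of splits mainly
determines how many rows a single task has to handle.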

Thanks and Regards,
Connect Hadoop with databases, Salesforce, FTP servers and others:
https://github.com/sonalgoyal/hiho
Nube Technologies: http://www.nubetech.co


On Mon, Jan 3, 2011 at 10:26 PM, Joan <joan.monplet@gmail.com> wrote:

> Hi,
> I'm trying to load data from a big table in a database. I'm using
> DBInputFormat, but when my job tries to get all the records, it throws an
> exception:
> *Exception in thread "Thread for syncLogs" java.lang.OutOfMemoryError:
> Java heap space*
> I'm trying to get millions of records and I would like to use DBInputSplit,
> but I don't know how to use it or how many splits I need.
> Thanks
> Joan
