hadoop-user mailing list archives

From Dino Kečo <dino.k...@gmail.com>
Subject Re: Extension points available for data locality
Date Tue, 21 Aug 2012 09:22:05 GMT
Hi Mathew,

You should check out this project
http://db.cs.yale.edu/hadoopdb/hadoopdb.html

It uses Hadoop together with an RDBMS for analytics.

Regards,
Dino Kečo
msn: xdinno@hotmail.com
mail: dino.keco@gmail.com
skype: dino.keco
phone: +387 61 507 851


On Tue, Aug 21, 2012 at 11:06 AM, Tharindu Mathew <mccloud35@gmail.com> wrote:

> Hi,
>
> I'm doing some research that involves pulling data stored in a MySQL
> cluster directly into a MapReduce job, without storing the data in HDFS.
>
> I'd like to run Hadoop TaskTracker nodes directly on the MySQL cluster
> nodes. The goal is to start mappers on the node closest to the data
> whenever possible (data locality).
>
> I notice that with HDFS, the NameNode knows exactly where each data
> block is, and Hadoop uses this information to achieve data locality.
>
> Is there a way to achieve this, possibly by extending the NameNode or
> by some other means?
>
> Thanks in advance.
>
> --
> Regards,
>
> Tharindu
>
> blog: http://mackiemathew.com/
>
>
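
On the locality question above: in Hadoop, a split advertises its preferred hosts through InputSplit.getLocations(), and the scheduler tries to place the map task on one of those hosts, so a custom InputFormat is the usual place to supply this information rather than the NameNode. The following is only a toy model of that idea in plain Python, not Hadoop code. The ShardSplit class, the assign_splits function, and the host names are all hypothetical, and the greedy two-pass placement is a simplification of what a real scheduler does.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ShardSplit:
    """Hypothetical split: one MySQL shard plus the hosts that store it."""
    shard_id: str
    locations: List[str]  # plays the role of InputSplit.getLocations()


def assign_splits(splits: List[ShardSplit],
                  free_hosts: List[str]) -> Dict[str, Tuple[str, bool]]:
    """Assign each split to a free host, preferring a host that holds the shard.

    Returns {shard_id: (host, is_local)}. Splits that cannot get any host
    are left out of the result.
    """
    remaining = list(free_hosts)
    assignment: Dict[str, Tuple[str, bool]] = {}
    deferred: List[ShardSplit] = []

    # First pass: hand out node-local slots where one is available.
    for split in splits:
        local = next((h for h in split.locations if h in remaining), None)
        if local is not None:
            remaining.remove(local)
            assignment[split.shard_id] = (local, True)
        else:
            deferred.append(split)

    # Second pass: leftover splits run remotely on whatever host is still free.
    for split in deferred:
        if not remaining:
            break
        assignment[split.shard_id] = (remaining.pop(0), False)

    return assignment


if __name__ == "__main__":
    splits = [
        ShardSplit("s1", ["db1"]),
        ShardSplit("s2", ["db2"]),
        ShardSplit("s3", ["db1"]),
    ]
    for shard, (host, is_local) in assign_splits(splits, ["db1", "db2", "db3"]).items():
        print(shard, host, "local" if is_local else "remote")
```

With one task slot per host, the first two shards are read on the nodes that store them, while the third has to be read remotely because its only local host is already taken.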
