hadoop-general mailing list archives

From Philip Zeyliger <phi...@cloudera.com>
Subject Re: drivers to bridge familiar SQL queries to Hadoop MapReduce internals?
Date Fri, 04 Sep 2009 16:34:49 GMT
Hi Benjamin,

This is actually very much on the mark.

Take a look at the Hive project -- http://hadoop.apache.org/hive/ ; there is
also a video at http://www.cloudera.com/hadoop-training-hive-introduction .
Hive is a SQL-like interface, developed initially at Facebook, for
exactly that.  Pig is also working on something similar -- see


-- Philip

On Fri, Sep 4, 2009 at 9:16 AM,
benjamin.cotton@lehman.com <Benjamin.Cotton@lehman.com> wrote:
> I am brand new to Hadoop and have a very newbie question: Is it a Hadoop
> community priority to build drivers (or layers of drivers) that will help
> bridge simple, familiar SQL queries to Hadoop MapReduce internals,
> liberating the application query developer from necessarily having to learn
> Hadoop-specific technologies, APIs, and tactics?
>
> E.g., in the initial example of "Hadoop - The Definitive Guide", I would like
> to STILL just be able to write
>
>   SELECT avg(weatherStationTable.airTemp), max(weatherStationTable.airTemp)
>   FROM weatherStationTable
>   GROUP BY weatherStationTable.year
>
> and depend on some driver (or layer of drivers) to bridge that familiar SQL
> relational query to a Hadoop MapReduce job that is deployed across HDFS
> (or another Hadoop-specific data hosting layer), executes in Hadoop, and
> returns my result.
>
> Is the notion of this potential capability off the mark re: current Hadoop
> community development priorities?
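
For readers following the thread, here is a rough sketch of the map, shuffle,
and reduce steps that a SQL-to-MapReduce layer would have to generate for the
quoted GROUP BY query. It is plain Python with no Hadoop or Hive dependency,
and it is not how Hive or Pig actually plan queries. The sample rows and the
function names (map_phase, shuffle, reduce_phase, run_query) are hypothetical,
chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows of (year, airTemp); not data from the thread.
ROWS = [(1949, 111), (1949, 78), (1950, 0), (1950, 22), (1950, -11)]


def map_phase(rows):
    """Emit one (year, airTemp) pair per row, keyed on the GROUP BY column."""
    for year, air_temp in rows:
        yield year, air_temp


def shuffle(pairs):
    """Collect values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Compute (avg, max) per year, mirroring avg(airTemp), max(airTemp)."""
    return {
        year: (sum(temps) / len(temps), max(temps))
        for year, temps in sorted(groups.items())
    }


def run_query(rows):
    """Chain the three steps, as a generated job would."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    for year, (avg_temp, max_temp) in run_query(ROWS).items():
        print(year, avg_temp, max_temp)
```

The point of a layer such as Hive is that the query author writes only the
SQL-like statement and never sees these steps.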
