hadoop-mapreduce-user mailing list archives

From Cristóbal Giadach <cristoba...@gmail.com>
Subject Re: Use Hadoop and other Apache products for SQL query manipulations
Date Wed, 18 Jun 2014 15:39:07 GMT
Try Impala or HAWQ
(http://www.gopivotal.com/sites/default/files/Hawq_WP_042313_FINAL.pdf), in
my opinion the best choice for SQL-on-Hadoop.


On Wed, Jun 18, 2014 at 11:26 AM, Fengjiao Jiang <grapejudy@gmail.com>
wrote:

> Hi,
>
> We have a large data set originally stored on MS SQL, and for intensive
> data aggregation and manipulation we're currently using Vertica. The thing
> is, the data is very large, and sometimes a very complex "select" or
> "insert" query may need as long as 10 minutes to return the correct
> results. (The database size is maybe 2GB.)
>
> So we're wondering whether we can use Hadoop together with some other
> Apache products (built on Hadoop) to make the queries faster.
> For example, could we use Hadoop, HBase, and ZooKeeper and write MR
> functions for these "SELECT", "INSERT", or similarly complex queries to
> improve the query speed?
>
> Also, I don't know if the combination I listed above is a good one. Should
> I use Hadoop, HBase, and ZooKeeper, or should I use Hadoop, Pig, and Hive?
>
> My question is mainly a "SQL-on-Hadoop" thing. Would you please tell me if
> it's possible and, if so, give me some suggestions? I would appreciate it
> a lot!
>
>
> Thanks.
>
> Best
> Judy
>
