Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (athena.apache.org: domain of grapejudy@gmail.com
 designates 209.85.213.172 as permitted sender)
MIME-Version: 1.0
Date: Wed, 18 Jun 2014 11:26:30 -0400
Message-ID: 
 <CAEb_DvBcEg-tca1z_yt-HaO8KmJ9AVbEuRWRkmjhG0YNco5VNg@mail.gmail.com>
Subject: Use Hadoop and other Apache products for SQL query manipulations
From: Fengjiao Jiang <grapejudy@gmail.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=089e010d9bda7cf05204fc1de26d

--089e010d9bda7cf05204fc1de26d
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi,

We have a large data set originally stored in MS SQL, and for intensive
data aggregation we're currently using Vertica. The problem is that the
data is very large, and a very complex "select" or "insert" query may
sometimes need as long as 10 minutes to return the correct results. (The
database size is maybe 2GB.)

So we're wondering whether we can use Hadoop together with some other
Apache products (built on Hadoop) to make the queries faster.
For example, could we use Hadoop, HBase and ZooKeeper, and write MR
functions for these "SELECT", "INSERT" or similarly complex queries, to
improve the query speed?
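
To make the idea concrete, here is a minimal sketch (plain Python, not
Hadoop API code) of how an aggregation such as
SELECT region, SUM(amount) FROM sales GROUP BY region
could be split into a map step, a shuffle step and a reduce step. The
"sales" table, its column names and the sample rows are made up for
illustration; a real job would be written against the MapReduce API or
generated by a tool such as Hive.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a "sales" table: (region, amount).
ROWS = [
    ("east", 10),
    ("west", 5),
    ("east", 7),
    ("west", 3),
    ("north", 4),
]


def map_phase(rows):
    """Map: emit one (key, value) pair per input row."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single sum."""
    return {key: sum(values) for key, values in groups.items()}


def sum_by_region(rows):
    """Rough equivalent of: SELECT region, SUM(amount) ... GROUP BY region."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(sum_by_region(ROWS))
```

Here every phase runs in a single process; on a cluster the map and
reduce steps would run in parallel across nodes, with the framework
handling the shuffle between them.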

Also, I don't know if the combination I listed above is a good one.
Should I use Hadoop, HBase and ZooKeeper, or should I use Hadoop, Pig and
Hive?

My question is mainly a "SQL-on-Hadoop" one. Would you please tell me if
it's possible and, if so, give me some suggestions? I'd appreciate it a
lot!


Thanks.

Best
Judy

--089e010d9bda7cf05204fc1de26d--