Subject: Re: what is Hawq?
From: Atri Sharma
To: user@hawq.incubator.apache.org
Date: Fri, 13 Nov 2015 12:15:57 +0530

+1 for transactions.

I think a major plus point is that HAWQ supports transactions, and this
enables a lot of critical workloads to run on HAWQ.
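To make the transaction point concrete, here is a minimal sketch of an ACID
write against HAWQ from an application, assuming HAWQ's PostgreSQL-compatible
wire protocol, a libpq-based driver (psycopg2), and placeholder connection
details and table names:

    # Minimal sketch: two inserts committed atomically against HAWQ.
    # Assumptions: psycopg2 talking to HAWQ's PostgreSQL-compatible endpoint;
    # the host, database, user, and the orders/order_audit tables are hypothetical.
    import psycopg2

    conn = psycopg2.connect(host="hawq-master.example.com", port=5432,
                            dbname="demo", user="gpadmin")
    try:
        with conn:  # commits on success, rolls back on any exception
            with conn.cursor() as cur:
                cur.execute("INSERT INTO orders VALUES (1, 'widget', 9.99)")
                cur.execute("INSERT INTO order_audit VALUES (1, now())")
    finally:
        conn.close()

If either insert fails, neither row becomes visible, which is the consistency
guarantee the thread below contrasts with Drill.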
On 13 Nov 2015 12:13, "Lei Chang" wrote:
>
> Like what Bob said, HAWQ is a complete database and Drill is just a query
> engine.
>
> And HAWQ also has a lot of other benefits over Drill, for example:
>
> 1. SQL completeness: HAWQ has the most complete SQL support of the
>    SQL-on-Hadoop engines; it can run all TPC-DS queries without any changes
>    and supports almost all third-party tools, such as Tableau et al.
> 2. Performance: proven to be the best in the Hadoop world.
> 3. Scalability: highly scalable via a high-speed UDP-based interconnect.
> 4. Transactions: as far as I know, Drill does not support transactions, and
>    it is a nightmare for end users to maintain consistency without them.
> 5. Advanced resource management: HAWQ has the most advanced resource
>    management. It natively supports YARN and easy-to-use hierarchical
>    resource queues. Resources can be managed and enforced at the query and
>    operator level.
>
> Cheers
> Lei
>
>
> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <
> adaryl.wakefield@hotmail.com> wrote:
>
>> There are a lot of tools that do a lot of things. Believe me, it's a
>> full-time job keeping track of what is going on in the Apache world. As I
>> understand it, Drill is just a query engine while HAWQ is an actual
>> database... somewhat, anyway.
>>
>> Adaryl "Bob" Wakefield, MBA
>> Principal
>> Mass Street Analytics, LLC
>> 913.938.6685
>> www.linkedin.com/in/bobwakefieldmba
>> Twitter: @BobLovesData
>>
>> *From:* Will Wagner
>> *Sent:* Thursday, November 12, 2015 7:42 AM
>> *To:* user@hawq.incubator.apache.org
>> *Subject:* Re: what is Hawq?
>>
>> Hi Lei,
>>
>> Great answer.
>>
>> I have a follow-up question. Everything HAWQ is capable of doing is
>> already covered by Apache Drill. Why do we need another tool?
>>
>> Thank you,
>> Will W
>>
>> On Nov 12, 2015 12:25 AM, "Lei Chang" wrote:
>>
>>> Hi Bob,
>>>
>>> Apache HAWQ is a Hadoop-native SQL query engine that combines the key
>>> technological advantages of an MPP database with the scalability and
>>> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
>>> natively. HAWQ delivers industry-leading performance and linear
>>> scalability. It provides users the tools to confidently and successfully
>>> interact with petabyte-range data sets. HAWQ provides users with a
>>> complete, standards-compliant SQL interface. More specifically, HAWQ has
>>> the following features:
>>>
>>> - On-premise or cloud deployment
>>> - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP extensions
>>> - Extremely high performance: many times faster than other Hadoop SQL
>>>   engines
>>> - World-class parallel optimizer
>>> - Full transaction capability and consistency guarantees: ACID
>>> - Dynamic data flow engine through a high-speed UDP-based interconnect
>>> - Elastic execution engine based on virtual segments and data locality
>>> - Support for multi-level partitioning and list/range-partitioned tables
>>>   (see the sketch after this list)
>>> - Multiple compression methods: snappy, gzip, quicklz, RLE
>>> - Multi-language user-defined function support: Python, Perl, Java,
>>>   C/C++, R
>>> - Advanced machine learning and data mining functionality through MADlib
>>> - Dynamic node expansion: in seconds
>>> - Most advanced three-level resource management: integrates with YARN and
>>>   hierarchical resource queues
>>> - Easy access to all HDFS data and external system data (for example,
>>>   HBase)
>>> - Hadoop native: from storage (HDFS) and resource management (YARN) to
>>>   deployment (Ambari)
>>> - Authentication & granular authorization: Kerberos, SSL, and role-based
>>>   access
>>> - Advanced C/C++ access libraries for HDFS and YARN: libhdfs3 & libYARN
>>> - Support for most third-party tools: Tableau, SAS et al.
>>> - Standard connectivity: JDBC/ODBC
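As a rough illustration of the list/range partitioning item in the list above,
here is a minimal sketch of a monthly range-partitioned table, assuming the
Greenplum-style DDL that HAWQ inherits and hypothetical table and column
names, issued through the same kind of psycopg2 connection as in the earlier
sketch:

    # Minimal sketch: a range-partitioned fact table in Greenplum-style DDL.
    # Assumptions: table/column names are illustrative; connection details are
    # placeholders for a HAWQ master reachable over the PostgreSQL protocol.
    import psycopg2

    ddl = """
    CREATE TABLE sales (
        id        int,
        region    text,
        sale_date date,
        amount    numeric
    )
    DISTRIBUTED BY (id)
    PARTITION BY RANGE (sale_date)
    (
        START (date '2015-01-01') INCLUSIVE
        END   (date '2016-01-01') EXCLUSIVE
        EVERY (INTERVAL '1 month')
    );
    """

    conn = psycopg2.connect(host="hawq-master.example.com", port=5432,
                            dbname="demo", user="gpadmin")
    with conn, conn.cursor() as cur:
        cur.execute(ddl)
    conn.close()

In this Greenplum-style model the engine maintains one child partition per
month behind the single logical sales table, so queries that filter on
sale_date only need to scan the relevant partitions.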
>>>
>>> The link here can give you more information about HAWQ:
>>> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>>>
>>> And please also see the answers inline to your specific questions:
>>>
>>> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <
>>> adaryl.wakefield@hotmail.com> wrote:
>>>
>>>> Silly question, right? Thing is, I've read a bit and watched some
>>>> YouTube videos and I'm still not quite sure what I can and can't do with
>>>> HAWQ. Is it a true database, or is it like Hive where I need to use
>>>> HCatalog?
>>>
>>> It is a true database. You can think of it as a parallel Postgres, but
>>> with much more functionality, and it works natively in the Hadoop world.
>>> HCatalog is not necessary, but you can read data registered in HCatalog
>>> with the new "HCatalog integration" feature.
>>>
>>>> Can I write data-intensive applications against it using ODBC? Does it
>>>> enforce referential integrity? Does it have stored procedures?
>>>
>>> ODBC: yes, both JDBC and ODBC are supported.
>>> Referential integrity: currently not supported.
>>> Stored procedures: yes.
>>>
>>>> B.
>>>
>>> Please let us know if you have any other questions.
>>>
>>> Cheers
>>> Lei
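And to make the JDBC/ODBC answer concrete, a minimal connectivity sketch over
ODBC, assuming pyodbc, a pre-configured DSN named "hawq" that points at a
PostgreSQL-compatible ODBC driver, and the hypothetical sales table from the
earlier sketch:

    # Minimal sketch: querying HAWQ over ODBC from an application.
    # Assumptions: a DSN named "hawq" is already configured for a
    # PostgreSQL-compatible ODBC driver; credentials and table are placeholders.
    import pyodbc

    conn = pyodbc.connect("DSN=hawq;UID=gpadmin;PWD=changeme")
    cur = conn.cursor()
    cur.execute("SELECT region, sum(amount) AS total FROM sales GROUP BY region")
    for region, total in cur.fetchall():
        print(region, total)
    conn.close()

JDBC access typically looks much the same, with a PostgreSQL-compatible JDBC
driver pointed at the HAWQ master.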