hawq-user mailing list archives

From dortmont <dortm...@gmail.com>
Subject Re: what is Hawq?
Date Fri, 13 Nov 2015 08:42:50 GMT
I see the advantage of HAWQ compared to other Hadoop SQL engines. It looks
like the most mature solution on Hadoop thanks to the PostgreSQL-based
engine.

But why wouldn't I use Greenplum instead of HAWQ? It has even better
performance and it supports updates.

Cheers

2015-11-13 7:45 GMT+01:00 Atri Sharma <atri@apache.org>:

> +1 for transactions.
>
> I think a major plus point is that HAWQ supports transactions, and this
> enables a lot of critical workloads to run on HAWQ.
> On 13 Nov 2015 12:13, "Lei Chang" <chang.lei.cn@gmail.com> wrote:
>
>>
>> As Bob said, HAWQ is a complete database and Drill is just a query
>> engine.
>>
>> HAWQ also has a lot of other benefits over Drill, for example:
>>
>> 1. SQL completeness: HAWQ is the best among the SQL-on-Hadoop engines. It
>> can run all TPC-DS queries without any changes, and it supports almost all
>> third-party tools, such as Tableau.
>> 2. Performance: proven to be the best in the Hadoop world.
>> 3. Scalability: highly scalable via a high-speed UDP-based interconnect.
>> 4. Transactions: as far as I know, Drill does not support transactions.
>> Without them, it is a nightmare for end users to maintain consistency.
>> 5. Advanced resource management: HAWQ has the most advanced resource
>> management. It natively supports YARN and easy-to-use hierarchical resource
>> queues. Resources can be managed and enforced at the query and operator level.
>>
>> Cheers
>> Lei
>>
>>
>> On Fri, Nov 13, 2015 at 9:34 AM, Adaryl "Bob" Wakefield, MBA <
>> adaryl.wakefield@hotmail.com> wrote:
>>
>>> There are a lot of tools that do a lot of things. Believe me, it’s a full
>>> time job keeping track of what is going on in the Apache world. As I
>>> understand it, Drill is just a query engine while Hawq is an actual
>>> database...somewhat anyway.
>>>
>>> Adaryl "Bob" Wakefield, MBA
>>> Principal
>>> Mass Street Analytics, LLC
>>> 913.938.6685
>>> www.linkedin.com/in/bobwakefieldmba
>>> Twitter: @BobLovesData
>>>
>>> *From:* Will Wagner <wowagner@gmail.com>
>>> *Sent:* Thursday, November 12, 2015 7:42 AM
>>> *To:* user@hawq.incubator.apache.org
>>> *Subject:* Re: what is Hawq?
>>>
>>>
>>> Hi Lei,
>>>
>>> Great answer.
>>>
>>> I have a follow up question.
>>> Everything HAWQ is capable of doing is already covered by Apache Drill.
>>> Why do we need another tool?
>>>
>>> Thank you,
>>> Will W
>>> On Nov 12, 2015 12:25 AM, "Lei Chang" <chang.lei.cn@gmail.com> wrote:
>>>
>>>>
>>>> Hi Bob,
>>>>
>>>>
>>>> Apache HAWQ is a Hadoop native SQL query engine that combines the key
>>>> technological advantages of an MPP database with the scalability and
>>>> convenience of Hadoop. HAWQ reads data from and writes data to HDFS
>>>> natively. HAWQ delivers industry-leading performance and linear
>>>> scalability. It provides users the tools to confidently and successfully
>>>> interact with petabyte range data sets. HAWQ provides users with a
>>>> complete, standards compliant SQL interface. More specifically, HAWQ has
>>>> the following features:
>>>>
>>>>    - On-premise or cloud deployment
>>>>    - Robust ANSI SQL compliance: SQL-92, SQL-99, SQL-2003, OLAP
>>>>    extensions
>>>>    - Extremely high performance: many times faster than other Hadoop
>>>>    SQL engines.
>>>>    - World-class parallel optimizer
>>>>    - Full transaction capability and consistency guarantee: ACID
>>>>    - Dynamic data flow engine through a high-speed UDP-based
>>>>    interconnect
>>>>    - Elastic execution engine based on virtual segments & data locality
>>>>    - Support for multi-level partitioning and list/range-based
>>>>    partitioned tables.
>>>>    - Multiple compression method support: snappy, gzip, quicklz, RLE
>>>>    - Multi-language user defined function support: python, perl, java,
>>>>    c/c++, R
>>>>    - Advanced machine learning and data mining functionality through
>>>>    MADlib
>>>>    - Dynamic node expansion: in seconds
>>>>    - Most advanced three-level resource management: integrates with
>>>>    YARN and hierarchical resource queues.
>>>>    - Easy access to all HDFS data and external system data (for
>>>>    example, HBase)
>>>>    - Hadoop Native: from storage (HDFS), resource management (YARN) to
>>>>    deployment (Ambari).
>>>>    - Authentication & granular authorization: Kerberos, SSL and
>>>>    role-based access
>>>>    - Advanced C/C++ access library to HDFS and YARN: libhdfs3 & libYARN
>>>>    - Supports most third-party tools, such as Tableau and SAS.
>>>>    - Standard connectivity: JDBC/ODBC
>>>>
>>>>
>>>> The following link gives more information about HAWQ:
>>>> https://cwiki.apache.org/confluence/display/HAWQ/About+HAWQ
>>>>
>>>>
>>>> Please also see the inline answers to your specific questions:
>>>>
>>>> On Thu, Nov 12, 2015 at 4:09 PM, Adaryl "Bob" Wakefield, MBA <
>>>> adaryl.wakefield@hotmail.com> wrote:
>>>>
>>>>> Silly question right? Thing is I’ve read a bit and watched some
>>>>> YouTube videos and I’m still not quite sure what I can and can’t do with
>>>>> Hawq. Is it a true database or is it like Hive where I need to use
>>>>> HCatalog?
>>>>>
>>>>
>>>> It is a true database. You can think of it as a parallel PostgreSQL, but
>>>> with much more functionality, and it works natively in the Hadoop world.
>>>> HCatalog is not necessary, but you can read data registered in HCatalog
>>>> with the new "HCatalog integration" feature.
>>>>
>>>>
>>>>> Can I write data intensive applications against it using ODBC? Does it
>>>>> enforce referential integrity? Does it have stored procedures?
>>>>>
>>>>
>>>> ODBC: yes, both JDBC and ODBC are supported.
>>>> Referential integrity: currently not supported.
>>>> Stored procedures: yes.
>>>>
>>>>
>>>>> B.
>>>>>
>>>>
>>>>
>>>> Please let us know if you have any other questions.
>>>>
>>>> Cheers
>>>> Lei
>>>>
>>>>
>>>>
>>>
>>
