hawq-user mailing list archives

From Qi Shao <qs...@pivotal.io>
Subject Re: why hawq off columnorientied table by default?
Date Mon, 29 Feb 2016 03:54:23 GMT
Parquet is available as a storage option for HAWQ internal tables.

HAWQ implements column-oriented storage with one file per column.

For example, if you store a table with orientation=column in HAWQ and there
are 20 segments, 1000 columns, and 500 partitions, it will generate about
20*1000*500 files in HDFS in total. With orientation=parquet, you only have
20*1000 files. HDFS is not good at handling a huge number of small files.
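
To make the arithmetic concrete, here is a minimal sketch that just multiplies
out the figures quoted above. The variable names are made up for illustration,
and the Parquet count uses the 20*1000 formula exactly as stated in this
message rather than anything measured on a real cluster.

```shell
#!/bin/sh
# Figures taken from the example in this message (not measured values).
segments=20
columns=1000
partitions=500

# orientation=column: one file per column, per segment, per partition.
column_files=$((segments * columns * partitions))

# orientation=parquet: the 20*1000 figure as stated in this message.
parquet_files=$((segments * columns))

echo "orientation=column : ${column_files} files"
echo "orientation=parquet: ${parquet_files} files"
```

Under these assumptions the column-oriented layout comes to 10,000,000 files
against 20,000, a factor of 500.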


On Sun, Feb 28, 2016 at 9:47 PM Michael André Pearce <
michael.andre.pearce@me.com> wrote:

> Hi Lei,
>
> How come the latest versions of Hive achieve and advocate column-oriented
> tables with ORC or Parquet, and don’t suffer from this as much?
> Isn’t this how some of the more recent performance improvements in Hive
> have been achieved, by using such formats?
>
> Surely having columnar tables is more efficient and would bring
> performance benefits to HAWQ for analytics workloads, which in my
> experience are the key workload of SQL users on Hadoop.
>
> Using something like ORC files with compactions would also enable HAWQ to
> support transactional operations, e.g. delete and update, as are now
> available in Hive.
>
> Cheers
> Mike
>
>
>
>
> On 29 Feb 2016, at 01:19, Lei Chang <lei_chang@apache.org> wrote:
>
> Hi, if column-oriented tables are not used properly, they may overwhelm
> HDFS, since they can lead to too many files. So they are disabled by default.
>
> Cheers
> Lei
>
>
>
> On Sun, Feb 28, 2016 at 10:39 PM, yin.zhb@163.com <yin.zhb@163.com> wrote:
>
>> Hi all,
>>     These days I am testing HAWQ (1.3.1) and I have a question:
>> by default, HAWQ turns off column-oriented tables. Why?
>>
>> [gpadmin@stars1 test]$
>> [gpadmin@stars1 test]$ psql -U gpadmin -d hawq -f create_table.sql
>>
>> psql:create_table.sql:48: ERROR:  Column oriented tables are deprecated. To enable it, set GUC gp_enable_column_oriented_table on.
>> [gpadmin@stars1 test]$ gpconfig -s gp_enabled_column_orientied_table
>>
>> 20160228:21:45:40:026806 gpconfig:stars1:gpadmin-[ERROR]:-Failed to retrieve GUC information, guc does not exist: gp_enabled_column_orientied_table
>> [gpadmin@stars1 test]$ gpconfig -s gp_enable_column_oriented_table
>> Values on all segments are consistent
>> GUC          : gp_enable_column_oriented_table
>> Master  value: off
>> Segment value: off
>> [gpadmin@stars1 test]$
>>
>> ------------------------------
>> yin.zhb@163.com
>>
>
>
>
