hive-user mailing list archives

From no jihun <jees...@gmail.com>
Subject Re: ORC file sort order ..
Date Sun, 10 Apr 2016 12:03:00 GMT
You can force data to be inserted in sorted order into a *SORTED BY* table by setting
hive.enforce.sorting=true
https://github.com/apache/hive/blob/branch-1.2/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L1131

but this setting appears to have been removed in 2.0
https://issues.apache.org/jira/browse/HIVE-12331
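
As a rough sketch of what I mean on Hive 1.x (the table name dummy_sorted and the
two-column layout are only illustrative, and it assumes the dummy table from the
quoted message below already exists and is populated):

```sql
-- Hive 1.x only: both settings were removed in 2.0 (HIVE-12331),
-- so on 2.0 and later these SET lines should simply be dropped.
SET hive.enforce.bucketing=true;
SET hive.enforce.sorting=true;

-- Hypothetical target table: bucketed AND sorted on ID.
CREATE TABLE dummy_sorted (
     ID INT
   , RANDOM_STRING VARCHAR(50)
)
CLUSTERED BY (ID) SORTED BY (ID ASC) INTO 256 BUCKETS
STORED AS ORC;

-- With the settings above, the insert should write each bucket file sorted by ID.
INSERT OVERWRITE TABLE dummy_sorted
SELECT ID, RANDOM_STRING FROM dummy;
```

Note that SORTED BY only declares the order in the table metadata; it is the
enforce settings (or an explicit SORT BY in the insert query) that make the
written files actually follow it.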

2016-04-10 1:41 GMT+09:00 Mich Talebzadeh <mich.talebzadeh@gmail.com>:

> Have you tried bucketing by the column plus setting orc.create.index and
> orc.bloom.filter.columns?
>
> CREATE TABLE dummy (
>      ID INT
>    , CLUSTERED INT
>    , SCATTERED INT
>    , RANDOMISED INT
>    , RANDOM_STRING VARCHAR(50)
>    , SMALL_VC VARCHAR(10)
>    , PADDING  VARCHAR(10)
> )
>
> CLUSTERED BY (ID) INTO 256 BUCKETS
> STORED AS ORC
> TBLPROPERTIES (
> "orc.create.index"="true",
> "orc.bloom.filter.columns"="ID",
> "orc.bloom.filter.fpp"="0.05",
> "orc.compress"="SNAPPY",
> "orc.stripe.size"="16777216",
> "orc.row.index.stride"="10000" )
> ;
>
>
> HTH
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 9 April 2016 at 01:53, Gautam <gautamkowshik@gmail.com> wrote:
>
>> Hey,
>>
>> This might be too obvious a question, but I haven't found a way
>> to validate ordering in an ORC file. I need each file to be ordered by a
>> column. Is there a sure-shot way of ensuring the sort order in an ORC file
>> is as I expect it?
>>
>> The closest I've come is using hive --orcfiledump --rowindex
>> <col_id>, which prints that column's min/max values in the index. But that
>> still doesn't say whether the data within the stripes is sorted.
>>
>> Cheers,
>> -Gautam.
>>
>
>


-- 
----------------------------------------------
Jihun No ( 노지훈 )
----------------------------------------------
Twitter          : @nozisim
Facebook       : nozisim
Website         : http://jeesim2.godohosting.com
---------------------------------------------------------------------------------
Market Apps   : android market products.
<https://market.android.com/developer?pub=%EB%85%B8%EC%A7%80%ED%9B%88>
