kylin-issues mailing list archives

From "Shaofeng SHI (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KYLIN-3070) Enable 'kylin.source.hive.flat-table-storage-format' for flat table storage format
Date Thu, 24 May 2018 12:03:00 GMT

    [ https://issues.apache.org/jira/browse/KYLIN-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488863#comment-16488863 ]

Shaofeng SHI commented on KYLIN-3070:
-------------------------------------

[~seva_ostapenko] Hello Vsevolod, I'm wondering: after switching from sequence files to Parquet
files as the format for the intermediate table, did you observe a performance improvement?
Kylin processes the data row by row, so I guess changing to Parquet may not bring a benefit,
while the column compression may degrade performance. I just want to see if you have
such information.

> Enable 'kylin.source.hive.flat-table-storage-format' for flat table storage format
> ----------------------------------------------------------------------------------
>
>                 Key: KYLIN-3070
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3070
>             Project: Kylin
>          Issue Type: Improvement
>          Components: Job Engine
>    Affects Versions: v2.2.0
>         Environment: HDP 2.5.6, Kylin 2.2.0
>            Reporter: Vsevolod Ostapenko
>            Assignee: Vsevolod Ostapenko
>            Priority: Major
>              Labels: newbie
>             Fix For: v2.3.0
>
>         Attachments: KYLIN-3070.master.001.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Flat table storage format is currently hard-coded as SEQUENCEFILE in
> core-job/src/main/java/org/apache/kylin/job/JoinedFlatTable.java.
> That prevents using Impala as a SQL engine while using the beeline CLI (via a custom JDBC URL),
> as Impala cannot write sequence files.
> Adding a parameter to kylin.properties to override the default setting would address
> the issue.
> Removing the hard-coded value for the storage format might be a good idea in and of itself.
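
For illustration only, the override described above could behave roughly like the sketch below.
It is not the attached patch and not Kylin's actual code (Kylin itself is Java). The property
name and the SEQUENCEFILE default come from this issue; the helper names (parse_properties,
flat_table_storage_format, stored_as_clause) are hypothetical, and the sketch assumes the
configured value ends up in a Hive "STORED AS" clause.

```python
# Hypothetical sketch: resolve the flat table storage format from
# kylin.properties-style text, falling back to the previously hard-coded default.

PROPERTY_KEY = "kylin.source.hive.flat-table-storage-format"  # name from the issue title
DEFAULT_FORMAT = "SEQUENCEFILE"  # the value the issue says is currently hard-coded


def parse_properties(text):
    """Parse simple 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def flat_table_storage_format(props):
    """Return the configured format in upper case, or the default if unset or empty."""
    value = props.get(PROPERTY_KEY, "").strip()
    return value.upper() if value else DEFAULT_FORMAT


def stored_as_clause(props):
    """Build the Hive DDL fragment that selects the storage format."""
    return "STORED AS " + flat_table_storage_format(props)
```

With no setting present, stored_as_clause({}) yields the old behaviour (STORED AS SEQUENCEFILE).
A line such as kylin.source.hive.flat-table-storage-format=PARQUET in the properties text would
switch the clause to a format that Impala can write.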



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
