hive-issues mailing list archives

From "slim bouguerra (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-18780) Improve schema discovery For Druid Storage Handler
Date Mon, 19 Mar 2018 23:02:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-18780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

slim bouguerra updated HIVE-18780:
----------------------------------
    Attachment: HIVE-18780.patch

> Improve schema discovery For Druid Storage Handler
> --------------------------------------------------
>
>                 Key: HIVE-18780
>                 URL: https://issues.apache.org/jira/browse/HIVE-18780
>             Project: Hive
>          Issue Type: Improvement
>          Components: Druid integration
>            Reporter: slim bouguerra
>            Assignee: slim bouguerra
>            Priority: Major
>             Fix For: 3.0.0
>
>         Attachments: HIVE-18780.patch, HIVE-18780.patch
>
>
> Currently, the Druid storage handler issues a segment metadata query every time the query
is of type Select or Scan. On top of that, every input split (map) does the same,
since it uses the same SerDe. This is very expensive and puts a lot of pressure
on the Druid cluster. The fix is to pass the schema taken from the Calcite plan, rather than
serializing the query itself, as part of the Hive query context.
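
To illustrate the cost difference described above, here is a minimal sketch in Python. It is not the code from the attached patch: `FakeDruidCluster`, `plan_query`, `init_serde_from_context`, and the property name `example.druid.schema` are all hypothetical names made up for this example. It only models the idea that the schema can be resolved once at planning time and handed to each split, instead of each SerDe initialization asking Druid for it.

```python
import json
from dataclasses import dataclass


@dataclass
class FakeDruidCluster:
    """Stand-in for a Druid cluster that counts segment metadata queries."""
    schema: dict
    metadata_queries: int = 0

    def segment_metadata(self) -> dict:
        # Each call models one round trip to the Druid cluster.
        self.metadata_queries += 1
        return dict(self.schema)


def init_serde_by_discovery(cluster: FakeDruidCluster) -> dict:
    """Behaviour described in the issue: every SerDe init queries Druid."""
    return cluster.segment_metadata()


# Hypothetical property name, chosen only for this sketch.
SCHEMA_KEY = "example.druid.schema"


def plan_query(known_schema: dict) -> dict:
    """Planner side: store the already-known schema in the job properties."""
    return {SCHEMA_KEY: json.dumps(list(known_schema.items()))}


def init_serde_from_context(props: dict) -> dict:
    """Split side: rebuild the schema from the properties, no Druid round trip."""
    return {name: type_name for name, type_name in json.loads(props[SCHEMA_KEY])}
```

In this model, initializing N splits with `init_serde_by_discovery` costs N metadata queries, while calling `plan_query` once and `init_serde_from_context` in each split costs none beyond whatever the planner needed to learn the schema in the first place.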



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
