drill-dev mailing list archives

From "Steven Phillips (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (DRILL-1781) For complex functions, don't return until schema is known
Date Tue, 02 Dec 2014 13:55:12 GMT

     [ https://issues.apache.org/jira/browse/DRILL-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steven Phillips resolved DRILL-1781.
------------------------------------
    Resolution: Fixed

Fixed by 3581a32

> For complex functions, don't return until schema is known
> ---------------------------------------------------------
>
>                 Key: DRILL-1781
>                 URL: https://issues.apache.org/jira/browse/DRILL-1781
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Steven Phillips
>            Priority: Blocker
>             Fix For: 0.7.0
>
>         Attachments: DRILL-1781.patch, DRILL-1781.patch
>
>
> In the case of complex output functions, it is impossible to determine the output schema
until the actual data is consumed. For example, with convert_from(VARCHAR, 'json'), unlike
most other functions, it is not sufficient to know that the incoming data type is VARCHAR;
we actually need to decode the contents of the record before we can determine whether the
output type is a map, a list, or a primitive type.
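
To illustrate why the input type alone is not enough, here is a minimal Python sketch (not Drill code). It uses Python's standard json module as a stand-in for the JSON decoding step, and the helper name output_type is hypothetical. Three values that are all strings on input decode to three different output kinds:

```python
import json

def output_type(varchar_value: str) -> str:
    """Classify a decoded JSON value as 'map', 'list', or 'primitive'."""
    decoded = json.loads(varchar_value)
    if isinstance(decoded, dict):
        return "map"
    if isinstance(decoded, list):
        return "list"
    # Numbers, strings, booleans, and null all count as primitives here.
    return "primitive"

# Every sample has the same input type (a string), yet the output kind differs.
samples = ['{"a": 1}', '[1, 2, 3]', '42']
types = [output_type(s) for s in samples]
print(types)
```

The output kind is only known after decoding each value, which is the situation the description refers to.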
> For fast schema return, we worked around this problem by simply assuming the type was
Map, and if it happened to be different, there would be a schema change. This solution is
not satisfactory, as it ends up breaking other functions, like flatten.
> The solution is to continue returning a schema whenever possible, but when it is not
possible, Drill will wait until it is.
> For non-blocking operators, Drill will immediately consume the incoming batch, and thus
will not return empty schema batches if there is data to consume. Blocking operators will
return an empty schema batch. If a flatten function occurs downstream from a blocking operator,
it will not be able to return a schema, and thus fast schema return will not happen in this
case.
> In the cases where the complex function is not downstream from a blocking operator, fast
schema return should continue to work.
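
The rule described above can be sketched as a toy model in Python. This is an illustration under stated assumptions, not the logic of the actual patch: the Operator class, its fields, and fast_schema_return_possible are all hypothetical names, and a pipeline is modeled simply as a list of operators ordered from upstream to downstream.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Operator:
    name: str
    blocking: bool = False        # e.g. a sort, which must see all input first
    complex_output: bool = False  # e.g. flatten or a JSON-decoding function

def fast_schema_return_possible(pipeline: List[Operator]) -> bool:
    """Return False if a complex-output operator sits downstream of a blocking one."""
    seen_blocking = False
    for op in pipeline:  # ordered upstream to downstream
        if op.complex_output and seen_blocking:
            # The blocking operator only passed along an empty schema batch,
            # so the complex function has no data from which to derive a schema.
            return False
        if op.blocking:
            seen_blocking = True
    return True

scan = Operator("scan")
sort = Operator("sort", blocking=True)
flatten = Operator("flatten", complex_output=True)

after_blocking = fast_schema_return_possible([scan, sort, flatten])
before_blocking = fast_schema_return_possible([scan, flatten, sort])
print(after_blocking, before_blocking)
```

In this model, placing flatten after the sort disables fast schema return, while placing it before the sort leaves it available.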



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
