drill-issues mailing list archives

From "Paul Rogers (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (DRILL-5021) ExternalSortBatch redundantly redefines the batch schema
Date Mon, 10 Jul 2017 16:33:00 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Rogers reassigned DRILL-5021:

    Assignee:     (was: Paul Rogers)

> ExternalSortBatch redundantly redefines the batch schema
> --------------------------------------------------------
>                 Key: DRILL-5021
>                 URL: https://issues.apache.org/jira/browse/DRILL-5021
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Priority: Minor
> Much of the code in {{ExternalSortBatch}} (ESB) deals with building vector batches and schemas.
However, ESB cannot handle schema changes. The only valid schema difference is the same field
path appearing at a different position in the vector array. Given this restriction, the code
can be simplified (and sped up) by exploiting the fact that all batches are required to have
the same conceptual schema (same set of fields, but perhaps in a different vector order) and,
most probably, the same physical schema (same fields and same vector order). Note that, because
of the way the {{getValueVectorId()}} method works, each lookup of a value vector is an O(n)
operation, so each remapping of n vectors is O(n^2).
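
The cost argument above can be illustrated with a minimal sketch. It does not use Drill's actual classes: the class SchemaRemapSketch, its methods, and the representation of a schema as a list of field-path strings are all hypothetical stand-ins. It assumes only what the description states, namely that a lookup scans the fields linearly. Under that assumption, remapping n fields with one scan each costs O(n^2), while indexing the incoming fields once brings the remap down to O(n), and a batch with an identical physical schema needs no remap at all.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not Drill classes or Drill's API.
// A "schema" here is just an ordered list of field paths.
final class SchemaRemapSketch {

    // Linear scan for one field path: O(n) per lookup. This mirrors the
    // lookup cost the issue attributes to getValueVectorId().
    static int findLinear(List<String> fields, String path) {
        for (int i = 0; i < fields.size(); i++) {
            if (fields.get(i).equals(path)) {
                return i;
            }
        }
        return -1;
    }

    // One linear lookup per target field: O(n^2) for n fields.
    static int[] remapLinear(List<String> target, List<String> incoming) {
        int[] mapping = new int[target.size()];
        for (int i = 0; i < target.size(); i++) {
            mapping[i] = findLinear(incoming, target.get(i));
        }
        return mapping;
    }

    // Index the incoming fields once, then resolve each target field
    // with a hash lookup: O(n) expected time overall.
    static int[] remapIndexed(List<String> target, List<String> incoming) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < incoming.size(); i++) {
            index.put(incoming.get(i), i);
        }
        int[] mapping = new int[target.size()];
        for (int i = 0; i < target.size(); i++) {
            mapping[i] = index.getOrDefault(target.get(i), -1);
        }
        return mapping;
    }

    // Same fields in the same order: no remapping is needed.
    static boolean samePhysicalSchema(List<String> a, List<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<String> target = Arrays.asList("a", "b", "c");
        List<String> incoming = Arrays.asList("c", "a", "b");
        // Both strategies produce the same mapping; only the cost differs.
        System.out.println(Arrays.toString(remapLinear(target, incoming)));  // [1, 2, 0]
        System.out.println(Arrays.toString(remapIndexed(target, incoming))); // [1, 2, 0]
        System.out.println(samePhysicalSchema(target, incoming));            // false
    }
}
```

Checking samePhysicalSchema first would cover what the description calls the most probable case, so the indexed remap would only run when the vector order actually differs.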

This message was sent by Atlassian JIRA
