drill-issues mailing list archives

From "Arina Ielchiieva (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-5830) Resolve regressions to MapR DB from DRILL-5546
Date Tue, 10 Oct 2017 14:49:01 GMT

     [ https://issues.apache.org/jira/browse/DRILL-5830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-5830:
    Reviewer: Arina Ielchiieva

> Resolve regressions to MapR DB from DRILL-5546
> ----------------------------------------------
>                 Key: DRILL-5830
>                 URL: https://issues.apache.org/jira/browse/DRILL-5830
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
> DRILL-5546 added a number of fixes for empty batches. One part of the fix was for HBase.
> Key changes:
> * Add code to expand wildcards (i.e. SELECT *) in the planner.
> * Remove support for wildcards in the HBase record reader.
> As noted in DRILL-5775, this change had the effect of breaking support for MapR-DB binary
> (which is API compatible with HBase). DRILL-5775 addresses this by expanding wildcards in
> the planner for MapR-DB, as was done for HBase in DRILL-5546.
> Unfortunately, this change introduced other regressions into the code as described by
> Investigation of those issues revealed that we should back out the original DRILL-5546
> changes and take a different route.
> As it turns out, HBase already had a project push-down rule that expanded wildcards.
> However, that rule didn't work correctly some of the time. DRILL-5546 fixed that bug,
> ensuring that wildcards are expanded (at least in the cases tested for this ticket).
> The actual issue turned out to be a bug in the {{RecordBatchLoader}} class, which did
> not consider map contents when detecting a schema change. As a result, batches with schemas
> like (row_key, cf\{}) were treated the same as (row_key, cf\{mycol}), and the actual data
> columns were discarded, though only intermittently, depending on batch arrival order.
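
The schema-change problem described above can be illustrated with a minimal sketch. This is not Drill's code: the Field class, the type strings, and both comparison methods are hypothetical stand-ins, written only to show why a check that ignores map contents cannot tell (row_key, cf{}) apart from (row_key, cf{mycol}), while a check that recurses into map children can.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical column description; NOT Drill's actual schema classes.
final class Field {
    final String name;
    final String type;            // illustrative type names, e.g. "VARBINARY" or "MAP"
    final List<Field> children;   // non-empty only for MAP columns

    Field(String name, String type, Field... children) {
        this.name = name;
        this.type = type;
        this.children = new ArrayList<>(Arrays.asList(children));
    }
}

final class SchemaChangeSketch {
    // Shallow check: compares only the name and type of each top-level column,
    // so the contents of a map column are never inspected.
    static boolean sameShallow(List<Field> a, List<Field> b) {
        if (a.size() != b.size()) return false;
        for (int i = 0; i < a.size(); i++) {
            Field x = a.get(i), y = b.get(i);
            if (!x.name.equals(y.name) || !x.type.equals(y.type)) return false;
        }
        return true;
    }

    // Deep check: same as above, but also recurses into each column's children.
    // Column order is treated as significant in this sketch.
    static boolean sameDeep(List<Field> a, List<Field> b) {
        if (a.size() != b.size()) return false;
        for (int i = 0; i < a.size(); i++) {
            Field x = a.get(i), y = b.get(i);
            if (!x.name.equals(y.name) || !x.type.equals(y.type)) return false;
            if (!sameDeep(x.children, y.children)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // (row_key, cf{}): the column family map is empty.
        List<Field> emptyMap = Arrays.asList(
            new Field("row_key", "VARBINARY"),
            new Field("cf", "MAP"));
        // (row_key, cf{mycol}): the map carries an actual data column.
        List<Field> withCol = Arrays.asList(
            new Field("row_key", "VARBINARY"),
            new Field("cf", "MAP", new Field("mycol", "VARBINARY")));

        System.out.println("shallow same: " + sameShallow(emptyMap, withCol));
        System.out.println("deep same:    " + sameDeep(emptyMap, withCol));
    }
}
```

Under the shallow check the two schemas compare as equal, so a loader relying on it would report no schema change. If the empty-map batch happened to arrive first, the loader could keep that layout and lose mycol, which would be consistent with the order-dependent behavior reported in the ticket.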

This message was sent by Atlassian JIRA
