drill-dev mailing list archives

From parthchandra <...@git.apache.org>
Subject [GitHub] drill pull request: DRILL-3871: Off by one error while reading bin...
Date Mon, 26 Oct 2015 17:39:16 GMT
GitHub user parthchandra opened a pull request:

    https://github.com/apache/drill/pull/219

    DRILL-3871: Off by one error while reading binary fields with one terminal null in parquet.
    
    Changes -
      1) Rewrote the NullableColumnReader.processPages function to process runs of null values
and non-null values without needing to keep track of whether the previous iteration of
the while loop had encountered a null or not. A pair of loops now iterates over a run of nulls
and then a run of non-null values.
      2) Removed some redundant code.
      3) Renamed some variables. indexInOutputVector is now replaced by two local variables,
readCount and writeCount, purely for clarity.
      4) Added tracing.
      5) Added unit tests for edge cases of nulls occurring on page boundaries.
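
For readers who have not opened the patch, the run-based loop from change 1 can be sketched roughly as below. This is a simplified illustration and not the code in the pull request: the class NullRunSketch, the method processRuns, and the isNull/values inputs are hypothetical stand-ins, and only the names readCount and writeCount are taken from the description above (their exact roles in the real reader are assumed here).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual NullableColumnReader code.
class NullRunSketch {

    /**
     * isNull[i] says whether record i is null (assumed input shape).
     * values holds only the non-null entries, in record order.
     * Returns the records with nulls restored in their positions.
     */
    static List<String> processRuns(boolean[] isNull, String[] values) {
        List<String> out = new ArrayList<>();
        int readCount = 0;   // non-null values consumed from the input so far
        int writeCount = 0;  // output slots written so far, nulls included

        while (writeCount < isNull.length) {
            // First inner loop: consume a (possibly empty) run of nulls.
            while (writeCount < isNull.length && isNull[writeCount]) {
                out.add(null);
                writeCount++;
            }
            // Second inner loop: consume a (possibly empty) run of non-nulls.
            while (writeCount < isNull.length && !isNull[writeCount]) {
                out.add(values[readCount]);
                readCount++;
                writeCount++;
            }
        }
        return out;
    }
}
```

Because each inner loop checks the bound itself, a single null at the very end of the input is handled by the first inner loop on the last pass, with no flag carried over from the previous iteration. For example, processRuns(new boolean[]{false, false, true}, new String[]{"a", "b"}) returns the list [a, b, null].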
    
    For all the unit tests and the TPC-H and TPC-DS test data sets, the state of the NullableColumnReader
at the end of each iteration of processPages is identical to the state produced by the old code. In
addition, the boundary conditions are now handled.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/parthchandra/incubator-drill DRILL-3871

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/219.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #219
    
----
commit d23ceb2a4c32da9535f1e482c4c70fcc31b8b2b8
Author: Parth Chandra <parthc@apache.org>
Date:   2015-10-05T17:25:56Z

    DRILL-3871: Off by one error while reading binary fields with one terminal null in parquet.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
