drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-6080) Sort incorrectly limits batch size to 65535 records rather than 65536
Date Sat, 13 Jan 2018 03:05:00 GMT

    [ https://issues.apache.org/jira/browse/DRILL-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324927#comment-16324927 ]

ASF GitHub Bot commented on DRILL-6080:
---------------------------------------

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/1090

    DRILL-6080: Sort incorrectly limits batch size to 65535 records

    Sort incorrectly limits batch size to 65535 records rather than 65536.
    
    This PR also includes a few code cleanup items.
    
    Also fixes DRILL-6086: Overflow in offset vector in row set writer
    
    @vrozov, please review.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-6080

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1090
    
----
commit c1d3402a619f3355e47e845aae245fd0f96e2189
Author: Paul Rogers <progers@...>
Date:   2018-01-11T00:04:53Z

    DRILL-6080: Sort incorrectly limits batch size to 65535 records
    
    Sort incorrectly limits batch size to 65535 records rather than 65536.
    
    This PR also includes a few code cleanup items.
    
    Fix for overflow in offset vector in row set writer

----


> Sort incorrectly limits batch size to 65535 records rather than 65536
> ---------------------------------------------------------------------
>
>                 Key: DRILL-6080
>                 URL: https://issues.apache.org/jira/browse/DRILL-6080
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.12.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.13.0
>
>
> Drill places an upper limit of 64K (65,536) on the number of rows in a batch. Record indexes within a batch therefore run from 0 to 64K-1, that is, 0 to 65,535.
> The sort code incorrectly uses {{Character.MAX_VALUE}} (65,535, the largest valid index) as the maximum row count. So, if an incoming batch uses the full 64K size, sort ends up splitting batches unnecessarily.
> The fix is to use the correct constant, {{ValueVector.MAX_ROW_COUNT}}, instead.
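
The off-by-one described above can be illustrated with a small standalone sketch. This is not Drill's sort code: the class name, the local MAX_ROW_COUNT constant (a stand-in for ValueVector.MAX_ROW_COUNT, assumed here to be 65,536 as the description states), and the batchesNeeded helper are all hypothetical, introduced only to show why a limit of 65,535 forces a full 64K batch to be split.

```java
final class BatchLimitDemo {
    // Hypothetical stand-in for Drill's ValueVector.MAX_ROW_COUNT,
    // assumed to be 64K rows per the issue description.
    static final int MAX_ROW_COUNT = 65536;

    // Number of output batches needed to hold rowCount rows when each
    // batch may hold at most maxRowsPerBatch rows (ceiling division).
    static int batchesNeeded(int rowCount, int maxRowsPerBatch) {
        return (rowCount + maxRowsPerBatch - 1) / maxRowsPerBatch;
    }

    public static void main(String[] args) {
        // Character.MAX_VALUE is '\uffff' (65535): the largest valid row
        // index in a 64K batch, not the maximum row count.
        int wrongLimit = Character.MAX_VALUE;
        int incomingRows = MAX_ROW_COUNT; // a completely full incoming batch

        System.out.println("limit " + wrongLimit + ": "
                + batchesNeeded(incomingRows, wrongLimit) + " batches");
        System.out.println("limit " + MAX_ROW_COUNT + ": "
                + batchesNeeded(incomingRows, MAX_ROW_COUNT) + " batches");
    }
}
```

With the index-based limit, the single leftover row spills into a second batch; with the count-based limit, the full batch fits in one.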



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
