drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-5385) Vector serializer fails to read saved SV2
Date Fri, 28 Apr 2017 23:35:04 GMT

    [ https://issues.apache.org/jira/browse/DRILL-5385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989582#comment-15989582 ]

ASF GitHub Bot commented on DRILL-5385:
---------------------------------------

Github user Ben-Zvi commented on a diff in the pull request:

    https://github.com/apache/drill/pull/800#discussion_r114036665
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/cache/VectorAccessibleSerializable.java ---
    @@ -146,36 +157,24 @@ public void writeToStream(OutputStream output) throws IOException {
         final DrillBuf[] incomingBuffers = batch.getBuffers();
         final UserBitShared.RecordBatchDef batchDef = batch.getDef();
     
    -    /* DrillBuf associated with the selection vector */
    -    DrillBuf svBuf = null;
    -    Integer svCount =  null;
    -
    -    if (svMode == BatchSchema.SelectionVectorMode.TWO_BYTE) {
    -      svCount = sv2.getCount();
    -      svBuf = sv2.getBuffer(); //this calls retain() internally
    -    }
    -
         try {
           /* Write the metadata to the file */
           batchDef.writeDelimitedTo(output);
     
           /* If we have a selection vector, dump it to file first */
    -      if (svBuf != null) {
    -        allocator.write(svBuf, output);
    -        sv2.setBuffer(svBuf);
    -        svBuf.release(); // sv2 now owns the buffer
    -        sv2.setRecordCount(svCount);
    +      if (svMode == BatchSchema.SelectionVectorMode.TWO_BYTE) {
    +        recordCount = sv2.getCount();
    --- End diff --
    
    Ultra minor comment: when the SV2 is read, the recordCount comes from the batchDef; here, on the write path, it is taken from the sv2 itself (the two should be the same anyway).


> Vector serializer fails to read saved SV2
> -----------------------------------------
>
>                 Key: DRILL-5385
>                 URL: https://issues.apache.org/jira/browse/DRILL-5385
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>              Labels: ready-to-commit
>             Fix For: 1.11.0
>
>
> Drill provides the {{VectorAccessibleSerializable}} class to write a record batch to a stream and to read that batch back from a stream. Record batches can carry an indirection vector (a so-called selection vector 2, or SV2).
> The code that writes batches writes the SV2 to the stream. However, the code that deserializes batches initializes the SV2 but does not read it from the stream.
> As a result, vector deserialization reads the wrong bytes, and the saved values are corrupted on read.
> Note that this issue was found via unit testing. At present, the only production use of this code is in the external sort, which serializes batches without an indirection vector.


--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
