impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Attila Jeges (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3079: Fix sequence file writer
Date Fri, 10 Mar 2017 16:12:11 GMT
Attila Jeges has posted comments on this change.

Change subject: IMPALA-3079: Fix sequence file writer

Patch Set 2:

File be/src/exec/

PS1, Line 88:  
> nit: blank space.
File be/src/exec/read-write-util.h:

PS1, Line 217: ~val;
> Did you mean to convert this to the absolute value ? If so, you are missing
It should be ~val.

PS1, Line 219: 
             : inline int64_t ReadWriteUtil::PutVLong(int64_t val, uint8_t* buf) {
             :   int64_t num_bytes = VLongRequiredBytes(val);
             :   if (num_bytes == 1) {
             :     // store the value itself instead of the
> Typo. I meant bit scan from the MSB. __builtin_clzl()

PS1, Line 240:     buf[i] = (val >> (8 * (num_bytes - i - 1))) & 0xFF;
             :   }
> Is this behavior documented somewhere ? Is it part of the sequence file sta
Yes, it is part of the sequence file standard and it can be read back by Hive. I've added
an E2E test to confirm this.

Sequence file documentation:

The Zero-Compressed Integer format is described here:,

The details are in the actual source code (WriteVLong() method):

PS1, Line 246: Util::PutVInt(int32
> Big Endianness ?
Correct. Changed the comment.
File be/src/util/

PS1, Line 209:  each preceded with an integer for
> , each preceded with an integer for its size.

PS1, Line 210: // For testing purposes we are going to generate two blocks if input_length
>= 4K
> That seems a bit inefficient to split into two blocks for input_length == 2
Done,changed the threshold to 4K

PS1, Line 235: ReadWriteUtil::
> The declaration can be moved into the loop body.
File testdata/workloads/functional-query/queries/QueryTest/seq-writer.test:

> what about the other codecs?
This test case runs for 8.5 seconds.

I've added test cases for other codecs with RECORD/BLOCK compression mode.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I0db642ad35132a9a5a6611810a6cafbbe26e7487
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Attila Jeges <>
Gerrit-Reviewer: Attila Jeges <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-HasComments: Yes

View raw message