impala-dev mailing list archives

From "Dan Hecht (Code Review)" <>
Subject [Impala-CR](cdh5-2.5.0_5.7.0) IMPALA-1886/IMPALA-2154: Add support for multi-stream bz2/gzip compressed files.
Date Fri, 26 Feb 2016 02:10:43 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-1886/IMPALA-2154: Add support for multi-stream bz2/gzip compressed

Patch Set 14:

File be/src/exec/

Line 484:   }
L469-484 should be restructured as:

if (stream_->eosr()) {
  if (stream_end) {
    *eosr = true;
  } else {
    return Status(truncated);
  }
} else if (*decompressed_len == 0) {
  return Status(NO_PROGRESS);
}
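
For illustration, the same control flow can be written as a standalone function. This is only a sketch: the `Status` struct, the function name `CheckDecompressResult`, and the error strings are hypothetical stand-ins, not Impala's actual types or messages, and `input_exhausted` stands in for what `stream_->eosr()` would report.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a status type; only what this sketch needs.
struct Status {
  bool ok;
  std::string msg;
  static Status OK() { return {true, ""}; }
  static Status Error(const std::string& m) { return {false, m}; }
};

// Decides what to do after one decompression call.
//   input_exhausted:  stand-in for stream_->eosr() (no more input available)
//   stream_end:       the output ends exactly at the end of a compression stream
//   decompressed_len: number of bytes produced by this call
//   eosr:             out-param, set to true only on a clean end of input
Status CheckDecompressResult(bool input_exhausted, bool stream_end,
                             int64_t decompressed_len, bool* eosr) {
  assert(eosr != nullptr);
  if (input_exhausted) {
    if (stream_end) {
      *eosr = true;  // Input and compression stream ended together.
    } else {
      // Input ran out in the middle of a compression stream.
      return Status::Error("truncated compressed file");
    }
  } else if (decompressed_len == 0) {
    // More input exists, but the decompressor produced nothing.
    return Status::Error("decompressor made no progress");
  }
  return Status::OK();
}
```

With this shape, the truncation check and the no-progress check are mutually exclusive branches, so each error has exactly one place where it can be raised.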

File be/src/util/codec.h:

Line 130:   ///   stream_end: if decompressor consumed all input and reached end of compressed
I think we should remove the "if decompressor consumed all input" condition, so that this
just signifies that the end of the 'output' buffer corresponds to the end of a compression stream
(see my comments in the code).  The caller should check (and already does) that there
is no more input, via the stream_->eosr() check.

File be/src/util/

Line 89:   *stream_end = false;
Move this to L91, inside the loop.  See the comment at line 113 for why.

Line 113:       if (stream_.avail_in == 0) *stream_end = true;
As mentioned elsewhere, let's simplify this and just unconditionally set *stream_end = true
here.  The avail_in check isn't interesting, since only the caller knows whether there really
is more input or not.
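
To make the comments on lines 89 and 113 concrete, here is a minimal sketch using a toy format rather than zlib or bzip2. Everything in it is an assumption for illustration: the class `ToyStreamDecoder`, its `ProcessBlock` signature, and the format itself (each "compression stream" is one length byte N followed by N payload bytes). It resets the flag at the top of every loop iteration and sets it unconditionally when a stream ends, leaving the "is there more input?" question to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a streaming decompressor (hypothetical, not Impala code).
// Each "compression stream" is one length byte N followed by N payload bytes.
class ToyStreamDecoder {
 public:
  // Consumes bytes from 'in' starting at *in_pos and appends at most 'out_cap'
  // bytes to 'out'. *stream_end reports only whether the output of this call
  // ends exactly at a stream boundary. If the loop body never runs,
  // *stream_end keeps the value the caller passed in.
  void ProcessBlock(const std::vector<uint8_t>& in, size_t* in_pos,
                    size_t out_cap, std::vector<uint8_t>* out,
                    bool* stream_end) {
    assert(in_pos != nullptr && out != nullptr && stream_end != nullptr);
    size_t produced = 0;
    while (*in_pos < in.size() && produced < out_cap) {
      // Reset inside the loop: a boundary seen earlier in this call is stale
      // once decoding continues into the next stream.
      *stream_end = false;
      if (remaining_ == 0) remaining_ = in[(*in_pos)++];  // Stream header.
      while (remaining_ > 0 && *in_pos < in.size() && produced < out_cap) {
        out->push_back(in[(*in_pos)++]);
        --remaining_;
        ++produced;
      }
      // Set unconditionally at the end of a stream. Whether more input
      // follows is the caller's question, not the decoder's.
      if (remaining_ == 0) *stream_end = true;
    }
  }

 private:
  size_t remaining_ = 0;  // Payload bytes left in the current stream.
};
```

In this sketch, a caller that sees `*stream_end == false` once the input is exhausted would treat the file as truncated, while `*stream_end == true` with input remaining just means another stream follows.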

Line 342:   *stream_end = false;
same comments as above

Line 352:       if (stream_.avail_in == 0) *stream_end = true;
and here

File common/thrift/

Line 226: decompressed

File testdata/workloads/functional-query/queries/DataErrorsTest/hdfs-scan-node-errors.test:

Line 194: decompressed
This should say "compressed" file.  From the user's perspective, it's the compressed file that
was malformed.


Gerrit-MessageType: comment
Gerrit-Change-Id: Icbe617d03a69953f0bf3aa0f7c30d34bc612f9f8
Gerrit-PatchSet: 14
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Juan Yu <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Juan Yu <>
Gerrit-Reviewer: Skye Wanderman-Milne <>
Gerrit-HasComments: Yes
