impala-dev mailing list archives

From "Skye Wanderman-Milne (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3038: Add multistream gzip/bzip2 test coverage
Date Thu, 24 Mar 2016 17:31:25 GMT
Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-3038: Add multistream gzip/bzip2 test coverage
......................................................................


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/2543/7/be/src/util/decompress-test.cc
File be/src/util/decompress-test.cc:

Line 255:     // Repeatedly pick random-size input data(~1MB), compress it, then concatenate
> I try to simulate pbzip2, it split large input into smaller chunks then com
Ah ok, sorry I keep getting confused about the sizes. Makes sense though.

Can you include some of this reasoning in the comment? I think it's useful to know that this is
an approximation of pbzip2's behavior, that you want at least 8MB of compressed data so that it is
bigger than the decompressor buffer size (which I think is actually unrelated to the IO buffer size),
and that you expect a ~2:1 compression ratio.
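
To make the reasoning above concrete, here is a rough sketch of the idea in Python rather than the C++ of decompress-test.cc. It is not the code from the patch. The function name, the parameters, and the 16-symbol alphabet are assumptions made for illustration, and the sizes are scaled down from the ~1MB chunks and 8MB target discussed here so that it runs quickly. It compresses random-size chunks separately, concatenates the resulting bzip2 streams the way pbzip2's output is laid out, and checks that a multistream-aware decompressor recovers all of the input.

```python
import bz2
import random

# Hypothetical alphabet: 16 symbols gives roughly 4 bits of entropy per byte,
# so the data is compressible but not trivially so.
ALPHABET = b"0123456789abcdef"


def build_multistream(min_compressed_bytes, max_chunk_bytes, seed=0):
    """Compress random-size chunks separately and concatenate the streams.

    Returns (raw, compressed, stream_count). Chunks are added until the
    concatenated compressed data reaches min_compressed_bytes.
    """
    rng = random.Random(seed)
    raw_parts = []
    streams = []
    compressed_len = 0
    while compressed_len < min_compressed_bytes:
        size = rng.randint(1, max_chunk_bytes)
        chunk = bytes(rng.choices(ALPHABET, k=size))
        stream = bz2.compress(chunk)  # one complete bzip2 stream per chunk
        raw_parts.append(chunk)
        streams.append(stream)
        compressed_len += len(stream)
    return b"".join(raw_parts), b"".join(streams), len(streams)


# Scaled-down stand-ins for the 8MB target and ~1MB chunk size.
raw, compressed, stream_count = build_multistream(
    min_compressed_bytes=64 * 1024, max_chunk_bytes=16 * 1024
)

# bz2.decompress() handles concatenated streams (Python 3.3+).
assert bz2.decompress(compressed) == raw

# A single BZ2Decompressor stops at the end of the first stream and leaves
# the remaining streams in unused_data, which is the case this kind of test
# is meant to catch.
single = bz2.BZ2Decompressor()
first_stream_output = single.decompress(compressed)
assert single.eof and len(single.unused_data) > 0
```

The second check shows why the total compressed size matters: a reader that handles only one stream would silently return just the first chunk.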


-- 
To view, visit http://gerrit.cloudera.org:8080/2543
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b0e1971145dd457e71fc9c00ce7c06fff8dea88
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Juan Yu <jyu@cloudera.com>
Gerrit-Reviewer: Juan Yu <jyu@cloudera.com>
Gerrit-Reviewer: Skye Wanderman-Milne <skye@cloudera.com>
Gerrit-HasComments: Yes
