impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5448: fix invalid number of splits reported in Parquet scan node
Date Mon, 02 Oct 2017 21:47:57 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5448: fix invalid number of splits reported in Parquet scan node

Patch Set 2:

File be/src/exec/hdfs-scan-node-base.h:
PS2, Line 557: __builtin_popcount
This should call BitUtil::Popcount() instead, which will use hardware acceleration if appropriate.
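To illustrate why a project-level wrapper is preferable to calling the compiler builtin directly, here is a minimal sketch. The `sketch` namespace and both function names are hypothetical; this is not Impala's BitUtil::Popcount() implementation, only an example of how one entry point can hide the choice between a builtin and a portable fallback.

```cpp
#include <cstdint>

// Hypothetical stand-in for a project-level popcount helper.
// NOT Impala's BitUtil::Popcount(); it only illustrates the wrapper idea.
namespace sketch {

// Portable fallback: each iteration clears the lowest set bit,
// so the loop runs once per set bit.
inline int PopcountPortable(uint64_t v) {
  int count = 0;
  while (v != 0) {
    v &= v - 1;
    ++count;
  }
  return count;
}

// Single entry point for callers. On GCC/Clang this forwards to the
// builtin, which the compiler can lower to a hardware instruction when
// the target supports it; elsewhere it uses the portable loop.
inline int Popcount(uint64_t v) {
#if defined(__GNUC__) || defined(__clang__)
  return __builtin_popcountll(v);
#else
  return PopcountPortable(v);
#endif
}

}  // namespace sketch
```

With a wrapper like this, call sites stay the same if the selection logic later changes (for example, to a runtime CPU feature check).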
PS2, Line 579: bit_map
We put an underscore at the end of private members, i.e. 'bit_map_'
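A small sketch of the naming convention, assuming a made-up class: `FormatSet` and its methods are hypothetical and are not the code under review. Only the trailing underscore on the private member reflects the comment above.

```cpp
#include <cstdint>

// Hypothetical class for illustration only; not the class under review.
class FormatSet {
 public:
  // Marks format_id as present. Assumes 0 <= format_id < 32.
  void Add(int format_id) { bit_map_ |= (uint32_t{1} << format_id); }

  // Returns true if format_id was previously added.
  bool Contains(int format_id) const {
    return ((bit_map_ >> format_id) & 1u) != 0;
  }

 private:
  // Private data member: named with a trailing underscore.
  uint32_t bit_map_ = 0;
};
```

The trailing underscore makes it obvious at each use site that `bit_map_` is member state rather than a local variable or parameter.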
PS2, Line 582:   /// Mapping of file formats (file type, compression types set) to the number
Not your change, but it should mention the second entry in the tuple - whether the split was
File testdata/datasets/functional/functional_schema_template.sql:
PS2, Line 1581: -- IMPALA-5448: parquet files with multiple compression types
In a lot of cases we have moved to loading "special" files as part of the tests rather than as
part of the data loading. I think that is better in practice, because if you change this
template then everyone has to reload data.

I commented on an instance of the alternative approach that we should switch to.
File tests/query_test/
PS2, Line 82:   def test_hdfs_parquet_scan_node_profile(self, vector):
This only applies to parquet so should go in TestParquet below (TestScannersAllTableFormats
runs the test for all table formats).
PS2, Line 337:   def test_corrupt_rle_counts(self, vector, unique_database):
This is an example of the alternative way of loading data files as part of the test.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaacc2d775032f5707061e704f12e0a63cde695d1
Gerrit-Change-Number: 8147
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang <>
Gerrit-Reviewer: Quanlong Huang <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-Comment-Date: Mon, 02 Oct 2017 21:47:57 +0000
Gerrit-HasComments: Yes
