impala-commits mailing list archives

From tarmstr...@apache.org
Subject [3/4] incubator-impala git commit: IMPALA-6076: Parquet BIT_PACKED deprecation warning
Date Tue, 24 Oct 2017 22:21:02 GMT
IMPALA-6076: Parquet BIT_PACKED deprecation warning

Every 100th time that we open a Parquet column with the
deprecated BIT_PACKED encoding, a warning is logged. We do this
per-column instead of per-file because Impala historically
listed the BIT_PACKED encoding in file metadata even when it
wasn't used for any columns - see IMPALA-5636.
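
For readers unfamiliar with this kind of throttling, the following is a
minimal standalone sketch of the idea, not the code in this patch. The patch
itself relies on glog's LOG_EVERY_N; the names ShouldWarn, SimulateColumnOpens
and kWarningFrequency below are hypothetical and exist only for illustration.
The sketch assumes the same behavior as LOG_EVERY_N, which emits on the 1st,
(N+1)th, (2N+1)th, ... execution.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical constant mirroring BITPACKED_DEPRECATION_WARNING_FREQUENCY.
static const int kWarningFrequency = 100;

// Hypothetical helper: returns true on the 1st, (frequency+1)th,
// (2*frequency+1)th, ... call that shares the same counter.
bool ShouldWarn(std::atomic<long>& counter, int frequency) {
  assert(frequency > 0);
  // fetch_add returns the value held before the increment.
  return counter.fetch_add(1, std::memory_order_relaxed) % frequency == 0;
}

// Simulates opening num_opens BIT_PACKED columns and returns how many
// warnings were written to stderr.
int SimulateColumnOpens(int num_opens, const std::string& filename) {
  std::atomic<long> occurrences{0};
  int warnings = 0;
  for (int i = 0; i < num_opens; ++i) {
    if (ShouldWarn(occurrences, kWarningFrequency)) {
      std::cerr << filename << " uses deprecated Parquet BIT_PACKED encoding. "
                << "Warning every " << kWarningFrequency << " occurrences.\n";
      ++warnings;
    }
  }
  return warnings;
}
```

Under these assumptions, SimulateColumnOpens(250, "sample.parquet") writes
three warnings (for opens 1, 101 and 201). Because the counter is per call
site rather than per file, a single file with many BIT_PACKED columns
advances it once per column, which matches the per-column behavior described
above.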

Testing:
Manually tested by running a query repeatedly against a
BIT_PACKED sample file (which I created for my IMPALA-4177
patch). Ran "tail -f logs/cluster/impalad.WARNING" and checked
that the warning was logged periodically.

Change-Id: I02dd4009089a264b28376492b1b40361d767d5d9
Reviewed-on: http://gerrit.cloudera.org:8080/8370
Reviewed-by: Lars Volker <lv@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/c87ad363
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/c87ad363
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/c87ad363

Branch: refs/heads/master
Commit: c87ad3631a4f3f1854759937ae0f8de63cb6e5dc
Parents: 1640aa9
Author: Tim Armstrong <tarmstrong@cloudera.com>
Authored: Tue Oct 24 10:31:24 2017 -0700
Committer: Impala Public Jenkins <impala-public-jenkins@gerrit.cloudera.org>
Committed: Tue Oct 24 22:11:39 2017 +0000

----------------------------------------------------------------------
 be/src/exec/parquet-column-readers.cc | 7 +++++++
 1 file changed, 7 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/c87ad363/be/src/exec/parquet-column-readers.cc
----------------------------------------------------------------------
diff --git a/be/src/exec/parquet-column-readers.cc b/be/src/exec/parquet-column-readers.cc
index ad12916..6d211a6 100644
--- a/be/src/exec/parquet-column-readers.cc
+++ b/be/src/exec/parquet-column-readers.cc
@@ -46,6 +46,9 @@ DEFINE_bool(convert_legacy_hive_parquet_utc_timestamps, false,
     "When true, TIMESTAMPs read from files written by Parquet-MR (used by Hive) will "
     "be converted from UTC to local time. Writes are unaffected.");
 
+// Throttle deprecation warnings to - only print warning with this frequency.
+static const int BITPACKED_DEPRECATION_WARNING_FREQUENCY = 100;
+
 // Max data page header size in bytes. This is an estimate and only needs to be an upper
 // bound. It is theoretically possible to have a page header of any size due to string
 // value statistics, but in practice we'll have trouble reading string values this large.
@@ -100,6 +103,10 @@ Status ParquetLevelDecoder::Init(const string& filename,
     case parquet::Encoding::BIT_PACKED:
       num_bytes = BitUtil::Ceil(num_buffered_values, 8);
       bit_reader_.Reset(*data, num_bytes);
+      LOG_EVERY_N(WARNING, BITPACKED_DEPRECATION_WARNING_FREQUENCY)
+          << filename << " uses deprecated Parquet BIT_PACKED encoding for rep or def "
+          << "levels. This will be removed in the future - see IMPALA-6077. Warning "
+          << "every " << BITPACKED_DEPRECATION_WARNING_FREQUENCY << " occurrences.";
       break;
     default: {
       stringstream ss;

