impala-commits mailing list archives

From jruss...@apache.org
Subject incubator-impala git commit: IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown
Date Fri, 06 Oct 2017 04:45:35 GMT
Repository: incubator-impala
Updated Branches:
  refs/heads/master ec957456d -> 1e581a66d


IMPALA-3316: [DOCS] Add known issue for timezone conversion slowdown

Change-Id: I9933ced07e339d589f7f74173cfebe938084e65c
Reviewed-on: http://gerrit.cloudera.org:8080/8165
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Reviewed-by: Alex Behm <alex.behm@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/1e581a66
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/1e581a66
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/1e581a66

Branch: refs/heads/master
Commit: 1e581a66dddae5b400e50e440063d16de868bb63
Parents: ec95745
Author: John Russell <jrussell@cloudera.com>
Authored: Thu Sep 28 10:36:39 2017 -0700
Committer: Impala Public Jenkins <impala-public-jenkins@gerrit.cloudera.org>
Committed: Fri Oct 6 04:42:15 2017 +0000

----------------------------------------------------------------------
 docs/topics/impala_known_issues.xml | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/1e581a66/docs/topics/impala_known_issues.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_known_issues.xml b/docs/topics/impala_known_issues.xml
index 14ff4e3..28196f5 100644
--- a/docs/topics/impala_known_issues.xml
+++ b/docs/topics/impala_known_issues.xml
@@ -305,6 +305,32 @@ https://issues.apache.org/jira/browse/IMPALA-2144 - Don't have
 
     </conbody>
 
+    <concept id="IMPALA-3316">
+      <title>Slow queries for Parquet tables with convert_legacy_hive_parquet_utc_timestamps=true</title>
+      <conbody>
+        <p>
+          The configuration setting <codeph>convert_legacy_hive_parquet_utc_timestamps=true</codeph>
+          uses an underlying function that can be a bottleneck on high volume, highly concurrent
+          queries due to the use of a global lock while loading time zone information. This bottleneck
+          can cause slowness when querying Parquet tables, up to 30x for scan-heavy queries. The amount
+          of slowdown depends on factors such as the number of cores and number of threads involved in the query.
+        </p>
+        <note>
+          <p>
+            The slowdown only occurs when accessing <codeph>TIMESTAMP</codeph> columns within Parquet files that
+            were generated by Hive, and therefore require the on-the-fly timezone conversion processing.
+          </p>
+        </note>
+        <p><b>Bug:</b> <xref keyref="IMPALA-3316">IMPALA-3316</xref></p>
+        <p><b>Severity:</b> High</p>
+        <p><b>Workaround:</b> If the <codeph>TIMESTAMP</codeph> values stored in the table represent dates only,
+          with no time portion, consider storing them as strings in <codeph>yyyy-MM-dd</codeph> format.
+          Impala implicitly converts such string values to <codeph>TIMESTAMP</codeph> in calls to date/time
+          functions.
+        </p>
+      </conbody>
+    </concept>
+
     <concept id="IMPALA-1480" rev="IMPALA-1480">
 
 <!-- Not part of Alex's spreadsheet. Spreadsheet has IMPALA-1423 which mentions it's similar to this one but not a duplicate. -->
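
The workaround described in the added known-issue text depends on the stored values being dates only, so that a yyyy-MM-dd string carries the same information as the original TIMESTAMP. The following is a minimal Python sketch of that precondition and the string round trip. It is not Impala code and is not part of this commit. The helper names (is_date_only, to_date_string, from_date_string) are hypothetical and exist only for illustration.

```python
from datetime import datetime


def is_date_only(ts: datetime) -> bool:
    """Return True when the timestamp has no time-of-day portion."""
    return (ts.hour, ts.minute, ts.second, ts.microsecond) == (0, 0, 0, 0)


def to_date_string(ts: datetime) -> str:
    """Render a date-only timestamp as a yyyy-MM-dd string.

    Raises ValueError when the value has a time portion, because the
    string form would silently drop it.
    """
    if not is_date_only(ts):
        raise ValueError(f"{ts!r} has a time portion; yyyy-MM-dd would lose it")
    # date.isoformat() always produces the zero-padded YYYY-MM-DD form.
    return ts.date().isoformat()


def from_date_string(value: str) -> datetime:
    """Parse a yyyy-MM-dd string back to a timestamp at midnight."""
    return datetime.strptime(value, "%Y-%m-%d")


if __name__ == "__main__":
    original = datetime(2017, 10, 6)
    as_string = to_date_string(original)
    print(as_string)
    print(from_date_string(as_string) == original)
```

A check like is_date_only could be run over a sample of existing values before choosing the string representation. Any value with a time portion means the workaround does not apply to that column as is.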

