impala-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From sail...@apache.org
Subject [3/3] incubator-impala git commit: IMPALA-4939, IMPALA-4940: Decimal V2 multiplication
Date Wed, 04 Oct 2017 17:10:42 GMT
IMPALA-4939, IMPALA-4940: Decimal V2 multiplication

Implement the new DECIMAL return type rules for multiply expressions,
active when query option DECIMAL_V2=1. The algorithm for determining
the type of the result of multiplication is described in the JIRA.

DECIMAL V1:

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,38)                                                        |
+-----------------------------------------------------------------------+

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,30)                                                        |
+-----------------------------------------------------------------------+

DECIMAL V2:

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,38)) * cast('0.1' as decimal(38,38))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,37)                                                        |
+-----------------------------------------------------------------------+

+-----------------------------------------------------------------------+
| typeof(cast('0.1' as decimal(38,15)) * cast('0.1' as decimal(38,15))) |
+-----------------------------------------------------------------------+
| DECIMAL(38,6)                                                         |
+-----------------------------------------------------------------------+

In this patch, we also fix the early multiplication overflow. We compute
a 256 bit integer intermediate value, which we then attempt to scale down
and round.

Performance:

I ran TPCH 300 and TPCDS 1000 workloads and the performance is almost
identical. For TPCH Q1, there was an improvement from 21 seconds to 16
seconds. I did not see any regressions.

The performance improvement is due to the way we check for overflows
after this patch (by counting the leading zeros instead of dividing).
It can be clealy seen in this query:
  select cast(2.2 as decimal(38, 1)) * cast(2.2 as decimal(38, 1))
  before: 7.85s
  after:  2.03s

I noticed performance regressions in the following cases:
- When we need to convert to a 256 bit integer before multiplying,
  which was introduced in this patch. Whether this happens depends on
  the resulting precision and the value of the inputs. In the following
  extreme case, the intermediate value is converted to a 256 bit integer
  every time.

  select cast(1.1 as decimal(38, 37)) * cast(1.1 as decimal(38, 37))
  before: 14.56s (returns null)
  after:  126.17s

- When we need to scale down the intermediate value. In the following
  query the result is decimal(38,6) after the patch, so the
  intermediate needs to be scaled down.

  select cast(2.2 as decimal(38,1)) * cast(2.2 as decimal(38,19))
  before: 7.25s
  after:  13.06s

These regressions are possible only when the resulting precision is 38
which is not common in typical workloads.

Note: The actual queries that I ran for the benchmark are not exactly as
  above. I constructed tables with millions of rows with those values. I
  ran the queries with DECIMAL_v2=1 option before and after the patch.

Change-Id: I37ad6232d7953bd75c18dc86e665b2b501a1ebe1
Reviewed-on: http://gerrit.cloudera.org:8080/7438
Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Tested-by: Impala Public Jenkins


Project: http://git-wip-us.apache.org/repos/asf/incubator-impala/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-impala/commit/62596410
Tree: http://git-wip-us.apache.org/repos/asf/incubator-impala/tree/62596410
Diff: http://git-wip-us.apache.org/repos/asf/incubator-impala/diff/62596410

Branch: refs/heads/master
Commit: 6259641077250dcd3360e2e6c2bf9023b201d858
Parents: ab0bc9e
Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Authored: Thu Jul 6 10:43:34 2017 -0700
Committer: Impala Public Jenkins <impala-public-jenkins@gerrit.cloudera.org>
Committed: Wed Oct 4 09:37:36 2017 +0000

----------------------------------------------------------------------
 be/src/exprs/expr-test.cc                       | 249 ++++++++++++++++++-
 be/src/runtime/decimal-value.inline.h           | 113 +++++----
 be/src/util/bit-util.h                          |  18 +-
 .../org/apache/impala/analysis/TypesUtil.java   |   7 +-
 .../impala/analysis/AnalyzeExprsTest.java       |   6 +-
 5 files changed, 332 insertions(+), 61 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/62596410/be/src/exprs/expr-test.cc
----------------------------------------------------------------------
diff --git a/be/src/exprs/expr-test.cc b/be/src/exprs/expr-test.cc
index 2065c80..bf7808d 100644
--- a/be/src/exprs/expr-test.cc
+++ b/be/src/exprs/expr-test.cc
@@ -1500,6 +1500,59 @@ struct DecimalTestCase {
   DecimalExpectedResult expected[2];
 };
 
+// Utility function to construct int128 types.
+int128_t StringToInt128(string s) {
+  int128_t result = 0;
+  int sign = 1;
+  int digit = 0;
+  for (int i = 0; i < s.length(); ++i) {
+    switch (s[i]) {
+      case '-':
+        digit = -1;
+        break;
+      case '0':
+        digit = 0;
+        break;
+      case '1':
+        digit = 1;
+        break;
+      case '2':
+        digit = 2;
+        break;
+      case '3':
+        digit = 3;
+        break;
+      case '4':
+        digit = 4;
+        break;
+      case '5':
+        digit = 5;
+        break;
+      case '6':
+        digit = 6;
+        break;
+      case '7':
+        digit = 7;
+        break;
+      case '8':
+        digit = 8;
+        break;
+      case '9':
+        digit = 9;
+        break;
+      default:
+        DCHECK(false) << "Unexpected character.";
+    }
+    if (digit == -1) {
+      DCHECK_EQ(sign, 1) << "Minus symbol appears multiple times.";
+      sign = -1;
+    } else {
+      result = result * 10 + digit;
+    }
+  }
+  return sign * result;
+}
+
 // Format is:
 // { Test Expression (as a string),
 //  { expected null, scaled_val, precision, scale for V1
@@ -1512,8 +1565,198 @@ DecimalTestCase decimal_cases[] = {
   { "cast(1.23 as decimal(30,2)) - cast(1 as decimal(4,3))", {{ false, 230, 32, 3 }}},
   { "cast(1 as decimal(38,0)) + cast(.2 as decimal(38,1))", {{ false, 12, 38, 1 }}},
   // Test multiply operator
-  { "cast(1.23 as decimal(30,2)) * cast(1 as decimal(10,3))", {{ false, 123000, 38, 5 }}},
-  { "1.23 * cast(1 as decimal(20,3))", {{ false, 123000, 23, 5 }}},
+  { "cast(1.23 as decimal(30,2)) * cast(1 as decimal(10,3))",
+    {{ false, 123000, 38, 5 }}},
+  { "cast(1.23 as decimal(3,2)) * cast(1 as decimal(20,3))",
+    {{ false, 123000, 23, 5 },
+     { false, 123000, 24, 5 }}},
+  { "cast(0.1 as decimal(20,20)) * cast(1 as decimal(20,19))",
+    {{ false, StringToInt128("10000000000000000000000000000000000000"), 38, 38 },
+     { false, StringToInt128("100000000000000000000000000000000000"), 38, 36 }}},
+  { "cast(111.22 as decimal(5,2)) * cast(3333.444 as decimal(7,3))",
+    {{ false, 37074564168, 12, 5 },
+     { false, 37074564168, 13, 5 }}},
+  { "cast(0.01 as decimal(38,38)) * cast(25 as decimal(38,0))",
+    {{ false, StringToInt128("25000000000000000000000000000000000000"), 38, 38 },
+     { false, 250000, 38, 6 }}},
+  { "cast(-0.01 as decimal(38,38)) * cast(25 as decimal(38,0))",
+    {{ false, StringToInt128("-25000000000000000000000000000000000000"), 38, 38 },
+     { false, -250000, 38, 6 }}},
+  { "cast(-0.01 as decimal(38,38)) * cast(-25 as decimal(38,0))",
+    {{ false, StringToInt128("25000000000000000000000000000000000000"), 38, 38 },
+     { false, 250000, 38, 6 }}},
+  { "cast(0.1 as decimal(38,38)) * cast(25 as decimal(38,0))",
+    {{ true, 0, 38, 38 },
+     { false, 2500000, 38, 6 }}},
+  { "cast(-0.1 as decimal(38,38)) * cast(25 as decimal(38,0))",
+    {{ true, 0, 38, 38 },
+     { false, -2500000, 38, 6 }}},
+  { "cast(-0.1 as decimal(38,38)) * cast(-25 as decimal(38,0))",
+    {{ true, 0, 38, 38 },
+     { false, 2500000, 38, 6 }}},
+  { "cast(9999999999999999.9999 as decimal(20,4)) * "
+      "cast(9999999999999999.994 as decimal(19,3))",
+    {{ true, 0, 38, 7 },
+     { false, StringToInt128("99999999999999999939000000000000000001"), 38, 6 }}},
+  { "cast(9.99999 as decimal(6,5)) * "
+      "cast(9999999999999999999999999999999.94 as decimal(33,2))",
+    {{ true, 0, 38, 7 },
+     { false, StringToInt128("99999899999999999999999999999999400001"), 38, 6 }}},
+  { "cast(0 as decimal(38,38)) * cast(1 as decimal(5,2))",
+    {{ false, 0, 38, 38 },
+     { false, 0, 38, 34 }}},
+  { "cast(12345.67 as decimal(7,2)) * cast(12345.67 as decimal(7,2))",
+    {{ false, 1524155677489, 14, 4 },
+     { false, 1524155677489, 15, 4 }}},
+  { "cast(2643918831543678.5772617359442611897419 as decimal(38,22)) * "
+      "cast(3972211379387512.6946776728996748717839 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("10502204468834736291038133384309593605"), 38, 6 }}},
+  { "cast(-2643918831543678.5772617359442611897419 as decimal(38,22)) * "
+      "cast(3972211379387512.6946776728996748717839 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("-10502204468834736291038133384309593605"), 38, 6 }}},
+  { "cast(-2643918831543678.5772617359442611897419 as decimal(38,22)) * "
+      "cast(-3972211379387512.6946776728996748717839 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("10502204468834736291038133384309593605"), 38, 6 }}},
+  { "cast(2545664579818579.6268123468146829994472 as decimal(38,22)) * "
+      "cast(8862165565622381.6689679519457799681439 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("22560100980892785285767018366785939200"), 38, 6 }}},
+  { "cast(2575543652687181.7412422395638291836214 as decimal(38,22)) * "
+      "cast(9142529549684737.3986798331295277312253 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("23546983951195523383767823614338114083"), 38, 6 }}},
+  { "cast(6188696551164477.9944646378584981371524 as decimal(38,22)) * "
+      "cast(9234914917975734.1781526147879384775182 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("57152086103173814294786121078946977968"), 38, 6 }}},
+  { "cast(-6188696551164477.9944646378584981371524 as decimal(38,22)) * "
+      "cast(9234914917975734.1781526147879384775182 as decimal(38,22))",
+    {{ true, 0, 38, 38 },
+     { false, StringToInt128("-57152086103173814294786121078946977968"), 38, 6 }}},
+  { "cast(1.006 as decimal(4,2)) * cast(1.1 as decimal(4,2))",
+    {{ false, 11000, 8, 4 },
+     { false, 11110, 9, 4 }}},
+  { "cast(0.9994 as decimal(38,4)) * cast(0.999 as decimal(38,3))",
+    {{ false, 9984006, 38, 7 },
+     { false, 998401, 38, 6 }}},
+  { "cast(0.000000000000000000000000000000000001 as decimal(38,38)) * "
+      "cast(0.000000000000000000000000000000000001 as decimal(38,38))",
+    {{ false, 0, 38, 38 },
+     { false, 0, 38, 37 }}},
+  { "cast(0.00000000000000000000000000000000001 as decimal(38,37)) * "
+      "cast(0.00000000000000000000000000000000001 as decimal(38,37))",
+    {{ false, 0, 38, 38 },
+     { false, 0, 38, 35 }}},
+  { "cast(0.00000000000000001 as decimal(38,37)) * "
+      "cast(0.000000000000000001 as decimal(38,37))",
+    {{ false, 1000, 38, 38 },
+     { false, 1, 38, 35 }}},
+  { "cast(9e-18 as decimal(38,38)) * cast(1e-21 as decimal(38,38))",
+    {{ false, 0, 38, 38 },
+     { false, 0, 38, 37 }}},
+  { "cast(1e-37 as decimal(38,38)) * cast(0.1 as decimal(38,38))",
+    {{ false, 1, 38, 38 },
+     { false, 0, 38, 37 }}},
+  { "cast(1e-37 as decimal(38,38)) * cast(-0.1 as decimal(38,38))",
+    {{ false, -1, 38, 38 },
+     { false, 0, 38, 37 }}},
+  { "cast(9e-36 as decimal(38,38)) * cast(0.1 as decimal(38,38))",
+    {{ false, 90, 38, 38 },
+     { false, 9, 38, 37 }}},
+  { "cast(9e-37 as decimal(38,38)) * cast(0.1 as decimal(38,38))",
+    {{ false, 9, 38, 38 },
+     { false, 1, 38, 37 }}},
+  // We are multiplying (2^64 - 1) * (2^63 - 1), which produces the largest intermediate
+  // result that does not require int256. It overflows because the result is larger than
+  // MAX_UNSCALED_DECIMAL16.
+  { "cast(18446744073709551615 as decimal(38,0)) * cast(9223372036854775807 as decimal(38,0))",
+    {{ true, 0, 38, 0 }}},
+  { "cast(-18446744073709551615 as decimal(38,0)) * cast(9223372036854775807 as decimal(38,0))",
+    {{ true, 0, 38, 0 }}},
+  // int256 is required. We are multiplying (2^64 - 1) * (2^63) here.
+  { "cast(18446744073709551615 as decimal(38,0)) * cast(9223372036854775808 as decimal(38,0))",
+    {{ true, 0, 38, 0 }}},
+  { "cast(-18446744073709551615 as decimal(38,0)) * cast(9223372036854775808 as decimal(38,0))",
+    {{ true, 0, 38, 0 }}},
+  // Largest intermediate result that does not require int256.
+  { "cast(1844674407370.9551615 as decimal(38,7)) * cast(9223372036854775807 as decimal(38,0))",
+    {{ true, 0, 38, 7 },
+     { false, StringToInt128("17014118346046923170401718760531977831"), 38, 6 }}},
+  { "cast(-1844674407370.9551615 as decimal(38,7)) * cast(9223372036854775807 as decimal(38,0))",
+    {{ true, 0, 38, 7 },
+     { false, StringToInt128("-17014118346046923170401718760531977831"), 38, 6 }}},
+  // int256 is required.
+  { "cast(1844674407370.9551615 as decimal(38,7)) * cast(9223372036854775808 as decimal(38,0))",
+    {{ true, 0, 38, 7 },
+     { false, StringToInt128("17014118346046923172246393167902932992"), 38, 6 }}},
+  { "cast(-1844674407370.9551615 as decimal(38,7)) * cast(9223372036854775808 as decimal(38,0))",
+    {{ true, 0, 38, 7 },
+     { false, StringToInt128("-17014118346046923172246393167902932992"), 38, 6 }}},
+  // Smallest intermediate value that requires double checking if int256 is required.
+  { "cast(9223372036854775808 as decimal(38,0)) * cast(9223372036854775808 as decimal(38,0))",
+    {{ false, StringToInt128("85070591730234615865843651857942052864"), 38, 0 }}},
+  { "cast(-9223372036854775808 as decimal(38,0)) * cast(9223372036854775808 as decimal(38,0))",
+    {{ false, StringToInt128("-85070591730234615865843651857942052864"), 38, 0 }}},
+  // Scale down the intermediate value by 10^39.
+  { "cast(0.33333333333333333333333333333333333333 as decimal(38,38)) * "
+      "cast(0.00000000000000000000000000000000000003 as decimal(38,38))",
+    {{ false, 0, 38, 38 },
+     { false, 0, 38, 37 }}},
+  { "cast(0.33333333333333333333333333333333333333 as decimal(38,38)) * "
+      "cast(0.3 as decimal(38,38))",
+    {{ false, StringToInt128("9999999999999999999999999999999999999"), 38, 38 },
+     { false, StringToInt128("1000000000000000000000000000000000000"), 38, 37 }}},
+  { "cast(10000000000000000000 as decimal(21,0)) * "
+      "cast(10000000000000000000 as decimal(21,0))",
+    {{ true, 0, 38, 0 }}},
+  { "cast(1000000000000000.0000 as decimal(38,4)) * "
+      "cast(1000000000000000.0000 as decimal(38,4))",
+    {{ true, 0, 38, 8 },
+     { false, StringToInt128("1000000000000000000000000000000000000"), 38, 6 }}},
+  { "cast(-1000000000000000.0000 as decimal(38,4)) * "
+      "cast(1000000000000000.0000 as decimal(38,4))",
+    {{ true, 0, 38, 8 },
+     { false, StringToInt128("-1000000000000000000000000000000000000"), 38, 6 }}},
+  // Smallest 256 bit intermediate result that can't be scaled down to 128 bits.
+  { "cast(10000000000000000.0000 as decimal(38,4)) * "
+      "cast(10000000000000000.0000 as decimal(38,4))",
+    {{ true, 0, 38, 8 },
+     { true, 0, 38, 6 }}},
+  { "cast(-10000000000000000.0000 as decimal(38,4)) * "
+      "cast(10000000000000000.0000 as decimal(38,4))",
+    {{ true, 0, 38, 8 },
+     { true, 0, 38, 6 }}},
+  // The reason why the result of (38,38) * (38,38) is (38,37).
+  { "cast(0.99999999999999999999999999999999999999 as decimal(38,38)) * "
+      "cast(0.99999999999999999999999999999999999999 as decimal(38,38))",
+    {{ false, StringToInt128("99999999999999999999999999999999999998"), 38, 38 },
+     { false, StringToInt128("10000000000000000000000000000000000000"), 38, 37 }}},
+  { "cast(-0.99999999999999999999999999999999999999 as decimal(38,38)) * "
+      "cast(0.99999999999999999999999999999999999999 as decimal(38,38))",
+    {{ false, StringToInt128("-99999999999999999999999999999999999998"), 38, 38 },
+     { false, StringToInt128("-10000000000000000000000000000000000000"), 38, 37 }}},
+  { "cast(99999999999999999999999999999999999999 as decimal(38,0)) * "
+      "cast(1 as decimal(38,0))",
+    {{ false, StringToInt128("99999999999999999999999999999999999999"), 38, 0 }}},
+  { "cast(99999999999999999999999999999999999999 as decimal(38,0)) * "
+      "cast(-1 as decimal(38,0))",
+    {{ false, StringToInt128("-99999999999999999999999999999999999999"), 38, 0 }}},
+  // Rounding.
+  { "cast(0.000005 as decimal(38,6)) * cast(0.1 as decimal(38,1))",
+    {{ false, 5, 38, 7 },
+     { false, 1, 38, 6 }}},
+  { "cast(-0.000005 as decimal(38,6)) * cast(0.1 as decimal(38,1))",
+    {{ false, -5, 38, 7 },
+     { false, -1, 38, 6 }}},
+  { "cast(0.000004 as decimal(38,6)) * cast(0.1 as decimal(38,1))",
+    {{ false, 4, 38, 7 },
+     { false, 0, 38, 6 }}},
+  { "cast(-0.000004 as decimal(38,6)) * cast(0.1 as decimal(38,1))",
+    {{ false, -4, 38, 7 },
+     { false, 0, 38, 6 }}},
   // Test divide operator
   { "cast(1.23 as decimal(8,2)) / cast(1 as decimal(4,3))", {{ false, 12300000, 16, 7}}},
   { "cast(1.23 as decimal(30,2)) / cast(1 as decimal(20,3))",
@@ -2290,7 +2533,7 @@ TEST_F(ExprTest, DecimalV2MixedArithmeticExprs) {
   TestDecimalValue(
       "10.0 - 3", Decimal4Value(70), ColumnType::CreateDecimalType(5, 1));
   TestDecimalValue(
-      "10.0 * 3", Decimal4Value(300), ColumnType::CreateDecimalType(6, 1));
+      "10.0 * 3", Decimal4Value(300), ColumnType::CreateDecimalType(7, 1));
   TestDecimalValue(
       "10.0 / 3", Decimal4Value(3333333), ColumnType::CreateDecimalType(8, 6));
   // Comparisons between DECIMAL values are precise.

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/62596410/be/src/runtime/decimal-value.inline.h
----------------------------------------------------------------------
diff --git a/be/src/runtime/decimal-value.inline.h b/be/src/runtime/decimal-value.inline.h
index f06ba76..20a7b66 100644
--- a/be/src/runtime/decimal-value.inline.h
+++ b/be/src/runtime/decimal-value.inline.h
@@ -27,6 +27,7 @@
 #include <sstream>
 
 #include "common/logging.h"
+#include "util/bit-util.h"
 #include "util/decimal-util.h"
 #include "util/hash-util.h"
 
@@ -204,7 +205,7 @@ inline RESULT_T ScaleDownAndRound(RESULT_T value, int delta_scale, bool
round) {
     RESULT_T remainder = value % multiplier;
     // In general, shifting down the multiplier is not safe, but we know
     // here that it is a multiple of two.
-    if (abs(remainder) > (multiplier >> 1)) {
+    if (abs(remainder) >= (multiplier >> 1)) {
       // Bias at zero must be corrected by sign of dividend.
       result += BitUtil::Sign(value);
     }
@@ -213,44 +214,6 @@ inline RESULT_T ScaleDownAndRound(RESULT_T value, int delta_scale, bool
round) {
 }
 }
 
-// Use __builtin_mul_overflow on GCC if available.
-// Avoid using on Clang: it requires a function __muloti present in the Clang runtime
-// library but not the GCC runtime library and regresses performance.
-#if 5 <= __GNUC__
-template<typename T>
-template<typename RESULT_T>
-inline DecimalValue<RESULT_T> DecimalValue<T>::Multiply(int this_scale,
-    const DecimalValue& other, int other_scale, int result_precision, int result_scale,
-    bool round, bool* overflow) const {
-  // In the non-overflow case, we don't need to adjust by the scale since
-  // that is already handled by the FE when it computes the result decimal type.
-  // e.g. 1.23 * .2 (scale 2, scale 1 respectively) is identical to:
-  // 123 * 2 with a resulting scale 3. We can do the multiply on the unscaled values.
-  // The result scale in this case is the sum of the input scales.
-  RESULT_T x = value();
-  RESULT_T y = other.value();
-  RESULT_T result = 0;
-  if (result_precision == ColumnType::MAX_PRECISION) {
-    DCHECK_EQ(sizeof(RESULT_T), 16);
-    *overflow |= __builtin_mul_overflow(x, y, &result);
-  } else {
-    result = x * y;
-  }
-  int delta_scale = this_scale + other_scale - result_scale;
-  if (UNLIKELY(delta_scale != 0)) {
-    // In this case, the required resulting scale is larger than the max we support.
-    // We cap the resulting scale to the max supported scale (e.g. truncate) in the FE.
-    // TODO: we could also return NULL.
-    result = detail::ScaleDownAndRound<T, RESULT_T>(result, delta_scale, round);
-
-    // Check overflow again after rounding since +/-1 could cause decimal overflow
-    if (round && result_precision == ColumnType::MAX_PRECISION) {
-      *overflow |= abs(result) > DecimalUtil::MAX_UNSCALED_DECIMAL16;
-    }
-  }
-  return DecimalValue<RESULT_T>(result);
-}
-#else
 template<typename T>
 template<typename RESULT_T>
 DecimalValue<RESULT_T> DecimalValue<T>::Multiply(int this_scale,
@@ -267,26 +230,72 @@ DecimalValue<RESULT_T> DecimalValue<T>::Multiply(int this_scale,
     // Handle zero to avoid divide by zero in the overflow check below.
     return DecimalValue<RESULT_T>(0);
   }
+  RESULT_T result = 0;
+  bool needs_int256 = false;
+  int delta_scale = this_scale + other_scale - result_scale;
   if (result_precision == ColumnType::MAX_PRECISION) {
     DCHECK_EQ(sizeof(RESULT_T), 16);
-    // Check overflow
-    *overflow |= DecimalUtil::MAX_UNSCALED_DECIMAL16 / abs(y) < abs(x);
+    int total_leading_zeros = BitUtil::CountLeadingZeros(abs(x)) +
+        BitUtil::CountLeadingZeros(abs(y));
+    // This check is quick, but conservative. In some cases it will indicate that
+    // converting to 256 bits is necessary, when it's not actually the case.
+    needs_int256 = total_leading_zeros <= 128;
+    if (UNLIKELY(needs_int256 && delta_scale == 0)) {
+      if (LIKELY(abs(x) > DecimalUtil::MAX_UNSCALED_DECIMAL16 / abs(y))) {
+        // If the intermediate value does not fit into 128 bits, we indicate overflow
+        // because the final value would also not fit into 128 bits since delta_scale is
+        // zero.
+        *overflow = true;
+      } else {
+        // We've verified that the intermediate (and final) value will fit into 128 bits.
+        needs_int256 = false;
+      }
+    }
   }
-  RESULT_T result = x * y;
-  int delta_scale = this_scale + other_scale - result_scale;
-  if (UNLIKELY(delta_scale != 0)) {
-    // In this case, the required resulting scale is larger than the max we support.
-    // We cap the resulting scale to the max supported scale (e.g. truncate) in the FE.
-    // TODO: we could also return NULL.
-    result = detail::ScaleDownAndRound<T, RESULT_T>(result, delta_scale, round);
-    // Check overflow again after rounding since +/-1 could cause decimal overflow
-    if (result_precision == ColumnType::MAX_PRECISION) {
-      *overflow |= abs(result) > DecimalUtil::MAX_UNSCALED_DECIMAL16;
+  if (UNLIKELY(needs_int256)) {
+    if (delta_scale == 0) {
+      DCHECK(*overflow);
+    } else {
+      int256_t intermediate_result = ConvertToInt256(x) * ConvertToInt256(y);
+      intermediate_result = detail::ScaleDownAndRound<int256_t, int256_t>(
+          intermediate_result, delta_scale, round);
+      result = ConvertToInt128(
+          intermediate_result, DecimalUtil::MAX_UNSCALED_DECIMAL16, overflow);
+    }
+  } else {
+    if (delta_scale == 0) {
+      result = x * y;
+      if (UNLIKELY(result_precision == ColumnType::MAX_PRECISION &&
+          abs(result) > DecimalUtil::MAX_UNSCALED_DECIMAL16)) {
+        // An overflow is possible here, if, for example, x = (2^64 - 1) and
+        // y = (2^63 - 1).
+        *overflow = true;
+      }
+    } else if (LIKELY(delta_scale <= 38)) {
+      result = x * y;
+      // The largest value that result can have here is (2^64 - 1) * (2^63 - 1), which is
+      // greater than MAX_UNSCALED_DECIMAL16.
+      result = detail::ScaleDownAndRound<T, RESULT_T>(result, delta_scale, round);
+      // Since delta_scale is greater than zero, result can now be at most
+      // ((2^64 - 1) * (2^63 - 1)) / 10, which is less than MAX_UNSCALED_DECIMAL16, so
+      // there is no need to check for overflow.
+    } else {
+      // We are multiplying decimal(38, 38) by decimal(38, 38). The result should be a
+      // decimal(38, 37), so delta scale = 38 + 38 - 37 = 39. Since we are not in the
+      // 256 bit intermediate value case and we are scaling down by 39, then we are
+      // guaranteed that the result is 0 (even if we try to round). The largest possible
+      // intermediate result is 38 "9"s. If we scale down by 39, the leftmost 9 is now
+      // two digits to the right of the rightmost "visible" one. The reason why we have
+      // to handle this case separately is because a scale multiplier with a delta_scale
+      // 39 does not fit into int128.
+      DCHECK_EQ(delta_scale, 39);
+      DCHECK(round);
+      result = 0;
     }
   }
+  DCHECK(*overflow || abs(result) <= DecimalUtil::MAX_UNSCALED_DECIMAL16);
   return DecimalValue<RESULT_T>(result);
 }
-#endif
 
 template<typename T>
 template<typename RESULT_T>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/62596410/be/src/util/bit-util.h
----------------------------------------------------------------------
diff --git a/be/src/util/bit-util.h b/be/src/util/bit-util.h
index 581decc..e82f4fc 100644
--- a/be/src/util/bit-util.h
+++ b/be/src/util/bit-util.h
@@ -33,6 +33,7 @@
 
 #include "common/compiler-util.h"
 #include "gutil/bits.h"
+#include "runtime/multi-precision.h"
 #include "util/cpu-info.h"
 #include "util/sse-util.h"
 
@@ -61,7 +62,7 @@ class BitUtil {
   /// Return an integer signifying the sign of the value, returning +1 for
   /// positive integers (and zero), -1 for negative integers.
   /// The extra shift is to silence GCC warnings about full width shift on
-  /// unsigned types.  It compiles out in optimized builds into the expected increment
+  /// unsigned types. It compiles out in optimized builds into the expected increment.
   template<typename T>
   constexpr static inline T Sign(T value) {
     return 1 | ((value >> (UnsignedWidth<T>() - 1)) >> 1);
@@ -286,6 +287,16 @@ class BitUtil {
     return __builtin_ctzll(v);
   }
 
+  static inline int CountLeadingZeros(unsigned __int128 v) {
+    if (UNLIKELY(v == 0)) return 128;
+    unsigned __int128 shifted = v >> 64;
+    if (shifted != 0) {
+      return __builtin_clzll(shifted);
+    } else {
+      return __builtin_clzll(v) + 64;
+    }
+  }
+
   // Wrap the gutil/ version for convenience.
   static inline int Log2Floor(uint32_t n) {
     return Bits::Log2Floor(n);
@@ -336,6 +347,11 @@ class BitUtil {
   }
 };
 
+template<>
+inline int256_t BitUtil::Sign(int256_t value) {
+  return value < 0 ? -1 : 1;
+}
+
 /// An encapsulation class of SIMD byteswap functions
 class SimdByteSwap {
  public:

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/62596410/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
----------------------------------------------------------------------
diff --git a/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java b/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
index b29fc05..1bd34ef 100644
--- a/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
+++ b/fe/src/main/java/org/apache/impala/analysis/TypesUtil.java
@@ -225,7 +225,7 @@ public class TypesUtil {
    * http://blogs.msdn.com/b/sqlprogrammability/archive/2006/03/29/564110.aspx
    * https://msdn.microsoft.com/en-us/library/ms190476.aspx
    *
-   * TODO: implement V2 rules for ADD/SUB/MULTIPLY.
+   * TODO: implement V2 rules for ADD/SUB.
    *
    * Changes:
    *  - There are slight difference with how precision/scale reduction occurs compared
@@ -255,9 +255,12 @@ public class TypesUtil {
         resultScale = Math.max(s1, s2);
         resultPrecision = Math.min(p1 - s1, p2 - s2) + resultScale;
         break;
+      case MULTIPLY:
+        resultScale = s1 + s2;
+        resultPrecision = p1 + p2 + 1;
+        break;
       case ADD:
       case SUBTRACT:
-      case MULTIPLY:
       default:
         // Not yet implemented - fall back to V1 rules.
         return getDecimalArithmeticResultTypeV1(t1, t2, op);

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/62596410/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
----------------------------------------------------------------------
diff --git a/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java b/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
index 8891fee..724d437 100644
--- a/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
+++ b/fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
@@ -1442,7 +1442,7 @@ public class AnalyzeExprsTest extends AnalyzerTest {
     checkDecimalReturnType("select d1 - 1.1 from functional.decimal_tbl",
         ScalarType.createDecimalType(11, 1));
     checkDecimalReturnType("select d1 * 1.1 from functional.decimal_tbl",
-        ScalarType.createDecimalType(11, 1));
+        ScalarType.createDecimalType(11, 1), ScalarType.createDecimalType(12, 1));
     checkDecimalReturnType("select d1 / 1.1 from functional.decimal_tbl",
         ScalarType.createDecimalType(14, 4), ScalarType.createDecimalType(16, 6));
     checkDecimalReturnType("select d1 % 1.1 from functional.decimal_tbl",
@@ -1453,7 +1453,7 @@ public class AnalyzeExprsTest extends AnalyzerTest {
     checkDecimalReturnType("select d1 - 2 from functional.decimal_tbl",
         ScalarType.createDecimalType(10, 0));
     checkDecimalReturnType("select d1 * 2 from functional.decimal_tbl",
-        ScalarType.createDecimalType(12, 0));
+        ScalarType.createDecimalType(12, 0), ScalarType.createDecimalType(13, 0));
     checkDecimalReturnType("select d1 / 2 from functional.decimal_tbl",
         ScalarType.createDecimalType(13, 4), ScalarType.createDecimalType(15, 6));
     checkDecimalReturnType("select d1 % 2 from functional.decimal_tbl",
@@ -1465,7 +1465,7 @@ public class AnalyzeExprsTest extends AnalyzerTest {
     checkDecimalReturnType("select int_col - 1.1 from functional.alltypestiny",
         Type.DOUBLE, ScalarType.createDecimalType(12, 1));
     checkDecimalReturnType("select int_col * 1.1 from functional.alltypestiny",
-        Type.DOUBLE, ScalarType.createDecimalType(12, 1));
+        Type.DOUBLE, ScalarType.createDecimalType(13, 1));
     checkDecimalReturnType("select int_col / 1.1 from functional.alltypestiny",
         Type.DOUBLE, ScalarType.createDecimalType(17, 6));
     checkDecimalReturnType("select int_col % 1.1 from functional.alltypestiny",


Mime
View raw message