kudu-commits mailing list archives

From granthe...@apache.org
Subject kudu git commit: [doc] Document the new decimal column type
Date Tue, 13 Mar 2018 02:47:34 GMT
Repository: kudu
Updated Branches:
  refs/heads/master 4cd6338e6 -> 0a37d1f3b


[doc] Document the new decimal column type

Change-Id: I9489613d35daad708648ea04d49e472d3149b33d
Reviewed-on: http://gerrit.cloudera.org:8080/9432
Reviewed-by: Grant Henke <granthenke@gmail.com>
Tested-by: Grant Henke <granthenke@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/0a37d1f3
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/0a37d1f3
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/0a37d1f3

Branch: refs/heads/master
Commit: 0a37d1f3be2ec08a3e295b03e5907a7d878eb753
Parents: 4cd6338
Author: Grant Henke <granthenke@gmail.com>
Authored: Mon Feb 19 15:50:06 2018 -0600
Committer: Grant Henke <granthenke@gmail.com>
Committed: Tue Mar 13 02:19:46 2018 +0000

----------------------------------------------------------------------
 docs/developing.adoc              |  4 +--
 docs/known_issues.adoc            |  4 ++-
 docs/kudu_impala_integration.adoc |  3 +--
 docs/release_notes.adoc           | 13 +++++++++
 docs/schema_design.adoc           | 49 +++++++++++++++++++++++++++++++---
 5 files changed, 65 insertions(+), 8 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/0a37d1f3/docs/developing.adoc
----------------------------------------------------------------------
diff --git a/docs/developing.adoc b/docs/developing.adoc
index eb8b2c6..09ffb82 100644
--- a/docs/developing.adoc
+++ b/docs/developing.adoc
@@ -180,8 +180,8 @@ name and keytab location must be provided through the `--principal` and
 - `<>` and `OR` predicates are not pushed to Kudu, and instead will be evaluated
   by the Spark task. Only `LIKE` predicates with a suffix wildcard are pushed to
   Kudu, meaning that `LIKE "FOO%"` is pushed down but `LIKE "FOO%BAR"` isn't.
-- Kudu does not support all types supported by Spark SQL, such as `Date`,
-  `Decimal` and complex types.
+- Kudu does not support every type supported by Spark SQL. For example,
+  `Date` and complex types are not supported.
 - Kudu tables may only be registered as temporary tables in SparkSQL.
   Kudu tables may not be queried using HiveContext.
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/0a37d1f3/docs/known_issues.adoc
----------------------------------------------------------------------
diff --git a/docs/known_issues.adoc b/docs/known_issues.adoc
index 40e1c77..cd9dd05 100644
--- a/docs/known_issues.adoc
+++ b/docs/known_issues.adoc
@@ -51,10 +51,12 @@
 
 === Columns
 
-* DECIMAL, CHAR, VARCHAR, DATE, and complex types like ARRAY are not supported.
+* CHAR, VARCHAR, DATE, and complex types like ARRAY are not supported.
 
 * Type and nullability of existing columns cannot be changed by altering the table.
 
+* The precision and scale of `DECIMAL` columns cannot be changed by altering the table.
+
 * Tables can have a maximum of 300 columns.
 
 === Tables

http://git-wip-us.apache.org/repos/asf/kudu/blob/0a37d1f3/docs/kudu_impala_integration.adoc
----------------------------------------------------------------------
diff --git a/docs/kudu_impala_integration.adoc b/docs/kudu_impala_integration.adoc
index 9d2e7b0..a43b7ff 100755
--- a/docs/kudu_impala_integration.adoc
+++ b/docs/kudu_impala_integration.adoc
@@ -735,8 +735,7 @@ The examples above have only explored a fraction of what you can do with Impala
   to work around this issue.
 - When creating a Kudu table, the `CREATE TABLE` statement must include the
   primary key columns before other columns, in primary key order.
-- Impala can not create Kudu tables with `DECIMAL`, `VARCHAR`,
-  or nested-typed columns.
+- Impala cannot create Kudu tables with `VARCHAR` or nested-typed columns.
 - Impala cannot update values in primary key columns.
 - `!=` and `LIKE` predicates are not pushed to Kudu, and
   instead will be evaluated by the Impala scan node. This may decrease performance

http://git-wip-us.apache.org/repos/asf/kudu/blob/0a37d1f3/docs/release_notes.adoc
----------------------------------------------------------------------
diff --git a/docs/release_notes.adoc b/docs/release_notes.adoc
index 6e1dc2b..916282c 100644
--- a/docs/release_notes.adoc
+++ b/docs/release_notes.adoc
@@ -50,6 +50,13 @@
 [[rn_1.7.0_new_features]]
 == New features
 
+* Kudu now supports the decimal column type. The decimal type is a numeric data type
+  with fixed scale and precision suitable for financial and other arithmetic
+  calculations where the imprecise representation and rounding behavior of float and
+  double make those types impractical. The decimal type is also useful for integers
+  larger than int64 and cases with fractional values in a primary key.
+  See link:schema_design.html#decimal[Decimal Type] for more details.
+
 * The `kudu fs update_dirs` tool now supports removing directories. Unless the
   `--force` flag is specified, Kudu will not allow the removal of a directory
   across which tablets are configured to spread data. If specified, all tablet
@@ -122,6 +129,12 @@ on wire compatibility between Kudu 1.7 and versions earlier than 1.3:
   written against Kudu 1.6 will continue to run against the Kudu 1.7 client
   and vice-versa.
 
+* Kudu 1.7 clients that attempt to create a table with a decimal column on a
+  target server running Kudu 1.6 or earlier will receive an error response.
+  Similarly, clients running Kudu 1.6 or earlier will receive an error
+  when attempting to access any table containing a decimal
+  column.
+
 [[rn_1.7.0_known_issues]]
 == Known Issues and Limitations
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/0a37d1f3/docs/schema_design.adoc
----------------------------------------------------------------------
diff --git a/docs/schema_design.adoc b/docs/schema_design.adoc
index 7f0e218..02d05ea 100644
--- a/docs/schema_design.adoc
+++ b/docs/schema_design.adoc
@@ -73,6 +73,7 @@ column types include:
 * unixtime_micros (64-bit microseconds since the Unix epoch)
 * single-precision (32-bit) IEEE-754 floating-point number
 * double-precision (64-bit) IEEE-754 floating-point number
+* decimal (see <<decimal>> for details)
 * UTF-8 encoded string (up to 64KB uncompressed)
 * binary (up to 64KB uncompressed)
 
@@ -90,6 +91,48 @@ Unlike HBase, Kudu does not provide a version or timestamp column to track chang
 to a row. If version or timestamp information is needed, the schema should include
 an explicit version or timestamp column.
 
+[[decimal]]
+=== Decimal Type
+
+The `decimal` type is a numeric data type with fixed scale and precision suitable for
+financial and other arithmetic calculations where the imprecise representation and
+rounding behavior of `float` and `double` make those types impractical. The `decimal`
+type is also useful for integers larger than int64 and cases with fractional values
+in a primary key.
+
+The `decimal` type is a parameterized type that takes precision and scale type
+attributes.
+
+*Precision* represents the total number of digits that can be represented by the
+column, regardless of the location of the decimal point. This value must be between
+1 and 38 and has no default. For example, a precision of 4 is required to represent
+integer values up to 9999, or to represent values up to 99.99 with two fractional
+digits. You can also represent corresponding negative values, without any
+change in the precision. For example, the range -9999 to 9999 still only requires
+a precision of 4.
+
+*Scale* represents the number of fractional digits. This value must be between 0
+and the precision. A scale of 0 produces integral values, with no fractional part.
+If precision and scale are equal, all of the digits come after the decimal point.
+For example, a decimal with precision and scale equal to 3 can represent values
+between -0.999 and 0.999.
+
+*Performance considerations:*
+
+Kudu stores each value in as few bytes as possible depending on the precision
+specified for the decimal column. For that reason it is not advised to just use
+the highest precision possible for convenience. Doing so could negatively impact
+performance, memory and storage.
+
+Before encoding and compression:
+
+* Decimal values with precision of 9 or less are stored in 4 bytes.
+* Decimal values with precision of 10 through 18 are stored in 8 bytes.
+* Decimal values with precision greater than 18 are stored in 16 bytes.
+
+NOTE: The precision and scale of `decimal` columns cannot be changed by altering
+the table.
+
 [[encoding]]
 === Column Encoding
 
@@ -102,7 +145,7 @@ of the column.
 | Column Type             | Encoding                       | Default
 | int8, int16, int32      | plain, bitshuffle, run length  | bitshuffle
 | int64, unixtime_micros  | plain, bitshuffle, run length  | bitshuffle
-| float, double           | plain, bitshuffle              | bitshuffle
+| float, double, decimal  | plain, bitshuffle              | bitshuffle
 | bool                    | plain, run length              | run length
 | string, binary          | plain, prefix, dictionary      | dictionary
 |===
@@ -160,8 +203,8 @@ Like an RDBMS primary key, the Kudu primary key enforces a uniqueness constraint
 Attempting to insert a row with the same primary key values as an existing row
 will result in a duplicate key error.
 
-Primary key columns must be non-nullable, and may not be a boolean or floating-
-point type.
+Primary key columns must be non-nullable, and may not be a boolean, float
+or double type.
 
 Once set during table creation, the set of columns in the primary key may not
 be altered.
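
Reader's note on the Decimal Type section added to docs/schema_design.adoc above: the precision and scale rules and the pre-encoding storage widths can be checked with a small sketch. This is plain Python using only the standard library, not the Kudu client API; the function names (`check_decimal_type`, `max_value`, `storage_bytes`) are hypothetical and exist only for illustration. The limits (precision 1 through 38, scale 0 through precision) and the byte widths (4, 8, 16) are taken from the documentation text in the diff.

```python
from decimal import Decimal

MAX_PRECISION = 38  # upper bound on precision stated in the docs above


def check_decimal_type(precision, scale):
    """Raise ValueError if (precision, scale) violates the documented rules."""
    if not 1 <= precision <= MAX_PRECISION:
        raise ValueError("precision must be between 1 and 38")
    if not 0 <= scale <= precision:
        raise ValueError("scale must be between 0 and the precision")


def max_value(precision, scale):
    """Largest representable value: `precision` nines, `scale` of them fractional."""
    check_decimal_type(precision, scale)
    # Build the value from a digit tuple so the result is exact and is not
    # rounded to the decimal context's default precision.
    return Decimal((0, (9,) * precision, -scale))


def storage_bytes(precision):
    """Bytes per value before encoding and compression, per the docs above."""
    check_decimal_type(precision, 0)
    if precision <= 9:
        return 4
    if precision <= 18:
        return 8
    return 16


if __name__ == "__main__":
    for p, s in [(4, 0), (4, 2), (3, 3), (38, 10)]:
        print(p, s, max_value(p, s), storage_bytes(p))
```

The smallest representable value is the negation of `max_value`, which matches the statement that negative values need no extra precision. The width function also shows why choosing precision 38 by default is costly: a column whose values fit in precision 9 would use 16 bytes per value instead of 4 before encoding.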

