drill-commits mailing list archives

From bridg...@apache.org
Subject [19/31] drill git commit: Update 050-json-data-model.md
Date Wed, 25 Nov 2015 22:03:07 GMT
Update 050-json-data-model.md

Add documentation for heterogeneous types (Union type).

Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/ad8b2f3f
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/ad8b2f3f
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/ad8b2f3f

Branch: refs/heads/gh-pages
Commit: ad8b2f3fff5698344d9f4d9b72c5e28025c0533c
Parents: b5bb45f
Author: Steven Phillips <smp@apache.org>
Authored: Sun Nov 22 12:00:44 2015 -0800
Committer: Kristine Hahn <khahn@maprtech.com>
Committed: Wed Nov 25 10:13:43 2015 -0800

 .../050-json-data-model.md                      | 21 ++++++++++++++++++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/_docs/data-sources-and-file-formats/050-json-data-model.md b/_docs/data-sources-and-file-formats/050-json-data-model.md
index 4cd5281..48ea264 100644
--- a/_docs/data-sources-and-file-formats/050-json-data-model.md
+++ b/_docs/data-sources-and-file-formats/050-json-data-model.md
@@ -60,6 +60,21 @@ When you set this option, Drill reads all numbers from the JSON files as
 Drill uses these types internally for reading complex and nested data structures from data
sources such as JSON.
+### Experimental Feature: Heterogeneous types
+The Union type allows storing different types in the same field. This new feature is still
considered experimental and must be explicitly enabled by setting the `exec.enable_union_type`
option to true.
+    ALTER SESSION SET `exec.enable_union_type` = true;
+With this feature enabled, JSON data with changing types, which previously could not be
queried by Drill, is now queryable.
+A field with a Union type can be used inside functions. Drill automatically handles
evaluation of the function appropriately for each type. If the data requires special handling
for the different types, you can use a CASE statement with the new `type` functions:
+select 1 + case when is_list(a) then a[0] else a end from table;
+In this example, the column `a` contains both scalar and list types, so the case where it is
a list is handled by using the first element of the array.
 ## Reading JSON
 To read JSON data using Drill, use a [file system storage plugin]({{ site.baseurl }}/docs/file-system-storage-plugin/)
that defines the JSON format. You can use the `dfs` storage plugin, which includes the definition.
@@ -464,6 +479,8 @@ Workaround: None, per se, but if you avoid querying the multi-polygon
lines (120
     | -122.4379846573301  | 37.75844260679518  | 0.0     |
     1 row selected (6.64 seconds)
+Another option is to use the experimental union type.
 ### Varying types
 Any attempt to query a list that has
@@ -472,7 +489,7 @@ coordinates for shapes of type Polygon.  For shapes of MultiPolygon, this
file h
 of coordinates. Even a query that tries to filter away the
 MultiPolygons will fail.
-Workaround: None.
+Workaround: Use union type (experimental).
 ### Misusing Dot Notation
 Drill accesses an object when you use dot notation in the SELECT statement only when the
dot is *not* the first dot in the expression. Drill attempts to access the table that appears
after the first dot. For example,  records in `some-file` have a geometry field that Drill
successfully accesses given this query:
@@ -516,4 +533,4 @@ Workaround: Set the `store.json.read_numbers_as_double` property, described
 ### Selecting all in a JSON directory query
 Drill currently returns only fields common to all the files in a [directory query]({{ site.baseurl
}}/docs/querying-directories) that selects all (SELECT *) JSON files.
-Workaround: Query each file individually.
+Workaround: Query each file individually. Another option is to use the union type (experimental).
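
For readers unfamiliar with the CASE expression added in the first hunk (`1 + case when is_list(a) then a[0] else a end`), the per-row logic can be sketched outside Drill. This is only an illustration in Python, not Drill code: the helper name `first_if_list` and the sample rows are hypothetical, and a Python list stands in for a Drill list value.

```python
def first_if_list(a):
    """Mimic `case when is_list(a) then a[0] else a end` for one value.

    Assumption: a Python list stands in for a Drill list value.
    An empty list raises IndexError here; this sketch does not model
    how Drill itself treats that case.
    """
    return a[0] if isinstance(a, list) else a


# Hypothetical rows in which field "a" is sometimes a scalar, sometimes a list.
rows = [{"a": 5}, {"a": [10, 20]}, {"a": 7}]

# Equivalent of: select 1 + case when is_list(a) then a[0] else a end
results = [1 + first_if_list(row["a"]) for row in rows]
print(results)
```

Each row yields one number whether `a` holds a scalar or a list, which is the special handling the documentation text describes.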
