From: tshiran@apache.org
To: commits@drill.apache.org
Date: Wed, 16 Dec 2015 05:14:56 -0000
Message-Id: <60a6507bb77f43608a0bf1038851b630@git.apache.org>
In-Reply-To: <7b06ec5c9a414c49a8e33309c59575e8@git.apache.org>
References: <7b06ec5c9a414c49a8e33309c59575e8@git.apache.org>
Subject: [2/3] drill-site git commit: Update website

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/installing-the-driver-on-windows/index.html
----------------------------------------------------------------------
diff --git a/docs/installing-the-driver-on-windows/index.html b/docs/installing-the-driver-on-windows/index.html
index 129469d..2c40f83 100644
--- a/docs/installing-the-driver-on-windows/index.html
+++ b/docs/installing-the-driver-on-windows/index.html
@@ -1086,7 +1086,7 @@ Example: 127.0.0.1 localhost


Step 1: Download the MapR Drill ODBC Driver

Download the installer that corresponds to the bitness of the client application from which you want to create an ODBC connection:

@@ -1097,7 +1097,7 @@ Example: 127.0.0.1 localhost


Step 2: Install the MapR Drill ODBC Driver

  1. Double-click the installer from the location where you downloaded it.
  2. @@ -1110,7 +1110,7 @@ Example: 127.0.0.1 localhost


Step 3: Verify the installation

    To verify the installation, perform the following steps:

    @@ -1127,7 +1127,7 @@ The ODBC Data Source Administrator dialog appears.

    You need to configure and start Drill before testing the ODBC Data Source Administrator.

The Tableau Data-connection Customization (TDC) File

    The MapR Drill ODBC Driver includes a file named MapRDrillODBC.TDC. The TDC file includes customizations that improve ODBC configuration and performance when using Tableau.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/json-data-model/index.html ---------------------------------------------------------------------- diff --git a/docs/json-data-model/index.html b/docs/json-data-model/index.html index 9ef72c3..8f80749 100644 --- a/docs/json-data-model/index.html +++ b/docs/json-data-model/index.html @@ -1135,7 +1135,7 @@ Reads all data from JSON files as VARCHAR. You need to cast numbers from VARCHAR

    Drill uses these types internally for reading complex and nested data structures from data sources such as JSON.

Experimental Feature: Heterogeneous types

The Union type allows storing different types in the same field. This new feature is still considered experimental and must be explicitly enabled by setting the exec.enable_union_type option to true.

    ALTER SESSION SET `exec.enable_union_type` = true;
    @@ -1231,11 +1231,11 @@ y[z].x because these references are not ambiguous. Observe the following guideli
     
  3. Generate key/value pairs for loosely structured data
Example: Flatten and Generate Key Values for Complex JSON

    This example uses the following data that represents unit sales of tickets to events that were sold over a period of several days in December:

ticket_sales.json Contents

    {
       "type": "ticket",
       "venue": 123455,
    @@ -1266,7 +1266,7 @@ y[z].x because these references are not ambiguous. Observe the following guideli
     +---------+---------+---------------------------------------------------------------+
     2 rows selected (1.343 seconds)
     
Generate Key/Value Pairs

Continuing with the data from the previous example, use the KVGEN (Key Value Generator) function to generate key/value pairs from complex data. Generating key/value pairs is often helpful when working with data that contains arbitrary maps consisting of dynamic and unknown element names, such as the ticket sales data in this example. For example purposes, take a look at how kvgen breaks the sales data into keys and values representing the key dates and number of tickets sold:

    SELECT KVGEN(tkt.sales) AS `key dates:tickets sold` FROM dfs.`/Users/drilluser/ticket_sales.json` tkt;
    @@ -1300,7 +1300,7 @@ FROM dfs.`/Users/drilluser/drill/ticket_sales.json`;
     +--------------------------------+
     8 rows selected (0.171 seconds)
     
Example: Aggregate Loosely Structured Data

    Use flatten and kvgen together to aggregate the data from the previous example. Make sure all text mode is set to false to sum numbers. Drill returns an error if you attempt to sum data in all text mode.

    ALTER SYSTEM SET `store.json.all_text_mode` = false;
    @@ -1315,7 +1315,7 @@ FROM dfs.`/Users/drilluser/drill/ticket_sales.json`;
     +--------------+
     1 row selected (0.244 seconds)
     
Example: Aggregate and Sort Data

    Sum and group the ticket sales by date and sort in ascending order of total tickets sold.

    SELECT `right`(tkt.tot_sales.key,2) `December Date`,
    @@ -1336,7 +1336,7 @@ ORDER BY TotalSales;
     +----------------+-------------+
     5 rows selected (0.252 seconds)
     
Example: Access a Map Field in an Array

To access a map field in an array, use dot notation to drill down through the hierarchy of the JSON data to the field. Examples are based on the following City Lots San Francisco data in JSON format.

    {
    @@ -1400,7 +1400,7 @@ FROM dfs.`/Users/drilluser/citylots.json`;
     
     

    More examples of drilling down into an array are shown in "Selecting Nested Data for a Column".

Example: Flatten an Array of Maps using a Subquery

    By flattening the following JSON file, which contains an array of maps, you can evaluate the records of the flattened data.

    {"name":"classic","fillings":[ {"name":"sugar","cal":500} , {"name":"flour","cal":300} ] }
    @@ -1416,7 +1416,7 @@ SELECT flat.fill FROM (SELECT FLATTEN(t.fillings) AS fill FROM dfs.flatten.`test
     

    Use a table alias for column fields and functions when working with complex data sets. Currently, you must use a subquery when operating on a flattened column. Eliminating the subquery and table alias in the WHERE clause, for example flat.fillings[0].cal > 300, does not evaluate all records of the flattened data against the predicate and produces the wrong results.
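A minimal sketch of that subquery pattern, continuing the fillings example above (the exact test file name under the dfs.flatten workspace is assumed): the predicate is applied to the flattened column inside the subquery, so every flattened record is evaluated.

    SELECT flat.fill
    FROM (SELECT FLATTEN(t.fillings) AS fill
          FROM dfs.flatten.`test.json` t) flat
    WHERE flat.fill.cal > 300;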

Example: Access Map Fields in a Map

    This example uses a WHERE clause to drill down to a third level of the following JSON hierarchy to get the max_hdl greater than 160:

    {
    
    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/kvgen/index.html
    ----------------------------------------------------------------------
    diff --git a/docs/kvgen/index.html b/docs/kvgen/index.html
    index eedd76f..779e980 100644
    --- a/docs/kvgen/index.html
    +++ b/docs/kvgen/index.html
    @@ -1137,7 +1137,7 @@ array down into multiple distinct rows and further query those rows.

    {"key": "c", "value": "valC"} {"key": "d", "value": "valD"}
Example: Different Data Type Values

    Assume that a JSON file called kvgendata.json includes multiple records that look like this one:

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/lesson-1-learn-about-the-data-set/index.html ---------------------------------------------------------------------- diff --git a/docs/lesson-1-learn-about-the-data-set/index.html b/docs/lesson-1-learn-about-the-data-set/index.html index 1da374c..c76288b 100644 --- a/docs/lesson-1-learn-about-the-data-set/index.html +++ b/docs/lesson-1-learn-about-the-data-set/index.html @@ -1094,7 +1094,7 @@ the Drill shell, type:

    +-------+--------------------------------------------+ 1 row selected
List the available workspaces and databases:

    0: jdbc:drill:> show databases;
     +---------------------+
     |     SCHEMA_NAME     |
    @@ -1124,7 +1124,7 @@ different database schemas (namespaces) in a relational database system.

    This is a Hive external table pointing to the data stored in flat files on the MapR file system. The orders table contains 122,000 rows.

Set the schema to hive:

    0: jdbc:drill:> use hive.`default`;
     +-------+-------------------------------------------+
     |  ok   |                  summary                  |
    @@ -1136,7 +1136,7 @@ MapR file system. The orders table contains 122,000 rows.

    You will run the USE command throughout this tutorial. The USE command sets the schema for the current session.

Describe the table:

    You can use the DESCRIBE command to show the columns and data types for a Hive table:
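For instance, a quick sketch against the orders table from this lesson (output omitted here):

    0: jdbc:drill:> describe orders;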

    @@ -1155,7 +1155,7 @@ table:

    The DESCRIBE command returns complete schema information for Hive tables based on the metadata available in the Hive metastore.

Select 5 rows from the orders table:

    0: jdbc:drill:> select * from orders limit 5;
     +------------+------------+------------+------------+------------+-------------+
     |  order_id  |   month    |  cust_id   |   state    |  prod_id   | order_total |
    @@ -1213,7 +1213,7 @@ columns typical of a time-series database.

    The customers table contains 993 rows.

Set the workspace to maprdb:

    use maprdb;
     +-------+-------------------------------------+
     |  ok   |               summary               |
    @@ -1222,7 +1222,7 @@ columns typical of a time-series database.

    +-------+-------------------------------------+ 1 row selected
Describe the tables:

    0: jdbc:drill:> describe customers;
     +--------------+------------------------+--------------+
     | COLUMN_NAME  |       DATA_TYPE        | IS_NULLABLE  |
    @@ -1255,7 +1255,7 @@ structure, and “ANY” represents the fact that the column value can be of any
     data type. Observe the row_key, which is also simply bytes and has the type
     ANY.

Select 5 rows from the products table:

    0: jdbc:drill:> select * from products limit 5;
     +--------------+----------------------------------------------------------------------------------------------------------------+-------------------+
     |   row_key    |                                                    details                                                     |      pricing      |
    @@ -1275,7 +1275,7 @@ and pricing) have the map data type and appear as JSON strings.

    In Lesson 2, you will use CAST functions to return typed data for each column.

Select 5 rows from the customers table:

0: jdbc:drill:> select * from customers limit 5;
     +--------------+-----------------------+-------------------------------------------------+---------------------------------------------------------------------------------------+
     |   row_key    |        address        |                     loyalty                     |                                       personal                                        |
    @@ -1319,7 +1319,7 @@ setup beyond the definition of a workspace.

    Query nested clickstream data

Set the workspace to dfs.clicks:

    0: jdbc:drill:> use dfs.clicks;
     +-------+-----------------------------------------+
     |  ok   |                 summary                 |
    @@ -1340,7 +1340,7 @@ location specified in the workspace. For example:

    relative to this path. The clicks directory referred to in the following query is directly below the nested directory.

Select 2 rows from the clicks.json file:

    0: jdbc:drill:> select * from `clicks/clicks.json` limit 2;
     +-----------+-------------+-----------+---------------------------------------------------+-------------------------------------------+
     | trans_id  |    date     |   time    |                     user_info                     |                trans_info                 |
    @@ -1358,7 +1358,7 @@ to refer to a file in a local or distributed file system.

    path. This is necessary whenever the file path contains Drill reserved words or characters.

Select 2 rows from the campaign.json file:

    0: jdbc:drill:> select * from `clicks/clicks.campaign.json` limit 2;
     +-----------+-------------+-----------+---------------------------------------------------+---------------------+----------------------------------------+
     | trans_id  |    date     |   time    |                     user_info                     |       ad_info       |               trans_info               |
    @@ -1392,7 +1392,7 @@ for that month. The total number of records in all log files is 48000.

    are many of these files, but you can use Drill to query them all as a single data source, or to query a subset of the files.

Set the workspace to dfs.logs:

    0: jdbc:drill:> use dfs.logs;
     +-------+---------------------------------------+
     |  ok   |                summary                |
    @@ -1401,7 +1401,7 @@ data source, or to query a subset of the files.

    +-------+---------------------------------------+ 1 row selected
Select 2 rows from the logs directory:

    0: jdbc:drill:> select * from logs limit 2;
     +-------+-------+-----------+-------------+-----------+----------+---------+--------+----------+-----------+----------+-------------+
     | dir0  | dir1  | trans_id  |    date     |   time    | cust_id  | device  | state  | camp_id  | keywords  | prod_id  | purch_flag  |
    @@ -1420,7 +1420,7 @@ directory path on the file system.

    subdirectories below the logs directory. In Lesson 3, you will do more complex queries that leverage these dynamic variables.

Find the total number of rows in the logs directory (all files):

    0: jdbc:drill:> select count(*) from logs;
     +---------+
     | EXPR$0  |
    @@ -1432,7 +1432,7 @@ queries that leverage these dynamic variables.

    This query traverses all of the files in the logs directory and its subdirectories to return the total number of rows in those files.

What's Next

    Go to Lesson 2: Run Queries with ANSI SQL.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/lesson-2-run-queries-with-ansi-sql/index.html ---------------------------------------------------------------------- diff --git a/docs/lesson-2-run-queries-with-ansi-sql/index.html b/docs/lesson-2-run-queries-with-ansi-sql/index.html index 765bb0b..6c8743b 100644 --- a/docs/lesson-2-run-queries-with-ansi-sql/index.html +++ b/docs/lesson-2-run-queries-with-ansi-sql/index.html @@ -1072,7 +1072,7 @@ statement.

    Aggregation

Set the schema to hive:

    0: jdbc:drill:> use hive.`default`;
     +-------+-------------------------------------------+
     |  ok   |                  summary                  |
    @@ -1081,7 +1081,7 @@ statement.

    +-------+-------------------------------------------+ 1 row selected
Return sales totals by month:

    0: jdbc:drill:> select `month`, sum(order_total)
     from orders group by `month` order by 2 desc;
     +------------+---------+
    @@ -1107,7 +1107,7 @@ database queries.

    Note that back ticks are required for the “month” column only because “month” is a reserved word in SQL.

Return the top 20 sales totals by month and state:

    0: jdbc:drill:> select `month`, state, sum(order_total) as sales from orders group by `month`, state
     order by 3 desc limit 20;
     +-----------+--------+---------+
    @@ -1143,7 +1143,7 @@ aliases and table aliases.

    This query uses the HAVING clause to constrain an aggregate result.

Set the workspace to dfs.clicks

    0: jdbc:drill:> use dfs.clicks;
     +-------+-----------------------------------------+
     |  ok   |                 summary                 |
    @@ -1152,7 +1152,7 @@ aliases and table aliases.

    +-------+-----------------------------------------+ 1 row selected
Return total number of clicks for devices that indicate high click-throughs:

    0: jdbc:drill:> select t.user_info.device, count(*) from `clicks/clicks.json` t 
     group by t.user_info.device
     having count(*) > 1000;
    @@ -1195,7 +1195,7 @@ duplicate rows from those files): clicks.campaign.json and cl
     
     

    Subqueries

Set the workspace to hive:

    0: jdbc:drill:> use hive.`default`;
     +-------+-------------------------------------------+
     |  ok   |                  summary                  |
    @@ -1204,7 +1204,7 @@ duplicate rows from those files): clicks.campaign.json and cl
     +-------+-------------------------------------------+
     1 row selected
     
Compare order totals across states:

    0: jdbc:drill:> select ny_sales.cust_id, ny_sales.total_orders, ca_sales.total_orders
     from
     (select o.cust_id, sum(o.order_total) as total_orders from hive.orders o where state = 'ny' group by o.cust_id) ny_sales
    @@ -1242,7 +1242,7 @@ limit 20;
     
     

    CAST Function

Use the maprdb workspace:

    0: jdbc:drill:> use maprdb;
     +-------+-------------------------------------+
     |  ok   |               summary               |
    @@ -1275,7 +1275,7 @@ from customers t limit 5;
     
  5. The table alias t is required; otherwise the column family names would be parsed as table names and the query would return an error.
Remove the quotes from the strings:

    You can use the regexp_replace function to remove the quotes around the strings in the query results. For example, to return a state name va instead @@ -1298,7 +1298,7 @@ from customers t limit 1; +-------+----------------------------------------+ 1 row selected
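A minimal sketch of that query, assuming the quoted value comes from the address.state map field used earlier in this lesson:

    0: jdbc:drill:> select regexp_replace(cast(t.address.state as varchar(10)), '"', '') as state
    from customers t limit 1;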

Use a mutable workspace:

    A mutable (or writable) workspace is a workspace that is enabled for “write” operations. This attribute is part of the storage plugin configuration. You @@ -1337,7 +1337,7 @@ statement.

    defined in data sources such as Hive, HBase, and the file system. Drill also supports the creation of metadata in the file system.

Query data from the view:

    0: jdbc:drill:> select * from custview limit 1;
     +----------+-------------------+-----------+----------+--------+----------+-------------+
     | cust_id  |       name        |  gender   |   age    | state  | agg_rev  | membership  |
    @@ -1352,7 +1352,7 @@ supports the creation of metadata in the file system.

    Continue using dfs.views for this query.

Join the customers view and the orders table:

    0: jdbc:drill:> select membership, sum(order_total) as sales from hive.orders, custview
     where orders.cust_id=custview.cust_id
     group by membership order by 2;
    @@ -1378,7 +1378,7 @@ rows are wide, set the maximum width of the display to 10000:

    Do not use a semicolon for this SET command.
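The setting in question is the sqlline !set directive; a minimal sketch (note the leading ! and the absence of a trailing semicolon):

    0: jdbc:drill:> !set maxwidth 10000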

Join the customers, orders, and clickstream data:

    0: jdbc:drill:> select custview.membership, sum(orders.order_total) as sales from hive.orders, custview,
     dfs.`/mapr/demo.mapr.com/data/nested/clicks/clicks.json` c 
     where orders.cust_id=custview.cust_id and orders.cust_id=c.user_info.cust_id 
    @@ -1408,7 +1408,7 @@ hive.orders table is also visible to the query.

    workspace, so the query specifies the full path to the file:

    dfs.`/mapr/demo.mapr.com/data/nested/clicks/clicks.json`
     
What's Next

    Go to Lesson 3: Run Queries on Complex Data Types.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/lesson-3-run-queries-on-complex-data-types/index.html ---------------------------------------------------------------------- diff --git a/docs/lesson-3-run-queries-on-complex-data-types/index.html b/docs/lesson-3-run-queries-on-complex-data-types/index.html index 07ba8c6..3f455d0 100644 --- a/docs/lesson-3-run-queries-on-complex-data-types/index.html +++ b/docs/lesson-3-run-queries-on-complex-data-types/index.html @@ -1083,7 +1083,7 @@ exist. Here is a visual example of how this works:

    drill query flow

Set workspace to dfs.logs:

    0: jdbc:drill:> use dfs.logs;
     +-------+---------------------------------------+
     |  ok   |                summary                |
    @@ -1092,7 +1092,7 @@ exist. Here is a visual example of how this works:

    +-------+---------------------------------------+ 1 row selected
Query logs data for a specific year:

    0: jdbc:drill:> select * from logs where dir0='2013' limit 10;
     +-------+-------+-----------+-------------+-----------+----------+---------+--------+----------+-----------+----------+-------------+
     | dir0  | dir1  | trans_id  |    date     |   time    | cust_id  | device  | state  | camp_id  | keywords  | prod_id  | purch_flag  |
    @@ -1114,7 +1114,7 @@ exist. Here is a visual example of how this works:

    dir0 refers to the first level down from logs, dir1 to the next level, and so on. So this query returned 10 of the rows for February 2013.

Further constrain the results using multiple predicates in the query:

    This query returns a list of customer IDs for people who made a purchase via an IOS5 device in August 2013.

    @@ -1131,7 +1131,7 @@ order by `date`; ...
Return monthly counts per customer for a given year:

    0: jdbc:drill:> select cust_id, dir1 month_no, count(*) month_count from logs
     where dir0=2014 group by cust_id, dir1 order by cust_id, month_no limit 10;
     +----------+-----------+--------------+
    @@ -1159,7 +1159,7 @@ year: 2014.

    analyze nested data natively without transformation. If you are familiar with JavaScript notation, you will already know how some of these extensions work.

Set the workspace to dfs.clicks:

    0: jdbc:drill:> use dfs.clicks;
     +-------+-----------------------------------------+
     |  ok   |                 summary                 |
    @@ -1168,7 +1168,7 @@ JavaScript notation, you will already know how some of these extensions work.

Explore clickstream data:

    Note that the user_info and trans_info columns contain nested data: arrays and arrays within arrays. The following queries show how to access this complex @@ -1185,7 +1185,7 @@ data.

    +-----------+-------------+-----------+---------------------------------------------------+---------------------------------------------------------------------------+ 5 rows selected
Unpack the user_info column:

    0: jdbc:drill:> select t.user_info.cust_id as custid, t.user_info.device as device,
     t.user_info.state as state
     from `clicks/clicks.json` t limit 5;
    @@ -1210,7 +1210,7 @@ column name, and cust_id is a nested column name.

    The table alias is required; otherwise column names such as user_info are parsed as table names by the SQL parser.

Unpack the trans_info column:

    0: jdbc:drill:> select t.trans_info.prod_id as prodid, t.trans_info.purch_flag as
     purchased
     from `clicks/clicks.json` t limit 5;
    @@ -1243,7 +1243,7 @@ notation to write interesting queries against nested array data.

    refers to the 21st value, assuming one exists.

Find the first product that is searched for in each transaction:

    0: jdbc:drill:> select t.trans_id, t.trans_info.prod_id[0] from `clicks/clicks.json` t limit 5;
     +------------+------------+
     |  trans_id  |   EXPR$1   |
    @@ -1256,7 +1256,7 @@ notation to write interesting queries against nested array data.

    +------------+------------+ 5 rows selected
For which transactions did customers search on at least 21 products?

    0: jdbc:drill:> select t.trans_id, t.trans_info.prod_id[20]
     from `clicks/clicks.json` t
     where t.trans_info.prod_id[20] is not null
    @@ -1275,7 +1275,7 @@ order by trans_id limit 5;
     

    This query returns transaction IDs and product IDs for records that contain a non-null product ID at the 21st position in the array.

Return clicks for a specific product range:

    0: jdbc:drill:> select * from (select t.trans_id, t.trans_info.prod_id[0] as prodid,
     t.trans_info.purch_flag as purchased
     from `clicks/clicks.json` t) sq
    @@ -1298,7 +1298,7 @@ ordered list of products purchased rather than a random list).

    Perform Operations on Arrays

Rank successful click conversions and count product searches for each session:

    0: jdbc:drill:> select t.trans_id, t.`date` as session_date, t.user_info.cust_id as
     cust_id, t.user_info.device as device, repeated_count(t.trans_info.prod_id) as
     prod_count, t.trans_info.purch_flag as purch_flag
    @@ -1324,7 +1324,7 @@ in descending order. Only clicks that have resulted in a purchase are counted.To facilitate additional analysis on this result set, you can easily and
     quickly create a Drill table from the results of the query.

Continue to use the dfs.clicks workspace

    0: jdbc:drill:> use dfs.clicks;
     +-------+-----------------------------------------+
     |  ok   |                 summary                 |
    @@ -1333,7 +1333,7 @@ quickly create a Drill table from the results of the query.

    +-------+-----------------------------------------+ 1 row selected (1.61 seconds)
Return product searches for high-value customers:

    0: jdbc:drill:> select o.cust_id, o.order_total, t.trans_info.prod_id[0] as prod_id
     from 
     hive.orders as o
    @@ -1357,7 +1357,7 @@ where o.order_total > (select avg(inord.order_total)
     

    This query returns a list of products that are being searched for by customers who have made transactions that are above the average in their states.

Materialize the result of the previous query:

    0: jdbc:drill:> create table product_search as select o.cust_id, o.order_total, t.trans_info.prod_id[0] as prod_id
     from
     hive.orders as o
    @@ -1379,7 +1379,7 @@ query returns (107,482) and stores them in the format specified by the storage
     plugin (Parquet format in this example). You can create tables that store data
     in csv, parquet, and json formats.

Query the new table to verify the row count:

    This example simply checks that the CTAS statement worked by verifying the number of rows in the table.
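A minimal sketch of that check against the product_search table created above:

    0: jdbc:drill:> select count(*) from product_search;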

    @@ -1391,7 +1391,7 @@ number of rows in the table.

    +---------+ 1 row selected (0.155 seconds)
Find the storage file for the table:

    [root@maprdemo product_search]# cd /mapr/demo.mapr.com/data/nested/product_search
     [root@maprdemo product_search]# ls -la
     total 451
    @@ -1405,7 +1405,7 @@ stored in the location defined by the dfs.clicks workspace:

    There is a subdirectory that has the same name as the table you created.

What's Next

    Complete the tutorial with the Summary.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/mongodb-storage-plugin/index.html ---------------------------------------------------------------------- diff --git a/docs/mongodb-storage-plugin/index.html b/docs/mongodb-storage-plugin/index.html index a1d0332..c8c7b36 100644 --- a/docs/mongodb-storage-plugin/index.html +++ b/docs/mongodb-storage-plugin/index.html @@ -1225,7 +1225,7 @@ Drill data sources, including MongoDB.

    | -72.576142 | +------------+
Using ODBC/JDBC Drivers

    You can query MongoDB through standard BI tools, such as Tableau and SQuirreL. For information about Drill ODBC and JDBC drivers, refer to Drill Interfaces.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/odbc-configuration-reference/index.html ---------------------------------------------------------------------- diff --git a/docs/odbc-configuration-reference/index.html b/docs/odbc-configuration-reference/index.html index 19408f9..5e00e22 100644 --- a/docs/odbc-configuration-reference/index.html +++ b/docs/odbc-configuration-reference/index.html @@ -1343,7 +1343,7 @@ The Simba ODBC Driver for Apache Drill produces two log files at the location yo
  7. Save the mapr.drillodbc.ini configuration file.
What's Next? Go to Connecting to ODBC Data Sources.

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/parquet-format/index.html ---------------------------------------------------------------------- diff --git a/docs/parquet-format/index.html b/docs/parquet-format/index.html index 9db090c..0eadaa1 100644 --- a/docs/parquet-format/index.html +++ b/docs/parquet-format/index.html @@ -1112,7 +1112,7 @@
  • In the CTAS command, cast JSON string data to corresponding SQL types.
Example: Read JSON, Write Parquet

    This example demonstrates a storage plugin definition, a sample row of data from a JSON file, and a Drill query that writes the JSON input to Parquet output.
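A minimal sketch of the CTAS step, with hypothetical paths and column names standing in for the full example; the session's storage format is set to Parquet before writing:

    ALTER SESSION SET `store.format` = 'parquet';

    -- hypothetical output table and input path
    CREATE TABLE dfs.tmp.sampleparquet AS
    SELECT trans_id, cast(`date` AS date) AS transdate
    FROM dfs.`/tmp/sample.json`;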

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/querying-hbase/index.html ---------------------------------------------------------------------- diff --git a/docs/querying-hbase/index.html b/docs/querying-hbase/index.html index 2432d2c..5d2c84c 100644 --- a/docs/querying-hbase/index.html +++ b/docs/querying-hbase/index.html @@ -1059,7 +1059,7 @@ How to use optimization features in Drill 1.2 and later
How to use Drill 1.2 to leverage new features introduced by HBASE-8201 Jira

Tutorial--Querying HBase Data

    This tutorial shows how to connect Drill to an HBase data source, create simple HBase tables, and query the data using Drill.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/querying-json-files/index.html ---------------------------------------------------------------------- diff --git a/docs/querying-json-files/index.html b/docs/querying-json-files/index.html index 2f7db2a..9d76356 100644 --- a/docs/querying-json-files/index.html +++ b/docs/querying-json-files/index.html @@ -1050,7 +1050,7 @@

    To query complex JSON files, you need to understand the "JSON Data Model". This section provides a trivial example of querying a sample file that Drill installs.

About the employee.json File

    The sample file, employee.json, is packaged in the Foodmart data JAR in Drill's classpath:
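Because the file is on the classpath, you can query it through the cp storage plugin; for example:

    SELECT * FROM cp.`employee.json` LIMIT 3;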

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/querying-plain-text-files/index.html ---------------------------------------------------------------------- diff --git a/docs/querying-plain-text-files/index.html b/docs/querying-plain-text-files/index.html index 2b7f459..46641a2 100644 --- a/docs/querying-plain-text-files/index.html +++ b/docs/querying-plain-text-files/index.html @@ -1079,7 +1079,7 @@ found" error if references to files in queries do not match these condition "delimiter": "|" }
SELECT * FROM a CSV File

    The first query selects rows from a .csv text file. The file contains seven records:

    @@ -1110,7 +1110,7 @@ each row.

    +-----------------------------------+ 7 rows selected (0.089 seconds)
Columns[n] Syntax

You can use the COLUMNS[n] syntax in the SELECT list to return these CSV rows in a more readable, column by column, format. (This syntax uses a zero-based index, so the first column is column 0.)

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/querying-sequence-files/index.html
----------------------------------------------------------------------
diff --git a/docs/querying-sequence-files/index.html b/docs/querying-sequence-files/index.html
index 217fcd4..44d4794 100644
--- a/docs/querying-sequence-files/index.html
+++ b/docs/querying-sequence-files/index.html
@@ -1051,7 +1051,7 @@

Sequence files are flat files storing binary key/value pairs. Drill projects a sequence file as a table with two columns, 'binary_key' and 'binary_value'.

Querying sequence file.

Start the Drill shell, then run:

    SELECT *
    FROM dfs.tmp.`simple.seq`
    LIMIT 1;
    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/querying-system-tables/index.html
    ----------------------------------------------------------------------
    diff --git a/docs/querying-system-tables/index.html b/docs/querying-system-tables/index.html
    index d5f9ec6..ade0399 100644
    --- a/docs/querying-system-tables/index.html
    +++ b/docs/querying-system-tables/index.html
    @@ -1108,7 +1108,7 @@ requests.

    Query the drillbits, version, options, boot, threads, and memory tables in the sys database.
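The examples below assume the sys schema is active; a quick sketch of switching to it:

    0: jdbc:drill:> use sys;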

Query the drillbits table.

    0: jdbc:drill:zk=10.10.100.113:5181> select * from drillbits;
     +-------------------+------------+--------------+------------+---------+
     |   hostname        |  user_port | control_port | data_port  |  current|
    @@ -1136,7 +1136,7 @@ True means the Drillbit is connected to the session or client running the
     query. This Drillbit is the Foreman for the current session.
Query the version table.

    0: jdbc:drill:zk=10.10.100.113:5181> select * from version;
     +-------------------------------------------+--------------------------------------------------------------------+----------------------------+--------------+----------------------------+
     |                 commit_id                 |                           commit_message                           |        commit_time         | build_email  |         build_time         |
    @@ -1160,7 +1160,7 @@ example.
     The time that the release was built.
     
     
Query the options table.

    Drill provides system, session, and boot options that you can query.
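For instance, a hedged sketch that filters the options table by name (the column names follow the option-table output shown in this section):

    0: jdbc:drill:zk=10.10.100.113:5181> select name, type, status, bool_val from options where name like 'store.json%';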

@@ -1202,7 +1202,7 @@ The default value, which is of the double, float, or long double data type; otherwise, null.

Query the boot table.

    0: jdbc:drill:zk=10.10.100.113:5181> select * from boot limit 10;
     +--------------------------------------+----------+-------+---------+------------+-------------------------+-----------+------------+
     |                 name                 |   kind   | type  | status  |  num_val   |       string_val        | bool_val  | float_val  |
    @@ -1240,7 +1240,7 @@ The default value, which is of the double, float, or long double data type;
     otherwise, null.
     
     
Query the threads table.

    0: jdbc:drill:zk=10.10.100.113:5181> select * from threads;
     +--------------------+------------+----------------+---------------+
     |       hostname     | user_port  | total_threads  | busy_threads  |
    @@ -1263,7 +1263,7 @@ The peak thread count on the node.
     The current number of live threads (daemon and non-daemon) on the node.
     
     
Query the memory table.

    0: jdbc:drill:zk=10.10.100.113:5181> select * from memory;
     +--------------------+------------+---------------+-------------+-----------------+---------------------+-------------+
     |       hostname     | user_port  | heap_current  |  heap_max   | direct_current  | jvm_direct_current  | direct_max  |
    
    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/ranking-window-functions/index.html
    ----------------------------------------------------------------------
    diff --git a/docs/ranking-window-functions/index.html b/docs/ranking-window-functions/index.html
    index 30d4367..052ad59 100644
    --- a/docs/ranking-window-functions/index.html
    +++ b/docs/ranking-window-functions/index.html
    @@ -1110,7 +1110,7 @@ The window clauses for the function. The OVER clause cannot contain an explicit
     
     

    The following examples show queries that use each of the ranking window functions in Drill. See Window Functions Examples for information about the data and setup for these examples.

CUME_DIST()

    The following query uses the CUME_DIST() window function to calculate the cumulative distribution of sales for each dealer in Q1.

       select dealer_id, sales, cume_dist() over(order by sales) as cumedist from q1_sales;
    @@ -1150,7 +1150,7 @@ The window clauses for the function. The OVER clause cannot contain an explicit
        +------------+-----------------+--------+------------+
        10 rows selected (0.198 seconds)  
     
NTILE()

    The following example uses the NTILE window function to divide the Q1 sales into five groups and list the sales in ascending order.

       select emp_mgr, sales, ntile(5) over(order by sales) as ntilerank from q1_sales;
    @@ -1188,7 +1188,7 @@ The window clauses for the function. The OVER clause cannot contain an explicit
        +-----------------+------------+--------+------------+
        10 rows selected (0.312 seconds)
     
PERCENT_RANK()

    The following query uses the PERCENT_RANK() window function to calculate the percent rank for employee sales in Q1.

       select dealer_id, emp_name, sales, percent_rank() over(order by sales) as perrank from q1_sales; 
    @@ -1208,7 +1208,7 @@ The window clauses for the function. The OVER clause cannot contain an explicit
        +------------+-----------------+--------+---------------------+
        10 rows selected (0.169 seconds)
     
RANK()

    The following query uses the RANK() window function to rank the employee sales for Q1. The word rank in Drill is a reserved keyword and must be enclosed in back ticks (``).

       select dealer_id, emp_name, sales, rank() over(order by sales) as `rank` from q1_sales;
    @@ -1228,7 +1228,7 @@ The window clauses for the function. The OVER clause cannot contain an explicit
        +------------+-----------------+--------+-------+
        10 rows selected (0.174 seconds)
     
ROW_NUMBER()

    The following query uses the ROW_NUMBER() window function to number the sales for each dealer_id. The word rownum contains the reserved keyword row and must be enclosed in back ticks (``).

        select dealer_id, emp_name, sales, row_number() over(partition by dealer_id order by sales) as `rownum` from q1_sales;
    
    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/rdbms-storage-plugin/index.html
    ----------------------------------------------------------------------
    diff --git a/docs/rdbms-storage-plugin/index.html b/docs/rdbms-storage-plugin/index.html
    index 6782237..4d221f1 100644
    --- a/docs/rdbms-storage-plugin/index.html
    +++ b/docs/rdbms-storage-plugin/index.html
    @@ -1060,7 +1060,7 @@
     
• Add a new storage configuration to Drill through the Web UI. Example configurations for Oracle, SQL Server, MySQL, and Postgres are provided below.
Example: Working with MySQL

    Drill communicates with MySQL through the JDBC driver using the configuration that you specify in the Web Console or through the REST API.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/rest-api/index.html ---------------------------------------------------------------------- diff --git a/docs/rest-api/index.html b/docs/rest-api/index.html index 9e7b2f1..814d05c 100644 --- a/docs/rest-api/index.html +++ b/docs/rest-api/index.html @@ -1100,7 +1100,7 @@
POST /query.json

    Submit a query and return results.

    @@ -1141,7 +1141,7 @@
GET /profiles.json

    Get the profiles of running and completed queries.

    @@ -1165,7 +1165,7 @@

GET /profiles/{queryid}.json

    Get the profile of the query that has the given queryid.

    @@ -1183,7 +1183,7 @@

GET /profiles/cancel/{queryid}

    Cancel the query that has the given queryid.

    @@ -1206,7 +1206,7 @@
GET /storage.json

    Get the list of storage plugin names and configurations.

    @@ -1240,7 +1240,7 @@

GET /storage/{name}.json

    Get the definition of the named storage plugin.

    @@ -1264,7 +1264,7 @@

GET /storage/{name}/enable/{val}

    Enable or disable the named storage plugin.

    @@ -1287,7 +1287,7 @@
POST /storage/{name}.json

    Create or update a storage plugin configuration.

    @@ -1320,7 +1320,7 @@
DELETE /storage/{name}.json

    Delete a storage plugin configuration.

    @@ -1343,7 +1343,7 @@
GET /stats.json

Get Drillbit information, such as port numbers.

    @@ -1374,7 +1374,7 @@

GET /status

    Get the status of Drill.

    @@ -1394,7 +1394,7 @@

GET /status/metrics

    Get the current memory metrics.

    @@ -1413,7 +1413,7 @@
GET /status/threads

    Get the status of threads.

    @@ -1444,7 +1444,7 @@
GET /options.json

    List the name, default, and data type of the system and session options.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/s3-storage-plugin/index.html ---------------------------------------------------------------------- diff --git a/docs/s3-storage-plugin/index.html b/docs/s3-storage-plugin/index.html index 793d915..3d14e6d 100644 --- a/docs/s3-storage-plugin/index.html +++ b/docs/s3-storage-plugin/index.html @@ -1054,7 +1054,7 @@

There are two simple steps to follow: (1) provide your AWS credentials, and (2) configure the S3 storage plugin with an S3 bucket.

(1) AWS credentials

    To enable Drill's S3a support, edit the file conf/core-site.xml in your Drill install directory, replacing the text ENTER_YOUR_ACESSKEY and ENTER_YOUR_SECRETKEY with your AWS credentials.

    <configuration>
    @@ -1071,7 +1071,7 @@
     
     </configuration>
     
(2) Configure S3 Storage Plugin

Enable the S3 storage plugin if you already have one configured, or add a new plugin by following these steps:

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/sequence-files/index.html ---------------------------------------------------------------------- diff --git a/docs/sequence-files/index.html b/docs/sequence-files/index.html index bdf399d..6253ce2 100644 --- a/docs/sequence-files/index.html +++ b/docs/sequence-files/index.html @@ -1049,7 +1049,7 @@

Hadoop Sequence files (https://wiki.apache.org/hadoop/SequenceFile) are flat files storing binary key/value pairs. Drill projects a sequence file as a table with two columns, 'binary_key' and 'binary_value', of type VARBINARY.

Storage plugin format for sequence files.

    . . .
     "sequencefile": {
       "type": "sequencefile",
    @@ -1059,7 +1059,7 @@ Drill projects sequence files as table with two columns - 'binary_key',
     },
     . . .
     
Querying sequence file.

    SELECT *
     FROM dfs.tmp.`simple.seq`
     LIMIT 1;
    
    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/sql-extensions/index.html
    ----------------------------------------------------------------------
    diff --git a/docs/sql-extensions/index.html b/docs/sql-extensions/index.html
    index 54aa50b..1dd3eb3 100644
    --- a/docs/sql-extensions/index.html
    +++ b/docs/sql-extensions/index.html
    @@ -1052,7 +1052,7 @@
     
     

    Drill extends the SELECT statement for reading complex, multi-structured data. The extended CREATE TABLE AS provides the capability to write data of complex/multi-structured data types. Drill extends the lexical rules for working with files and directories, such as using back ticks for including file names, directory names, and reserved words in queries. Drill syntax supports using the file system as a persistent store for query profiles and diagnostic information.
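As a minimal sketch of the back-tick extension (hypothetical file path), back ticks let a query name a file or directory directly:

    SELECT * FROM dfs.`/tmp/clicks/clicks.json` LIMIT 1;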


Drill supports Hive and HBase as plug-and-play data sources. Drill can read tables created in Hive that use data types compatible with Drill. You can query Hive tables without modifications. You can query self-describing data without requiring metadata definitions in the Hive metastore. Primitives, such as JOIN, support columnar operation.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/starting-drill-in-distributed-mode/index.html ---------------------------------------------------------------------- diff --git a/docs/starting-drill-in-distributed-mode/index.html b/docs/starting-drill-in-distributed-mode/index.html index e347f67..ddf3527 100644 --- a/docs/starting-drill-in-distributed-mode/index.html +++ b/docs/starting-drill-in-distributed-mode/index.html @@ -1056,7 +1056,7 @@
  • Using an Ad-Hoc Connection to Drill
Using the drillbit.sh Command

    To use Drill in distributed mode, you need to control a Drillbit. If you use Drill in embedded mode, you do not use the drillbit.sh command.

    @@ -1070,7 +1070,7 @@

    You can use a configuration file to start Drill. Using such a file is handy for controlling Drillbits on multiple nodes.

drillbit.sh Command Syntax

    drillbit.sh [--config <conf-dir>] (start|stop|status|restart|autorestart)

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/tableau-examples/index.html ---------------------------------------------------------------------- diff --git a/docs/tableau-examples/index.html b/docs/tableau-examples/index.html index d438876..7278c4c 100644 --- a/docs/tableau-examples/index.html +++ b/docs/tableau-examples/index.html @@ -1064,7 +1064,7 @@ DSN to a Drill data source and then access the data in Tableau 8.1.

    source data. You define schemas by configuring storage plugins on the Storage tab of the Drill Web Console. Also, the examples assume you enabled the DECIMAL data type in Drill.

Example: Connect to a Hive Table in Tableau

    To access Hive tables in Tableau 8.1, connect to the Hive schema using a DSN and then visualize the data in Tableau.
    @@ -1075,7 +1075,7 @@ and then visualize the data in Tableau.


Step 1: Create a DSN to a Hive Table

    In this step, we will create a DSN that accesses a Hive table.

    @@ -1097,7 +1097,7 @@ In this example, we are connecting to a Zookeeper Quorum. Verify that the Cluste
Step 2: Connect to Hive Tables in Tableau

    Now, we can connect to Hive tables.

    @@ -1123,7 +1123,7 @@ configure the connection to the Hive table and click OK.
Step 3. Visualize the Data in Tableau

    Once you connect to the data, the columns appear in the Data window. To visualize the data, drag fields from the Data window to the workspace view.

    @@ -1132,7 +1132,7 @@ visualize the data, drag fields from the Data window to the workspace view.

Example: Connect to Self-Describing Data in Tableau

    You can connect to self-describing data in Tableau in the following ways:

    @@ -1141,7 +1141,7 @@ visualize the data, drag fields from the Data window to the workspace view.

  • Use Tableau’s Custom SQL to query the self-describing data directly.
Option 1. Using a View to Connect to Self-Describing Data

    The following example describes how to create a view of an HBase table and connect to that view in Tableau 8.1. You can also use these steps to access @@ -1152,7 +1152,7 @@ data for other sources such as Hive, Parquet, JSON, TSV, and CSV.

    This example assumes that there is a schema named hbase that contains a table named s_voters and a schema named dfs.default that points to a writable location.

Step 1. Create a View and a DSN

    In this step, we will use the ODBC Administrator to access the Drill Explorer where we can create a view of an HBase table. Then, we will use the ODBC @@ -1206,7 +1206,7 @@ view.

  • Click OK to close the ODBC Data Source Administrator.

Step 2. Connect to the View from Tableau

    Now, we can connect to the view in Tableau.

    @@ -1229,7 +1229,7 @@ view.

  • In the Data Connection dialog, click Connect Live.
Step 3. Visualize the Data in Tableau

    Once you connect to the data in Tableau, the columns appear in the Data window. To visualize the data, drag fields from the Data window to the @@ -1239,7 +1239,7 @@ workspace view.

Option 2. Using Custom SQL to Access Self-Describing Data

    The following example describes how to use custom SQL to connect to a Parquet file and then visualize the data in Tableau 8.1. You can use the same steps to @@ -1250,7 +1250,7 @@ access data from other sources such as Hive, HBase, JSON, TSV, and CSV.

    This example assumes that there is a schema named dfs.default which contains a parquet file named region.parquet.

Step 1. Create a DSN to the Parquet File and Preview the Data

    In this step, we will create a DSN that accesses files on the DFS. We will also use Drill Explorer to preview the SQL that we want to use to connect to @@ -1286,7 +1286,7 @@ You can copy this query to file so that you can use it in Tableau.

  • Click OK to close the ODBC Data Source Administrator.
Step 2. Connect to a Parquet File in Tableau using Custom SQL

    Now, we can create a connection to the Parquet file using the custom SQL.

    @@ -1315,7 +1315,7 @@ You can copy this query to file so that you can use it in Tableau.
  • In the Data Connection dialog, click Connect Live.

Step 3. Visualize the Data in Tableau

    Once you connect to the data, the fields appear in the Data window. To visualize the data, drag fields from the Data window to the workspace view.

    http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/troubleshooting/index.html ---------------------------------------------------------------------- diff --git a/docs/troubleshooting/index.html b/docs/troubleshooting/index.html index 16dd6c6..08c8667 100644 --- a/docs/troubleshooting/index.html +++ b/docs/troubleshooting/index.html @@ -1157,7 +1157,7 @@ Symptom:

Access Nested Fields without Table Name/Alias

    Symptom:

       SELECT x.y …  
    @@ -1231,7 +1231,7 @@ Symptom:   

Solution: Make sure that the ODBC driver version is compatible with the server version. Driver installation instructions include how to check the driver version. Turn on ODBC driver debug logging to better understand the failure.

JDBC/ODBC Connection Issues with ZooKeeper

    Symptom: Client cannot resolve ZooKeeper host names for JDBC/ODBC.

    @@ -1255,13 +1255,13 @@ Turn on ODBC driver debug logging to better understand failure.

    Solution: Verify that the column alias does not conflict with the storage type. See Lexical Structures.

List (Array) Contains Null

    Symptom: UNSUPPORTED_OPERATION ERROR: Null values are not supported in lists by default.

    Solution: Avoid selecting fields that are arrays containing nulls. Change Drill session settings to enable all_text_mode. Set store.json.all_text_mode to true, so Drill treats JSON null values as a string containing the word 'null'.
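A minimal sketch of that session change:

    ALTER SESSION SET `store.json.all_text_mode` = true;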

SELECT COUNT (*) Takes a Long Time to Run

    Solution: In some cases, the underlying storage format does not have a built-in capability to return a count of the records in a table. In these cases, Drill does a full scan of the data to determine the number of records.

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/tutorial-develop-a-simple-function/index.html ---------------------------------------------------------------------- diff --git a/docs/tutorial-develop-a-simple-function/index.html b/docs/tutorial-develop-a-simple-function/index.html index f50185f..2df2ace 100644 --- a/docs/tutorial-develop-a-simple-function/index.html +++ b/docs/tutorial-develop-a-simple-function/index.html
    Step 1: Add dependencies

    First, add the following Drill dependency to your Maven project (the artifact is drill-java-exec; the version shown is a placeholder, set it to match your Drill installation):

     <dependency>
         <groupId>org.apache.drill.exec</groupId>
         <artifactId>drill-java-exec</artifactId>
         <!-- set the version to match your Drill installation -->
         <version>${drill.version}</version>
     </dependency>

    Step 2: Add annotations to the function template

    To start implementing the DrillSimpleFunc interface, add annotations to the @FunctionTemplate declaration that give the function its SQL name and define its scope and null-handling behavior.

    Step 3: Declare input parameters

    The function will be generated dynamically, as you can see in DrillSimpleFuncHolder; the input parameters and the output value are defined through holder annotations. Define the input parameters using the @Param annotation.

    Step 4: Declare the return value type

    Also, using the @Output annotation, define the returned value as a VarCharHolder. Because you are manipulating a VarChar, you also have to inject a buffer that Drill uses for the output.

    public class SimpleMaskFunc implements DrillSimpleFunc {

    Step 5: Implement the eval() method

    The MASK function does not require any setup, so you do not need to define the setup() method. Define only the eval() method.

    public void eval() {
         </executions>
     </plugin>
     
    Add a drill-module.conf File to Resources

    Add a drill-module.conf file in the resources folder of your project. The presence of this file tells Drill that your jar contains a custom function. In drill-module.conf, add a line that registers your function's package with Drill's classpath scanning.
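
    Once the jar is built and added to Drill's classpath, a query along these lines exercises the function (a sketch: the three-argument MASK signature follows this tutorial's example, and cp.`employee.json` is sample data bundled with Drill):

       SELECT MASK(first_name, '*', 3) AS masked_name
       FROM cp.`employee.json`
       LIMIT 5;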

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/useful-research/index.html ---------------------------------------------------------------------- diff --git a/docs/useful-research/index.html b/docs/useful-research/index.html index 168cce9..7d4a0c6 100644 --- a/docs/useful-research/index.html +++ b/docs/useful-research/index.html
  • Design Proposal for Drill: http://www.slideshare.net/CamuelGilyadov/apache-drill-14071739

    Dazo (second generation OpenDremel)

    Code generation / Physical plan generation

    • http://www.vldb.org/pvldb/vol4/p539-neumann.pdf (SLIDES: http://www.vldb.org/2011/files/slides/research9/rSession9-3.pdf)
http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/using-apache-drill-with-tableau-9-desktop/index.html ---------------------------------------------------------------------- diff --git a/docs/using-apache-drill-with-tableau-9-desktop/index.html b/docs/using-apache-drill-with-tableau-9-desktop/index.html index eaf1531..c1aa412 100644 --- a/docs/using-apache-drill-with-tableau-9-desktop/index.html +++ b/docs/using-apache-drill-with-tableau-9-desktop/index.html
      Step 1: Install and Configure the MapR Drill ODBC Driver

      Drill uses standard ODBC connectivity to provide easy data-exploration capabilities on complex, schema-less data sets. For the best experience, use the latest release of Apache Drill. For Tableau 9.0 Desktop, Drill version 0.9 or higher is recommended.

      Step 2: Install the Tableau Data-connection Customization (TDC) File

      The MapR Drill ODBC Driver includes a file named MapRDrillODBC.TDC. The TDC file includes customizations that improve ODBC configuration and performance when using Tableau. The MapR Drill ODBC Driver installer automatically installs the TDC file if the installer can find the Tableau installation. If you installed the MapR Drill ODBC Driver first and then installed Tableau, the TDC file is not installed automatically, and you need to install the TDC file manually.


      Step 3: Connect Tableau to Drill via ODBC

      Complete the following steps to configure an ODBC data connection:

      Tableau is now connected to Drill, and you can select various tables and views.
      Step 4: Query and Analyze the Data

      Tableau Desktop can now use Drill to query various data sources and visualize the information.

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/using-apache-drill-with-tableau-9-server/index.html ---------------------------------------------------------------------- diff --git a/docs/using-apache-drill-with-tableau-9-server/index.html b/docs/using-apache-drill-with-tableau-9-server/index.html index f6f2e17..c03ae6e 100644 --- a/docs/using-apache-drill-with-tableau-9-server/index.html +++ b/docs/using-apache-drill-with-tableau-9-server/index.html
      Step 1: Install and Configure the MapR Drill ODBC Driver

      Drill uses standard ODBC connectivity to provide easy data-exploration capabilities on complex, schema-less data sets. For the best experience, use the latest release of Apache Drill. For Tableau 9.0 Server, Drill version 0.9 or higher is recommended.

      Step 2: Install the Tableau Data-connection Customization (TDC) File

      The MapR Drill ODBC Driver includes a file named MapRDrillODBC.TDC. The TDC file includes customizations that improve ODBC configuration and performance when using Tableau.

      Step 3: Publish Tableau Visualizations and Data Sources

      For collaboration purposes, you can now use Tableau Desktop to publish data sources and visualizations on Tableau Server.

http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/using-jdbc-with-squirrel-on-windows/index.html ---------------------------------------------------------------------- diff --git a/docs/using-jdbc-with-squirrel-on-windows/index.html b/docs/using-jdbc-with-squirrel-on-windows/index.html index 5fd2770..cbf6973 100644 --- a/docs/using-jdbc-with-squirrel-on-windows/index.html +++ b/docs/using-jdbc-with-squirrel-on-windows/index.html
      Step 1: Getting the Drill JDBC Driver

      The Drill JDBC Driver JAR file must exist in a directory on your Windows machine in order to configure the driver in the SQuirreL client.

      You can locate the driver in the following directory:
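
      For example, in a typical Drill installation the JDBC-all jar lives under the jars/jdbc-driver directory (a sketch; substitute your own installation directory and version):

         <drill_installation_directory>/jars/jdbc-driver/drill-jdbc-all-<version>.jar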


      Step 2: Installing and Starting SQuirreL

      To install and start SQuirreL, complete the following steps:

      Step 3: Adding the Drill JDBC Driver to SQuirreL

      To add the Drill JDBC Driver to SQuirreL, define the driver and create a database alias. The alias is a specific instance of the driver configuration. SQuirreL uses the driver definition and alias to connect to Drill so you can access data sources that you have registered with Drill.

      A. Define the Driver

      To define the Drill JDBC Driver, complete the following steps:


      drill query flow

      B. Create an Alias

      To create an alias, complete the following steps:
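
      The key field in the alias is the connection URL, which follows the Drill JDBC URL format. A sketch, with assumed ZooKeeper hosts, port, and the default cluster ID:

         jdbc:drill:zk=zk1.example.com:2181,zk2.example.com:2181/drill/drillbits1

      For a Drillbit running in embedded mode on the local machine, jdbc:drill:zk=local can be used instead.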



      Step 4: Running a Drill Query from SQuirreL

      Once you have SQuirreL successfully connected to your cluster through the Drill JDBC Driver, you can issue queries from the SQuirreL client.
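
      For a quick test, you can run a simple query against the sample data bundled with Drill (assumed here; any data source registered with Drill works):

         SELECT * FROM cp.`employee.json` LIMIT 20;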


http://git-wip-us.apache.org/repos/asf/drill-site/blob/90078fe1/docs/using-microstrategy-analytics-with-apache-drill/index.html ---------------------------------------------------------------------- diff --git a/docs/using-microstrategy-analytics-with-apache-drill/index.html b/docs/using-microstrategy-analytics-with-apache-drill/index.html index bbe581d..695f466 100644 --- a/docs/using-microstrategy-analytics-with-apache-drill/index.html +++ b/docs/using-microstrategy-analytics-with-apache-drill/index.html

      Step 1: Install and Configure the MapR Drill ODBC Driver

      Drill uses standard ODBC connectivity to provide easy data-exploration capabilities on complex, schema-less data sets. Verify that the ODBC driver version that you download matches the Apache Drill version that you use. Ideally, you should upgrade to the latest versions of both Apache Drill and the MapR Drill ODBC Driver.

      Step 2: Install the Drill Object on MicroStrategy Analytics Enterprise

      The steps listed in this section are based on the MicroStrategy Technote for installing DBMS objects, available from the MicroStrategy Knowledge Base.

      Step 3: Create the MicroStrategy database connection for Apache Drill

      Complete the following steps to use the Database Instance Wizard to create the MicroStrategy database connection for Apache Drill:

      Step 4: Query and Analyze the Data

      This step includes an example scenario that shows you how to use MicroStrategy, with Drill as the database instance, to analyze Twitter data stored as complex JSON documents.


      The Drill distributed file system plugin is configured to read Twitter data in a directory structure. A view is created in Drill to capture the most relevant maps, nested maps, and arrays from the Twitter JSON documents. Refer to Query Data for more information about how to configure and use Drill to work with complex data.
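
      Such a view might be defined along the following lines (a sketch only; the workspace, directory path, and field names are assumptions, not the tutorial's actual view definition):

         CREATE VIEW dfs.tmp.tweets_view AS
         SELECT t.`user`.`screen_name` AS screen_name,
                t.`text` AS tweet_text,
                FLATTEN(t.`entities`.`hashtags`) AS hashtags
         FROM dfs.`/data/twitter` t;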

      Part 1: Create a Project

      Complete the following steps to create a project:

    • Click OK. The new project is created in MicroStrategy Developer.

      Part 2: Create a Freeform Report to Analyze Data

      Complete the following steps to create a Freeform Report and analyze data: