drill-commits mailing list archives

From bridg...@apache.org
Subject [2/2] drill git commit: DRILL-3489
Date Fri, 17 Jul 2015 01:14:17 GMT
DRILL-3489

add units info--bucket per Steven

remove auto per twitter complaint

formatting

minor edit

more formatting

fill in more agg fcns

remove interfaces per DB

add agg func reference

Bridget's stuff


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/73fe89d8
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/73fe89d8
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/73fe89d8

Branch: refs/heads/gh-pages
Commit: 73fe89d8ea5daf1bf6b958f5f2ee17ed02c8697c
Parents: 2e5f3dd
Author: Kristine Hahn <khahn@maprtech.com>
Authored: Mon Jul 13 11:55:41 2015 -0700
Committer: Kristine Hahn <khahn@maprtech.com>
Committed: Thu Jul 16 18:08:49 2015 -0700

----------------------------------------------------------------------
 ...ser-impersonation-with-hive-authorization.md | 281 ++++++++++---------
 .../010-configuration-options-introduction.md   |   5 +-
 .../010-connect-a-data-source-introduction.md   |   2 +-
 .../035-plugin-configuration-basics.md          |  14 +-
 .../050-json-data-model.md                      |   9 +-
 .../060-text-files-csv-tsv-psv.md               |  20 +-
 _docs/getting-started/010-drill-introduction.md |   2 +-
 .../020-tableau-examples.md                     |   5 +-
 .../performance-tuning/020-partition-pruning.md |  15 +-
 .../sql-commands/035-partition-by-clause.md     |   4 +-
 .../030-date-time-functions-and-arithmetic.md   |  20 +-
 .../050-aggregate-and-aggregate-statistical.md  | 274 ++++++++++++------
 12 files changed, 370 insertions(+), 281 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/configure-drill/076-configuring-user-impersonation-with-hive-authorization.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/076-configuring-user-impersonation-with-hive-authorization.md b/_docs/configure-drill/076-configuring-user-impersonation-with-hive-authorization.md
old mode 100644
new mode 100755
index 678f0b3..37158c6
--- a/_docs/configure-drill/076-configuring-user-impersonation-with-hive-authorization.md
+++ b/_docs/configure-drill/076-configuring-user-impersonation-with-hive-authorization.md
@@ -4,29 +4,27 @@ parent: "Configure Drill"
 ---
 As of Drill 1.1, you can enable impersonation in Drill and configure authorization in Hive version 1.0 to authorize access to metadata in the Hive metastore repository and data in the Hive warehouse. Impersonation allows a service to act on behalf of a client while performing the action requested by the client. See [Configuring User Impersonation]({{site.baseurl}}/docs/configuring-user-impersonation).
 
-There are two types of Hive authorizations that you can configure to work with impersonation in Drill: SQL standard based or storage based authorization.
+There are two types of Hive authorizations that you can configure to work with impersonation in Drill: SQL standard based and storage based authorization.  
 
+## SQL Standard Based Authorization  
+
+You can configure Hive SQL standard based authorization in Hive version 1.0 to work with impersonation in Drill 1.1. The SQL standard based authorization model can control which users have access to columns, rows, and views. Users with the appropriate permissions can issue the GRANT and REVOKE statements to manage privileges from Hive.
+
+For more information, see [SQL Standard Based Hive Authorization](https://cwiki.apache.org/confluence/display/HELIX/SQL+Standard+Based+Hive+Authorization).  
 
 ## Storage Based Authorization  
   
-You can configure Hive storage-based authorization in Hive version 1.0 to work with impersonation in Drill 1.1. Hive storage-based authorization is a remote metastore server security feature that uses the underlying file system permissions to determine permissions on databases, tables, and partitions. The unit style read/write permissions or ACLs a user or group has on directories in the file system determine access to data. Because the file system controls access at the directory and file level, storage-based authorization cannot control access to data at the column or view level.
+You can configure Hive storage based authorization in Hive version 1.0 to work with impersonation in Drill 1.1. Hive storage based authorization is a remote metastore server security feature that uses the underlying file system permissions to determine permissions on databases, tables, and partitions. The Unix style read/write permissions or ACLs that a user or group has on directories in the file system determine access to data. Because the file system controls access at the directory and file level, storage based authorization cannot control access to data at the column or view level.
 
-You manage user and group privileges through permissions and ACLs in the distributed file system. You manage authorizations through the remote metastore server.
+You manage user and group privileges through permissions and ACLs in the distributed file system. You manage storage based authorization through the remote metastore server to authorize access to data and metadata.
 
-DDL statements that manage permissions, such as GRANT and REVOKE, do not have any effect on permissions in the storage based authorization model.
+DDL statements that manage permissions, such as GRANT and REVOKE, do not affect permissions in the storage based authorization model.
 
 For more information, see [Storage Based Authorization in the Metastore Server](https://cwiki.apache.org/confluence/display/Hive/Storage+Based+Authorization+in+the+Metastore+Server).  
 
-## SQL Standard Based Authorization  
-
-You can configure Hive SQL standard based authorization in Hive version 1.0 to work with impersonation in Drill 1.1. The SQL standard based authorization model can control which users have access to columns, rows, and views. Users with the appropriate permissions can issue the GRANT and REVOKE statements to manage privileges from Hive.
-
-For more information, see [SQL Standard Based Hive Authorization](https://cwiki.apache.org/confluence/display/HELIX/SQL+Standard+Based+Hive+Authorization).  
-
-
 ## Configuration  
 
-Once you determine the Hive authorization model that you want to implement, enable impersonation in Drill. Update hive-site.xml with the relevant parameters for the authorization type. Modify the Hive storage plugin instance in Drill with the relevant settings for the authorization type.  
+Once you determine the Hive authorization model that you want to implement, enable impersonation in Drill, update the `hive-site.xml` file with the relevant parameters for the authorization type, and modify the Hive storage plugin configuration in Drill with the relevant properties for the authorization type.  
 
 ### Prerequisites  
 
@@ -36,33 +34,21 @@ Once you determine the Hive authorization model that you want to implement, enab
 
 ## Step 1: Enabling Drill Impersonation  
 
-Complete the following steps on each Drillbit node to enable user impersonation, and set the [maximum number of chained user hops]({{site.baseurl}}/docs/configuring-user-impersonation/#chained-impersonation) that Drill allows:  
+Modify `<DRILL_HOME>/conf/drill-override.conf` on each Drill node to include the required properties, set the [maximum number of chained user hops]({{site.baseurl}}/docs/configuring-user-impersonation/#chained-impersonation), and restart the Drillbit process.
 
-1. Navigate to `<drill_installation_directory>/conf/` and edit `drill-override.conf`.
-2. Under `drill.exe`, add the following:
+1. Add the following properties to the `drill.exec` block in `drill-override.conf`:  
 
-          drill.exec.impersonation: {
-                enabled: true,
+          drill.exec: {
+           cluster-id: "<drill_cluster_name>",
+           zk.connect: "<hostname>:<port>,<hostname>:<port>,<hostname>:<port>"
+           impersonation: {
+                 enabled: true,
                  max_chained_user_hops: 3
-          }
-
-3. Verify that enabled is set to `"true"`.
-4. Set the maximum number of chained user hops that you want Drill to allow.
-5. (MapR clusters only) Add the following lines to the `drill-env.sh` file:
-   * If the underlying file system is not secure, add the following line:
-   ` export MAPR_IMPERSONATION_ENABLED=true`
-   * If the underlying file system has MapR security enabled, add the following line:
-    `export MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket`  
-   * If you are implementing Hive SQL standard based authorization, and you are running Drill     and Hive in a secure MapR cluster, add the following lines:  
-        `export DRILL_JAVA_OPTS="$DRILL_JAVA_OPTS -Dmapr_sec_enabled=true -Dhadoop.login=maprsasl -Dzookeeper.saslprovider=com.mapr.security.maprsasl.MaprSaslProvider -Dmapr.library.flatclass"`  
-       `export MAPR_IMPERSONATION_ENABLED=true`  
-       `export MAPR_TICKETFILE_LOCATION=/opt/mapr/conf/mapruserticket`
-
-6. Restart the Drillbit process on each Drill node.
-   * In a MapR cluster, run the following command:
-    `maprcli node services -name drill-bits -action restart -nodes <hostname> -f`
-   * In a non-MapR environment, run the following command:  
-     `<DRILLINSTALL_HOME>/bin/drillbit.sh restart`  
+            }
+           }  
+
+2. Issue the following command to restart the Drillbit process on each Drill node:  
+`<DRILLINSTALL_HOME>/bin/drillbit.sh restart`  
 
 ##  Step 2:  Updating hive-site.xml  
 
@@ -73,7 +59,7 @@ Update hive-site.xml with the parameters specific to the type of authorization t
 Add the following required authorization parameters in hive-site.xml to configure storage based authorization:  
 
 **hive.metastore.pre.event.listeners**  
-**Description:** Turns on metastore-side security.  
+**Description:** Enables metastore security.  
 **Value:** org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener  
 
 **hive.security.metastore.authorization.manager**  
@@ -85,57 +71,75 @@ Add the following required authorization parameters in hive-site.xml to configur
 **Value:** org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator  
 
 **hive.security.metastore.authorization.auth.reads**  
-**Description:** Tells Hive metastore authorization checks for read access.  
+**Description:** When enabled, Hive metastore authorization also checks for read access.  
 **Value:** true  
 
 **hive.metastore.execute.setugi**  
-**Description:** Causes the metastore to execute file system operations using the client's reported user and group permissions. You must set this property on both the client and server sides. If client sets it to true and server sets it to false, the client setting is ignored.  
+**Description:** When enabled, this property causes the metastore to execute DFS operations using the client's reported user and group permissions. This property must be set on both the client and server sides. If the client and server settings differ, the client setting is ignored.  
 **Value:** true 
 
 **hive.server2.enable.doAs**  
-**Description:** Tells HiveServer2 to execute Hive operations as the user making the calls.  
-**Value:** true 
+**Description:** Tells HiveServer2 to execute Hive operations as the user submitting the query. Must be set to true for the storage based model.  
+**Value:** true
+
+
+
+### Example of hive-site.xml configuration with the required properties for storage based authorization 
+
+       <configuration>
+         <property>
+           <name>hive.metastore.uris</name>
+           <value>thrift://10.10.100.120:9083</value>    
+         </property>  
+       
+         <property>
+           <name>javax.jdo.option.ConnectionURL</name>
+           <value>jdbc:derby:;databaseName=/opt/hive/hive-1.0/bin/metastore_db;create=true</value>    
+         </property>
+       
+         <property>
+           <name>javax.jdo.option.ConnectionDriverName</name>
+           <value>org.apache.derby.jdbc.EmbeddedDriver</value>    
+         </property>
+       
+         <property>
+           <name>hive.metastore.pre.event.listeners</name>
+           <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
+         </property>
+       
+         <property>
+           <name>hive.security.metastore.authenticator.manager</name>
+           <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
+         </property>
+       
+         <property>
+           <name>hive.security.metastore.authorization.manager</name>
+           <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
+         </property>
+       
+         <property>
+           <name>hive.security.metastore.authorization.auth.reads</name>
+           <value>true</value>
+         </property>
+       
+         <property>
+           <name>hive.metastore.execute.setugi</name>
+           <value>true</value>
+         </property>
+       
+         <property>
+           <name>hive.server2.enable.doAs</name>
+           <value>true</value>
+         </property>
+       </configuration>
 
 
-
-### Example hive-site.xml Settings for Storage Based Authorization  
-
-       <property>
-          <name>hive.metastore.pre.event.listeners</name>
-          <value>org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener</value>
-        </property>
-        
-        <property>
-          <name>hive.security.metastore.authenticator.manager</name>
-          <value>org.apache.hadoop.hive.ql.security.HadoopDefaultMetastoreAuthenticator</value>
-        </property>
-        
-        <property>
-          <name>hive.security.metastore.authorization.manager</name>
-          <value>org.apache.hadoop.hive.ql.security.authorization.StorageBasedAuthorizationProvider</value>
-        </property>
-        
-        <property>
-          <name>hive.security.metastore.authorization.auth.reads</name>
-          <value>true</value>
-        </property>
-        
-        <property>
-          <name>hive.metastore.execute.setugi</name>
-          <value>true</value>
-        </property>
-        
-        <property>
-          <name>hive.server2.enable.doAs</name>
-          <value>true</value>
-        </property>  
-
 ## SQL Standard Based Authorization  
 
 Add the following required authorization parameters in hive-site.xml to configure SQL standard based authorization:  
 
 **hive.security.authorization.enabled**  
-**Description:** Enables/disables Hive security authorization.   
+**Description:** Enables Hive security authorization.   
 **Value:** true 
 
 **hive.security.authenticator.manager**  
@@ -147,63 +151,78 @@ Add the following required authorization parameters in hive-site.xml to configur
 **Value:** org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory  
 
 **hive.server2.enable.doAs**  
-**Description:** Tells HiveServer2 to execute Hive operations as the user making the calls.   
-**Value:** false  
+**Description:** Tells HiveServer2 to execute Hive operations as the user submitting the query. Must be set to false for the SQL standard based model. 
+**Value:** false
 
 **hive.users.in.admin.role**  
 **Description:** A comma separated list of users which gets added to the ADMIN role when the metastore starts up. You can add more uses at any time. Note that a user who belongs to the admin role needs to run the "set role" command before getting the privileges of the admin role, as this role is not in the current roles by default.  
 **Value:** Set to the list of comma-separated users who need to be added to the admin role. 
 
 **hive.metastore.execute.setugi**  
-**Description:** Causes the metastore to execute file system operations using the client's reported user and group permissions. You must set this property on both the client and server side. If the client is set to true and the server is set to false, the client setting is ignored.  
-**Value:** false 
-
-### Example hive-site.xml Settings for SQL Standard Based Authorization   
+**Description:** In unsecure mode, setting this property to true causes the metastore to execute DFS operations using the client's reported user and group permissions. Note: This property must be set on both the client and server sides. This is a best effort property. If the client is set to true and the server is set to false, the client setting is ignored.  
+**Value:** false  
+  
 
-       <property>
-          <name>hive.security.authorization.enabled</name>
-          <value>true</value>
-        </property>
-        
-        <property>
-          <name>hive.security.authenticator.manager</name>
-          <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
-        </property>
+### Example of hive-site.xml configuration with the required properties for SQL standard based authorization         
         
-        <property>
-          <name>hive.security.authorization.manager</name>   
-          <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
-        </property>
-        
-        <property>
-          <name>hive.server2.enable.doAs</name>
-          <value>false</value>
-        </property>
-        
-        <property>
-          <name>hive.users.in.admin.role</name>
-          <value>userA</value>
-        </property>
-        
-        <property>
-          <name>hive.metastore.execute.setugi</name>
-          <value>false</value>
-        </property>  
+       <configuration>
+         <property>
+           <name>hive.metastore.uris</name>
+           <value>thrift://10.10.100.120:9083</value>    
+         </property> 
+
+         <property>
+           <name>javax.jdo.option.ConnectionURL</name>
+           <value>jdbc:derby:;databaseName=/opt/hive/hive-1.0/bin/metastore_db;create=true</value>    
+         </property>
+       
+         <property>
+           <name>javax.jdo.option.ConnectionDriverName</name>
+           <value>org.apache.derby.jdbc.EmbeddedDriver</value>    
+         </property>  
+
+         <property>
+           <name>hive.security.authorization.enabled</name>
+           <value>true</value>
+         </property>
+       
+         <property>
+           <name>hive.security.authenticator.manager</name>
+           <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
+         </property>       
+       
+         <property>
+           <name>hive.security.authorization.manager</name>   
+           <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
+         </property>
+       
+         <property>
+           <name>hive.server2.enable.doAs</name>
+           <value>false</value>
+         </property>
+       
+         <property>
+           <name>hive.users.in.admin.role</name>
+           <value>user</value>
+         </property>
+       
+         <property>
+           <name>hive.metastore.execute.setugi</name>
+           <value>false</value>
+         </property>    
+        </configuration>
 
 ## Step 3: Modifying the Hive Storage Plugin  
 
-Modify the Hive storage plugin instance in the Drill Web UI to include specific authorization settings. The Drillbit that you use to access the Web UI must be running. 
-
-{% include startnote.html %}The metastore host port for MapR is typically 9083.{% include endnote.html %}  
+Modify the Hive storage plugin configuration in the Drill Web UI to include specific authorization settings. The Drillbit that you use to access the Web UI must be running.  
 
 Complete the following steps to modify the Hive storage plugin:  
 
 1.  Navigate to `http://<drillbit_hostname>:8047`, and select the **Storage tab**.  
-2.  Click **Update** next to the hive instance.  
-3.  In the configuration window, add the configuration settings for the authorization type. If you are running Drill and Hive in a secure MapR cluster, do not include the line `"hive.metastore.sasl.enabled" : "false"`.  
-
+2.  Click **Update** next to "hive."  
+3.  In the configuration window, add the configuration properties for the authorization type.
   
-   * For storage based authorization, add the following settings:  
+   * For storage based authorization, add the following properties:  
 
               {
                type:"hive",
@@ -216,31 +235,23 @@ Complete the following steps to modify the Hive storage plugin:
                  "hive.metastore.execute.setugi" : "true"
                }
               }  
-   * For SQL standard based authorization, add the following settings:  
-
+   * For SQL standard based authorization, add the following properties:  
+
               {
                type:"hive",
                enabled: true,
                configProps : {
-                 "hive.metastore.uris"
-              : "thrift://<metastore_host>:<port>",
-                 "fs.default.name"
-              : "hdfs://<host>:<port>/",
-                 "hive.security.authorization.enabled"
-              : "true",
-                 "hive.security.authenticator.manager"
-              : "org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator",
-                 "hive.security.authorization.manager"
-              :
-              "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory",
-                 "hive.metastore.sasl.enabled"
-              : "false",
-                 "hive.server2.enable.doAs"
-              : "false",
-                 "hive.metastore.execute.setugi"
-              : "false"
+                 "hive.metastore.uris" : "thrift://<metastore_host>:9083",
+                 "fs.default.name" : "hdfs://<host>:<port>/",
+                 "hive.security.authorization.enabled" : "true",
+                 "hive.security.authenticator.manager" : "org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator",
+                 "hive.security.authorization.manager" : "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory",
+                 "hive.metastore.sasl.enabled" : "false",
+                 "hive.server2.enable.doAs" : "false",
+                 "hive.metastore.execute.setugi" : "false"
                }
               }
+              
 
 
 
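Aside: the hive-site.xml examples in the hunks above follow a fixed name/value shape, so a short script can sanity-check a file against the required properties before restarting the metastore. This is an illustrative sketch, not part of the commit; the property list is taken from the storage based authorization section, and the sample XML is made up:

```python
import xml.etree.ElementTree as ET

# Properties the storage based authorization section above requires.
REQUIRED = {
    "hive.metastore.pre.event.listeners":
        "org.apache.hadoop.hive.ql.security.authorization.AuthorizationPreEventListener",
    "hive.security.metastore.authorization.auth.reads": "true",
    "hive.metastore.execute.setugi": "true",
    "hive.server2.enable.doAs": "true",
}

def check_hive_site(xml_text):
    """Return the required properties that are missing or mis-set."""
    root = ET.fromstring(xml_text)
    found = {p.findtext("name"): p.findtext("value")
             for p in root.iter("property")}
    return {k: v for k, v in REQUIRED.items() if found.get(k) != v}

# Made-up, deliberately incomplete hive-site.xml fragment.
sample = """<configuration>
  <property><name>hive.metastore.execute.setugi</name><value>true</value></property>
</configuration>"""
print(sorted(check_hive_site(sample)))  # names of the properties still missing
```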

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md b/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
index 6aebb8c..4f00dd8 100644
--- a/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
+++ b/_docs/configure-drill/configuration-options/010-configuration-options-introduction.md
@@ -22,8 +22,8 @@ The sys.options table lists the following options that you can set as a system o
 | exec.java_compiler                             | DEFAULT          | Switches between DEFAULT, JDK, and JANINO mode for the current session. Uses Janino by default for generated source code of less than exec.java_compiler_janino_maxsize; otherwise, switches to the JDK compiler.                                                                                                                                                |
 | exec.java_compiler_debug                       | TRUE             | Toggles the output of debug-level compiler error messages in runtime generated code.                                                                                                                                                                                                                                                                             |
 | exec.java_compiler_janino_maxsize              | 262144           | See the exec.java_compiler option comment. Accepts inputs of type LONG.                                                                                                                                                                                                                                                                                          |
-| exec.max_hash_table_size                       | 1073741824       | Ending size for hash tables. Range: 0 - 1073741824.                                                                                                                                                                                                                                                                                                              |
-| exec.min_hash_table_size                       | 65536            | Starting size for hash tables. Increase according to available memory to improve performance. Increasing for very large aggregations or joins when you have large amounts of memory for Drill to use. Range: 0 - 1073741824.                                                                                                                                     |
+| exec.max_hash_table_size                       | 1073741824       | Ending size in buckets for hash tables. Range: 0 - 1073741824.                                                                                                                                                                                                                                                                                                   |
+| exec.min_hash_table_size                       | 65536            | Starting size in buckets for hash tables. Increase according to available memory to improve performance. Increase for very large aggregations or joins when you have large amounts of memory for Drill to use. Range: 0 - 1073741824.                                                                                                                           |
 | exec.queue.enable                              | FALSE            | Changes the state of query queues. False allows unlimited concurrent queries.                                                                                                                                                                                                                                                                                    |
 | exec.queue.large                               | 10               | Sets the number of large queries that can run concurrently in the cluster. Range: 0-1000                                                                                                                                                                                                                                                                         |
 | exec.queue.small                               | 100              | Sets the number of small queries that can run concurrently in the cluster. Range: 0-1001                                                                                                                                                                                                                                                                         |
@@ -79,5 +79,6 @@ The sys.options table lists the following options that you can set as a system o
 | store.parquet.compression                      | snappy           | Compression type for storing Parquet output. Allowed values: snappy, gzip, none                                                                                                                                                                                                                                                                                  |
 | store.parquet.enable_dictionary_encoding       | FALSE            | For internal use. Do not change.                                                                                                                                                                                                                                                                                                                                 |
 | store.parquet.use_new_reader                   | FALSE            | Not supported in this release.                                                                                                                                                                                                                                                                                                                                   |
+| store.partition.hash_distribute                | FALSE            | Uses a hash algorithm to distribute data on partition keys in a CTAS partitioning operation. An alpha option--for experimental use at this stage. Do not use in production systems.                                                                                                                                                                              |
 | store.text.estimated_row_size_bytes            | 100              | Estimate of the row size in a delimited text file, such as csv. The closer to actual, the better the query plan. Used for all csv files in the system/session where the value is set. Impacts the decision to plan a broadcast join or not.                                                                                                                      |
 | window.enable                                  | TRUE             | Enable or disable window functions in Drill 1.1 and later.                                                                                                                                                                                                                                                                                                       |
\ No newline at end of file
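Aside: the new store.partition.hash_distribute row describes hash-distributing rows on partition keys during a CTAS partitioning operation. A minimal sketch of the idea follows; the fragment count, key values, and hash function are stand-ins for illustration, not Drill's implementation:

```python
# Hash-distributing rows on a partition key: a deterministic hash sends
# every row with the same key to the same fragment, so each partition
# is written by a single writer. All values below are made up.
FRAGMENTS = 3

def fragment_for(key, fragments=FRAGMENTS):
    # Stand-in for a real hash function; stable across calls.
    return sum(key.encode()) % fragments

keys = ["2015-07-01", "2015-07-02", "2015-07-01", "2015-07-03"]
buckets = {}
for k in keys:
    buckets.setdefault(fragment_for(k), []).append(k)
```

Because the hash is deterministic, duplicate partition key values always land in the same bucket, which is the property the option relies on.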

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/connect-a-data-source/010-connect-a-data-source-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/connect-a-data-source/010-connect-a-data-source-introduction.md b/_docs/connect-a-data-source/010-connect-a-data-source-introduction.md
index 40bd15b..2ba46e3 100644
--- a/_docs/connect-a-data-source/010-connect-a-data-source-introduction.md
+++ b/_docs/connect-a-data-source/010-connect-a-data-source-introduction.md
@@ -2,7 +2,7 @@
 title: "Connect a Data Source Introduction"
 parent: "Connect a Data Source"
 ---
-A storage plugin is a software module interface for connecting Drill to data sources. A storage plugin typically optimizes execution of Drill queries, provides the location of the data, and configures the workspace and file formats for reading data. Several storage plugins are installed with Drill that you can configure to suit your environment. Through the storage plugin, Drill connects to a data source, such as a database, a file on a local or distributed file system, or a Hive metastore. 
+A storage plugin is a software module for connecting Drill to data sources. A storage plugin typically optimizes execution of Drill queries, provides the location of the data, and configures the workspace and file formats for reading data. Several storage plugins are installed with Drill that you can configure to suit your environment. Through the storage plugin, Drill connects to a data source, such as a database, a file on a local or distributed file system, or a Hive metastore. 
 
 You can modify the default configuration of a storage plugin X and give the new version a unique name Y. This document refers to Y as a different storage plugin, although it is actually just a reconfiguration of the original storage plugin. When you execute a query, Drill gets the storage plugin name in one of several ways:
 

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/connect-a-data-source/035-plugin-configuration-basics.md
----------------------------------------------------------------------
diff --git a/_docs/connect-a-data-source/035-plugin-configuration-basics.md b/_docs/connect-a-data-source/035-plugin-configuration-basics.md
index 8b89241..cfa7cde 100644
--- a/_docs/connect-a-data-source/035-plugin-configuration-basics.md
+++ b/_docs/connect-a-data-source/035-plugin-configuration-basics.md
@@ -94,13 +94,7 @@ The following table describes the attributes you configure for storage plugins i
     <td>"formats" . . . "delimiter"</td>
     <td>"\t"<br>","</td>
     <td>format-dependent</td>
-    <td>One or more characters that separate records in a delimited text file, such as CSV. Use a 4-digit hex ascii code syntax \uXXXX for a non-printable delimiter. </td>
-  </tr>
-  <tr>
-    <td>"formats" . . . "fieldDelimiter"</td>
-    <td>","</td>
-    <td>no</td>
-    <td>A single character that separates each value in a column of a delimited text file.</td>
+    <td>One or more characters that serve as a record separator in a delimited text file, such as CSV. Use the 4-digit hexadecimal ASCII code syntax \uXXXX for a non-printable delimiter. </td>
   </tr>
   <tr>
     <td>"formats" . . . "quote"</td>
@@ -112,7 +106,7 @@ The following table describes the attributes you configure for storage plugins i
     <td>"formats" . . . "escape"</td>
     <td>"`"</td>
     <td>no</td>
-    <td>A single character that escapes the quote character.</td>
+    <td>A single character that escapes a quotation mark inside a value.</td>
   </tr>
   <tr>
     <td>"formats" . . . "comment"</td>
@@ -124,7 +118,7 @@ The following table describes the attributes you configure for storage plugins i
     <td>"formats" . . . "skipFirstLine"</td>
     <td>true</td>
     <td>no</td>
-    <td>To include or omits the header when reading a delimited text file.
+    <td>Whether to skip the header line when reading a delimited text file. Set to true to avoid reading headers as data.
     </td>
   </tr>
 </table>
@@ -141,7 +135,7 @@ You can use the following attributes in the `formats` area of the storage plugin
 * quote  
 * skipFirstLine
 
-For more information and examples of using formats for text files, see ["Text Files: CSV, TSV, PSV"]({{site.baseurl}}{{site.baseurl}}/docs/text-files-csv-tsv-psv/)
+For more information and examples of using formats for text files, see ["Text Files: CSV, TSV, PSV"]({{site.baseurl}}/docs/text-files-csv-tsv-psv/).
 
 ## Using Other Attributes
 

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/data-sources-and-file-formats/050-json-data-model.md
----------------------------------------------------------------------
diff --git a/_docs/data-sources-and-file-formats/050-json-data-model.md b/_docs/data-sources-and-file-formats/050-json-data-model.md
index 1e68783..75f47b1 100644
--- a/_docs/data-sources-and-file-formats/050-json-data-model.md
+++ b/_docs/data-sources-and-file-formats/050-json-data-model.md
@@ -104,9 +104,12 @@ Drill performs the following actions, as shown in the complete [CTAS command exa
 
 ## Analyzing JSON
 
-Generally, you query JSON files using the following syntax, which includes a table alias. The alias is typically required for querying complex data:
+Generally, you query JSON files using the following syntax, which includes a table alias. The alias is sometimes required for querying complex data. Because a reference such as y.z is ambiguous (y could be a column or a table),
+Drill currently requires an explicit table prefix for referencing a field
+inside another field (t.y.z). The prefix is not required for the references y, y[z], or
+y[z].x because they are not ambiguous. Observe the following guidelines:
 
-* Dot notation to drill down into a JSON map.
+* Use dot notation to drill down into a JSON map.
 
         SELECT t.level1.level2. . . . leveln FROM <storage plugin location>`myfile.json` t
 
@@ -120,7 +123,7 @@ Generally, you query JSON files using the following syntax, which includes a tab
 
 Drill returns null when a document does not have the specified map or level.
 
-Using the following techniques, you can query complex, nested JSON:
+Use the following techniques to query complex, nested JSON:
 
 * Flatten nested data
 * Generate key/value pairs for loosely structured data
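
The effect of the two techniques above can be illustrated conceptually outside Drill. The following Python sketch (not Drill code; the sample record is hypothetical) mimics what Drill's FLATTEN and KVGEN functions do to nested JSON:

```python
# Conceptual sketch (not Drill's implementation) of FLATTEN and KVGEN
# applied to a hypothetical nested JSON record.

record = {"id": 1, "tags": ["a", "b"], "attrs": {"color": "red", "size": "M"}}

def flatten(rows, column):
    """Emit one output row per element of a list-valued column (like FLATTEN)."""
    for row in rows:
        for value in row[column]:
            out = dict(row)
            out[column] = value
            yield out

def kvgen(mapping):
    """Turn a map into a list of key/value pairs (like KVGEN)."""
    return [{"key": k, "value": v} for k, v in mapping.items()]

flattened = list(flatten([record], "tags"))
pairs = kvgen(record["attrs"])
print(flattened)
print(pairs)
```

Drill performs these operations inside queries; the sketch only shows the shape of the output.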

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/data-sources-and-file-formats/060-text-files-csv-tsv-psv.md
----------------------------------------------------------------------
diff --git a/_docs/data-sources-and-file-formats/060-text-files-csv-tsv-psv.md b/_docs/data-sources-and-file-formats/060-text-files-csv-tsv-psv.md
index 11ddb69..bd689be 100644
--- a/_docs/data-sources-and-file-formats/060-text-files-csv-tsv-psv.md
+++ b/_docs/data-sources-and-file-formats/060-text-files-csv-tsv-psv.md
@@ -29,29 +29,15 @@ You can also improve performance by casting the VARCHAR data to INT, FLOAT, DATE
 Using a distributed file system, such as HDFS, instead of a local file system to query the files also improves performance because currently Drill does not split files on block splits.
 
 ## Configuring Drill to Read Text Files
-In the storage plugin configuration, you can set the following attributes that affect how Drill reads CSV, TSV, PSV (comma-, tab-, pipe-separated) files.  
-
-* String lineDelimiter = "\n";  
-  One or more characters used to denote a new record. Allows reading files with windows line endings.  
-* char fieldDelimiter = ',';  
-  A single character used to separate each value.  
-* char quote = '"';  
-  A single character used to start/end a value enclosed in quotation marks.  
-* char escape = '"';  
-  A single character used to escape a quototation mark inside of a value.  
-* char comment = '#';  
-  A single character used to denote a comment line.  
-* boolean skipFirstLine = false;  
-  Set to true to avoid reading headers as data. 
-
-Set the `sys.options` property setting `exec.storage.enable_new_text_reader` to true (the default) before attempting to use these attributes:
+In the storage plugin configuration, you [set the attributes]({{site.baseurl}}/docs/plugin-configuration-basics/#list-of-attributes-and-definitions) that affect how Drill reads CSV, TSV, PSV (comma-, tab-, pipe-separated) files:  
 
 * comment  
 * escape  
-* fieldDeliimiter  
+* delimiter  
 * quote  
 * skipFirstLine
 
+Set the `sys.options` property `exec.storage.enable_new_text_reader` to true (the default) before attempting to use these attributes. 
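
As a rough illustration, the attributes above map onto the parameters of Python's `csv` module. This sketch is not Drill's reader; the sample text and variable names are invented, and Drill's `escape` attribute (which escapes a quotation mark inside a value) has no exact counterpart in this minimal example:

```python
import csv
import io

# Sketch (not Drill's text reader): how delimiter, quote, comment, and
# skipFirstLine affect parsing a small CSV with a header and a comment line.
text = 'name,qty\n# comment line\nwidget,3\n"a,b",7\n'

skip_first_line = True   # skipFirstLine: drop the header row
comment = "#"            # comment: lines starting with this are ignored
lines = [ln for ln in io.StringIO(text) if not ln.startswith(comment)]

reader = csv.reader(lines, delimiter=",", quotechar='"')  # delimiter, quote
rows = list(reader)
if skip_first_line:
    rows = rows[1:]
print(rows)  # [['widget', '3'], ['a,b', '7']]
```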
 
 ## Examples of Querying Text Files
 The examples in this section show the results of querying CSV files that use and do not use a header, include comments, and use an escape character:

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/getting-started/010-drill-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/getting-started/010-drill-introduction.md b/_docs/getting-started/010-drill-introduction.md
index 31c9ed0..0a3538b 100644
--- a/_docs/getting-started/010-drill-introduction.md
+++ b/_docs/getting-started/010-drill-introduction.md
@@ -14,7 +14,7 @@ with existing Apache Hive and Apache HBase deployments.
 Many enhancements in Apache Drill 1.1 include the following key features:
 
 * [SQL window functions]({{site.baseurl}}/docs/sql-window-functions)
-* [Automatic partitioning]({{site.baseurl}}) using the new [PARTITION BY]({{site.baseurl}}/docs/partition-by-clause) clause in the CTAS command
+* [Partitioning data]({{site.baseurl}}) using the new [PARTITION BY]({{site.baseurl}}/docs/partition-by-clause) clause in the CTAS command
 * [Delegated Hive impersonation]({{site.baseurl}}/docs/configuring-user-impersonation-with-hive-authorization/)
 * Support for UNION and UNION ALL and better optimized plans that include UNION.
 

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/odbc-jdbc-interfaces/using-drill-with-bi-tools/020-tableau-examples.md
----------------------------------------------------------------------
diff --git a/_docs/odbc-jdbc-interfaces/using-drill-with-bi-tools/020-tableau-examples.md b/_docs/odbc-jdbc-interfaces/using-drill-with-bi-tools/020-tableau-examples.md
old mode 100644
new mode 100755
index a69dd7a..10ea9df
--- a/_docs/odbc-jdbc-interfaces/using-drill-with-bi-tools/020-tableau-examples.md
+++ b/_docs/odbc-jdbc-interfaces/using-drill-with-bi-tools/020-tableau-examples.md
@@ -35,7 +35,7 @@ In this step, we will create a DSN that accesses a Hive table.
      The *MapR Drill ODBC Driver DSN Setup* window appears.
   4. Enter a name for the data source.
   5. Specify the connection type based on your requirements. The connection type provides the DSN access to Drill Data Sources.  
-In this example, we are connecting to a Zookeeper Quorum.
+In this example, we are connecting to a Zookeeper Quorum. Verify that the Cluster ID that you use matches the Cluster ID in `<DRILL_HOME>/conf/drill-override.conf`.
   6. In the **Schema** field, select the Hive schema.
      In this example, the Hive schema is named hive.default.
      ![]({{ site.baseurl }}/docs/img/Hive_DSN.png)
@@ -225,6 +225,7 @@ Now, we can create a connection to the Parquet file using the custom SQL.
      The *Generic ODBC Connection* dialog appears.
   3. In the *Connect Using* section, select the DSN that connects to the data source.  
      In this example, Files-DrillDataSources was selected.
+     If you do not see the DSN, close and re-open Tableau.
   4. In the *Schema* section, select the schema associated with the data source.  
      In this example, dfs.default was selected.
   5. In the *Table* section, select **Custom SQL**.
@@ -238,7 +239,7 @@ Now, we can create a connection to the Parquet file using the custom SQL.
      {% include startnote.html %}The path to the file depends on its location in your file system.{% include endnote.html %} 
 
   7. Click **OK** to complete the connection.  
-     ![]({{ site.baseurl }}/docs/img/ODBC_CustomSQL.png)
+     ![]({{ site.baseurl }}/docs/img/ODBC_CustomSQL.png)  
   8. In the *Data Connection dialog*, click **Connect Live**.
 
 #### Step 3. Visualize the Data in Tableau

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/performance-tuning/020-partition-pruning.md
----------------------------------------------------------------------
diff --git a/_docs/performance-tuning/020-partition-pruning.md b/_docs/performance-tuning/020-partition-pruning.md
index 5a5669e..c86620c 100755
--- a/_docs/performance-tuning/020-partition-pruning.md
+++ b/_docs/performance-tuning/020-partition-pruning.md
@@ -9,18 +9,17 @@ The query planner in Drill performs partition pruning by evaluating the filters.
 
 ## How to Partition Data
 
-You can partition data manually or automatically to take advantage of partition pruning in Drill. In Drill 1.0 and earlier, you need to organize your data in such a way to take advantage of partition pruning. In Drill 1.1.0 and later, if the data source is Parquet, you can partition data automatically using CTAS--no data organization tasks required. 
+In Drill 1.1.0 and later, if the data source is Parquet, no data organization tasks are required to take advantage of partition pruning. Write Parquet data using the [PARTITION BY]({{site.baseurl}}/docs/partition-by-clause/) clause in the CTAS statement. 
 
-## Automatic Partitioning
-Automatic partitioning in Drill 1.1 and later occurs when you write Parquet data using the [PARTITION BY]({{site.baseurl}}/docs/partition-by-clause/) clause in the CTAS statement. Unlike manual partitioning, no view is required, nor is it necessary to use the [dir* variables]({{site.baseurl}}/docs/querying-directories). The Parquet writer first sorts by the partition keys, and then creates a new file when it encounters a new value for the partition columns.
-
-Automatic partitioning creates separate files, but not separate directories, for different partitions. Each file contains exactly one partition value, but there can be multiple files for the same partition value.
+The Parquet writer first sorts data by the partition keys, and then creates a new file when it encounters a new value for the partition columns. During partitioning, Drill creates separate files, but not separate directories, for different partitions. Each file contains exactly one partition value, but there can be multiple files for the same partition value. 
 
 Partition pruning uses the Parquet column statistics to determine which columns to use to prune. 
 
-## Manual Partitioning
+Unlike Drill 1.0 partitioning, partitioning with the Drill 1.1 PARTITION BY clause in a CTAS statement requires no subsequent view query and no [dir* variables]({{site.baseurl}}/docs/querying-directories). 
+
+## Drill 1.0 Partitioning
 
-Manual partitioning is directory-based. You perform the following steps to manually partition data.   
+You perform the following steps to partition data in Drill 1.0.   
  
 1. Devise a logical way to store the data in a hierarchy of directories. 
 2. Use CTAS to create Parquet files from the original data, specifying filter conditions.
@@ -28,7 +27,7 @@ Manual partitioning is directory-based. You perform the following steps to manua
 
 After partitioning the data, you need to create a view of the partitioned data to query the data. You can use the [dir* variables]({{site.baseurl}}/docs/querying-directories) in queries to refer to subdirectories in your workspace path.
  
-### Manual Partitioning Example
+### Drill 1.0 Partitioning Example
 
 Suppose you have text files containing several years of log data. To partition the data by year and quarter, create the following hierarchy of directories:  
        

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/sql-reference/sql-commands/035-partition-by-clause.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-commands/035-partition-by-clause.md b/_docs/sql-reference/sql-commands/035-partition-by-clause.md
index 785d00d..5bf97d6 100644
--- a/_docs/sql-reference/sql-commands/035-partition-by-clause.md
+++ b/_docs/sql-reference/sql-commands/035-partition-by-clause.md
@@ -2,7 +2,7 @@
 title: "PARTITION BY Clause"
 parent: "SQL Commands"
 ---
-The PARTITION BY clause in the CTAS command automatically partitions data, which Drill [prunes]({{site.baseurl}}/docs/partition-pruning/) to improve performance when you query the data. (Drill 1.1.0)
+The PARTITION BY clause in the CTAS command partitions data, which Drill [prunes]({{site.baseurl}}/docs/partition-pruning/) to improve performance when you query the data. (Drill 1.1.0)
 
 ## Syntax
 
@@ -10,7 +10,7 @@ The PARTITION BY clause in the CTAS command automatically partitions data, which
 
 The PARTITION BY clause partitions the data by the first column_name, and then subpartitions the data by the next column_name, if there is one, and so on. 
 
-Only the Parquet storage format is supported for automatic partitioning. Before using CTAS, [set the `store.format` option]({{site.baseurl}}/docs/create-table-as-ctas/#setting-the-storage-format) for the table to Parquet.
+Only the Parquet storage format is supported for partitioning. Before using CTAS, [set the `store.format` option]({{site.baseurl}}/docs/create-table-as-ctas/#setting-the-storage-format) for the table to Parquet.
 
 When the base table in the SELECT statement is schema-less, include columns in the PARTITION BY clause in the table's column list, or use a select all (SELECT *) statement:  
 

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md b/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
index 32ccc8e..a03f9c7 100644
--- a/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
+++ b/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
@@ -523,20 +523,21 @@ Is the time 2:00 PM?
 
 ## UNIX_TIMESTAMP
 
- Returns UNIX Epoch time, which is the number of seconds elapsed since January 1, 1970.
+Returns UNIX Epoch time, which is the number of seconds elapsed since January 1, 1970.
 
- ### UNIX_TIMESTAMP Syntax
+### UNIX_TIMESTAMP Syntax
 
-UNIX_TIMESTAMP()
-UNIX_TIMESTAMP(string date)
-UNIX_TIMESTAMP(string date, string pattern)
+    UNIX_TIMESTAMP()  
+    UNIX_TIMESTAMP(string date)  
+    UNIX_TIMESTAMP(string date, string pattern)  
 
 These functions perform the following operations, respectively:
 
-* Gets current Unix timestamp in seconds if given no arguments. 
-* Converts the time string in format yyyy-MM-dd HH:mm:ss to a Unix timestamp in seconds using the default timezone and locale.
-* Converts the time string with the given pattern to a Unix time stamp in seconds.
+* Gets current Unix timestamp in seconds if given no arguments.  
+* Converts the time string in format yyyy-MM-dd HH:mm:ss to a Unix timestamp in seconds using the default timezone and locale.  
+* Converts the time string with the given pattern to a Unix time stamp in seconds.  
 
+```
 SELECT UNIX_TIMESTAMP FROM sys.version;
 +-------------+
 |   EXPR$0    |
@@ -567,4 +568,5 @@ SELECT UNIX_TIMESTAMP('2015-05-29 08:18:53.0', 'yyyy-MM-dd HH:mm:ss.SSS') FROM s
 +-------------+
 | 1432912733  |
 +-------------+
-1 row selected (0.171 seconds)
\ No newline at end of file
+1 row selected (0.171 seconds)
+```
\ No newline at end of file
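
The three behaviors can be mimicked in Python (a sketch, not Drill internals). Python uses `%Y-%m-%d`-style format codes rather than Drill's `yyyy-MM-dd`-style patterns, and the time zone is pinned to UTC here for determinism, whereas Drill uses the session's default time zone (the 1432912733 result above reflects a UTC-7 session):

```python
import time
from datetime import datetime, timezone

# Sketch of UNIX_TIMESTAMP's three behaviors (not Drill internals).
# UTC is pinned so the converted value is deterministic.
def unix_timestamp(date_string=None, pattern="%Y-%m-%d %H:%M:%S"):
    if date_string is None:
        return int(time.time())                   # UNIX_TIMESTAMP(): now
    dt = datetime.strptime(date_string, pattern)  # parse with the pattern
    return int(dt.replace(tzinfo=timezone.utc).timestamp())

# 1432887533 in UTC; a UTC-7 session, as in the docs above, yields 1432912733
print(unix_timestamp("2015-05-29 08:18:53"))
```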

http://git-wip-us.apache.org/repos/asf/drill/blob/73fe89d8/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md b/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
index cf76b5e..1ce7def 100644
--- a/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
+++ b/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
@@ -21,33 +21,77 @@ SUM(expression)| SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, DECIMAL, INTERVALDAY,
 
 AVG, COUNT, MIN, MAX, and SUM accept ALL and DISTINCT keywords. The default is ALL.
 
+These examples of aggregate functions use the `cp` storage plugin to access a JSON file installed with Drill. By default, Drill reads JSON numbers as double-precision floating-point numbers. These examples assume that the [all_text_mode]({{site.baseurl}}/docs/json-data-model/#handling-type-differences) option is set to its default value, false.
+
 ## AVG 
 
 Averages a column of all records in a data source. Averages a column of one or more groups of records. Which records to include in the calculation can be based on a condition.
 
-### Syntax
+### AVG Syntax
 
-    SELECT AVG(aggregate_expression)
+    SELECT AVG([ALL | DISTINCT] aggregate_expression)
     FROM tables
     WHERE conditions;
 
     SELECT expression1, expression2, ... expression_n,
-           AVG(aggregate_expression)
+           AVG([ALL | DISTINCT] aggregate_expression)
     FROM tables
     WHERE conditions
     GROUP BY expression1, expression2, ... expression_n;
 
-Expressions listed within the AVG function and must be included in the GROUP BY clause.
-
-### Examples
-
-    SELECT AVG(salary) FROM cp.`employee.json`;
-    +---------------------+
-    |       EXPR$0        |
-    +---------------------+
-    | 4019.6017316017314  |
-    +---------------------+
-    1 row selected (0.221 seconds)
+Expressions listed within the AVG function must be included in the GROUP BY clause. 
+
+### AVG Examples
+
+```
+ALTER SESSION SET `store.json.all_text_mode` = false;
++-------+------------------------------------+
+|  ok   |              summary               |
++-------+------------------------------------+
+| true  | store.json.all_text_mode updated.  |
++-------+------------------------------------+
+1 row selected (0.073 seconds)
+```
+
+Take a look at the salaries of employees having IDs 1139, 1140, and 1141. These are the salaries that subsequent examples will average and sum.
+
+```
+SELECT * FROM cp.`employee.json` WHERE employee_id IN (1139, 1140, 1141);
++--------------+------------------+-------------+------------+--------------+--------------------------+-----------+----------------+-------------+------------------------+-------------+----------------+------------------+-----------------+---------+-----------------------+
+| employee_id  |    full_name     | first_name  | last_name  | position_id  |      position_title      | store_id  | department_id  | birth_date  |       hire_date        |   salary    | supervisor_id  | education_level  | marital_status  | gender  |    management_role    |
++--------------+------------------+-------------+------------+--------------+--------------------------+-----------+----------------+-------------+------------------------+-------------+----------------+------------------+-----------------+---------+-----------------------+
+| 1139         | Jeanette Belsey  | Jeanette    | Belsey     | 12           | Store Assistant Manager  | 18        | 11             | 1972-05-12  | 1998-01-01 00:00:00.0  | 10000.0000  | 17             | Graduate Degree  | S               | M       | Store Management      |
+| 1140         | Mona Jaramillo   | Mona        | Jaramillo  | 13           | Store Shift Supervisor   | 18        | 11             | 1961-09-24  | 1998-01-01 00:00:00.0  | 8900.0000   | 1139           | Partial College  | S               | M       | Store Management      |
+| 1141         | James Compagno   | James       | Compagno   | 15           | Store Permanent Checker  | 18        | 15             | 1914-02-02  | 1998-01-01 00:00:00.0  | 6400.0000   | 1139           | Graduate Degree  | S               | M       | Store Full Time Staf  |
++--------------+------------------+-------------+------------+--------------+--------------------------+-----------+----------------+-------------+------------------------+-------------+----------------+------------------+-----------------+---------+-----------------------+
+3 rows selected (0.284 seconds)
+```
+
+```
+SELECT AVG(salary) FROM cp.`employee.json` WHERE employee_id IN (1139, 1140, 1141);
++--------------------+
+|       EXPR$0       |
++--------------------+
+| 8433.333333333334  |
++--------------------+
+1 row selected (0.208 seconds)
+
+SELECT AVG(ALL salary) FROM cp.`employee.json` WHERE employee_id IN (1139, 1140, 1141);
++--------------------+
+|       EXPR$0       |
++--------------------+
+| 8433.333333333334  |
++--------------------+
+1 row selected (0.17 seconds)
+
+SELECT AVG(DISTINCT salary) FROM cp.`employee.json`;
++---------------------+
+|       EXPR$0        |
++---------------------+
+| 12773.333333333334  |
++---------------------+
+1 row selected (0.384 seconds)
+```
 
     SELECT education_level, AVG(salary) FROM cp.`employee.json` GROUP BY education_level;
     +----------------------+---------------------+
@@ -61,86 +105,134 @@ Expressions listed within the AVG function and must be included in the GROUP BY
     +----------------------+---------------------+
     5 rows selected (0.495 seconds)
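
The arithmetic behind ALL versus DISTINCT can be spelled out in plain Python over the three salaries shown earlier (a sketch, not Drill's implementation); because the three values are already distinct, both forms give the same result here:

```python
# AVG(ALL ...) vs. AVG(DISTINCT ...), sketched in Python over the three
# salaries of employees 1139, 1140, and 1141 shown above.
salaries = [10000.0, 8900.0, 6400.0]

avg_all = sum(salaries) / len(salaries)        # AVG / AVG(ALL)
distinct = set(salaries)
avg_distinct = sum(distinct) / len(distinct)   # AVG(DISTINCT)

print(avg_all)  # matches Drill's 8433.333333333334
```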
 
-## COUNT, MIN, MAX, and SUM
+## COUNT
+Returns the number of rows that match the given criteria.
+
+### COUNT Syntax
+
+    SELECT COUNT([DISTINCT | ALL] column) FROM . . .
+    SELECT COUNT(*) FROM . . .
+
+* column  
+  Returns the number of values of the specified column.  
+* DISTINCT column  
+  Returns the number of distinct values in the column.  
+* ALL column  
+  Returns the number of values of the specified column.  
+* * (asterisk)  
+  Returns the number of records in the table.
+
+
+### COUNT Examples
+
+    SELECT COUNT(DISTINCT salary) FROM cp.`employee.json`;
+    +---------+
+    | EXPR$0  |
+    +---------+
+    | 48      |
+    +---------+
+    1 row selected (0.159 seconds)
+
+    SELECT COUNT(ALL salary) FROM cp.`employee.json`;
+    +---------+
+    | EXPR$0  |
+    +---------+
+    | 1155    |
+    +---------+
+    1 row selected (0.106 seconds)
+
+    SELECT COUNT(salary) FROM cp.`employee.json`;
+    +---------+
+    | EXPR$0  |
+    +---------+
+    | 1155    |
+    +---------+
+    1 row selected (0.102 seconds)
+
+    SELECT COUNT(*) FROM cp.`employee.json`;
+    +---------+
+    | EXPR$0  |
+    +---------+
+    | 1155    |
+    +---------+
+    1 row selected (0.174 seconds)
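
Standard SQL COUNT semantics can be sketched in Python over a hypothetical column that includes a NULL: COUNT(column) and COUNT(ALL column) skip NULLs, COUNT(*) counts every row, and COUNT(DISTINCT column) counts unique non-NULL values. (The salary column in `employee.json` has no NULLs, which is why COUNT(salary) and COUNT(*) both return 1155 above.)

```python
# COUNT semantics sketched in Python over a hypothetical column with a NULL.
column = [20.0, 20.0, 35.0, None]

count_column = sum(1 for v in column if v is not None)      # COUNT(salary)
count_star = len(column)                                    # COUNT(*)
count_distinct = len({v for v in column if v is not None})  # COUNT(DISTINCT salary)

print(count_column, count_star, count_distinct)  # 3 4 2
```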
+
+## MIN and MAX Functions
+These functions return the smallest and largest values of the selected columns, respectively.
+
+### MIN and MAX Syntax
+
+    MIN(column)  
+    MAX(column)
+
+### MIN and MAX Examples
+
+```
+SELECT MIN(salary) FROM cp.`employee.json`;
++---------+
+| EXPR$0  |
++---------+
+| 20.0    |
++---------+
+1 row selected (0.138 seconds)
+
+SELECT MAX(salary) FROM cp.`employee.json`;
++----------+
+|  EXPR$0  |
++----------+
+| 80000.0  |
++----------+
+1 row selected (0.139 seconds)
+```
+
+Use a correlated subquery to find the names and salaries of the lowest paid employees:
+
+```
+SELECT full_name, SALARY FROM cp.`employee.json` WHERE salary = (SELECT MIN(salary) FROM cp.`employee.json`);
++------------------------+---------+
+|       full_name        | SALARY  |
++------------------------+---------+
+| Leopoldo Renfro        | 20.0    |
+| Donna Brockett         | 20.0    |
+| Laurie Anderson        | 20.0    |
+. . .
+```
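
The correlated-subquery pattern reduces to a MIN followed by a filter. Here is the same logic in Python over a hypothetical employee list (the third name is invented for contrast):

```python
# MIN-plus-filter pattern from the correlated subquery above, sketched in
# Python over a hypothetical employee list.
employees = [
    {"full_name": "Leopoldo Renfro", "salary": 20.0},
    {"full_name": "Donna Brockett", "salary": 20.0},
    {"full_name": "Sam High", "salary": 80000.0},  # hypothetical
]

lowest = min(e["salary"] for e in employees)  # SELECT MIN(salary) ...
lowest_paid = [e["full_name"] for e in employees if e["salary"] == lowest]
print(lowest_paid)
```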
+
+## SUM Function
+Returns the total of a numeric column.
+
+### SUM Syntax
+
+    SUM(column)
 
 ### SUM Examples
 
-    SELECT a2 FROM t2;
-    +------------+
-    |     a2     |
-    +------------+
-    | 0          |
-    | 1          |
-    | 2          |
-    | 2          |
-    | 2          |
-    | 3          |
-    | 4          |
-    | 5          |
-    | 6          |
-    | 7          |
-    | 7          |
-    | 8          |
-    | 9          |
-    +------------+
-    13 rows selected (0.056 seconds)
-
-    SELECT AVG(ALL a2) FROM t2;
-    +--------------------+
-    |        EXPR$0      |
-    +--------------------+
-    | 4.3076923076923075 |
-    +--------------------+
-    1 row selected (0.084 seconds)
-
-    SELECT AVG(DISTINCT a2) FROM t2;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 4.5        |
-    +------------+
-    1 row selected (0.079 seconds)
-
-    SELECT SUM(ALL a2) FROM t2;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 56         |
-    +------------+
-    1 row selected (0.086 seconds)
-
-    SELECT SUM(DISTINCT a2) FROM t2;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 45         |
-    +------------+
-    1 row selected (0.078 seconds)
-
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 13         |
-    +------------+
-    1 row selected (0.056 seconds)
-
-    SELECT COUNT(ALL a2) FROM t2;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 13         |
-    +------------+
-    1 row selected (0.056 seconds)
-
-    SELECT COUNT(DISTINCT a2) FROM t2;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 10         |
-    +------------+
-    1 row selected (0.074 seconds)
-  
-  
+```
+SELECT SUM(ALL salary) FROM cp.`employee.json`;
++------------+
+|   EXPR$0   |
++------------+
+| 4642640.0  |
++------------+
+1 row selected (0.123 seconds)
+
+SELECT SUM(DISTINCT salary) FROM cp.`employee.json`;
++-----------+
+|  EXPR$0   |
++-----------+
+| 613120.0  |
++-----------+
+1 row selected (0.309 seconds)
+
+SELECT SUM(salary) FROM cp.`employee.json` WHERE employee_id IN (1139, 1140, 1141);
++----------+
+|  EXPR$0  |
++----------+
+| 25300.0  |
++----------+
+1 row selected (1.995 seconds)
+```
+
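SUM(ALL ...) versus SUM(DISTINCT ...) in plain Python (a sketch; the duplicate value is added hypothetically to show the difference). The distinct sum of these four values equals the 25300 that Drill returned above for employees 1139, 1140, and 1141:

```python
# SUM(ALL ...) vs. SUM(DISTINCT ...) sketched in Python over the three
# salaries above, plus one hypothetical duplicate for contrast.
salaries = [10000.0, 8900.0, 6400.0, 6400.0]

sum_all = sum(salaries)            # SUM / SUM(ALL): every value counted
sum_distinct = sum(set(salaries))  # SUM(DISTINCT): duplicates collapsed

print(sum_all, sum_distinct)  # 31700.0 25300.0
```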
 ## Aggregate Statistical Functions
 
 Drill provides following aggregate statistics functions:

