Return-Path:
X-Original-To: apmail-drill-commits-archive@www.apache.org
Delivered-To: apmail-drill-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E9F9E18160 for ; Mon, 23 Nov 2015 21:56:19 +0000 (UTC)
Received: (qmail 37174 invoked by uid 500); 23 Nov 2015 21:56:19 -0000
Delivered-To: apmail-drill-commits-archive@drill.apache.org
Received: (qmail 37135 invoked by uid 500); 23 Nov 2015 21:56:19 -0000
Mailing-List: contact commits-help@drill.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: commits@drill.apache.org
Delivered-To: mailing list commits@drill.apache.org
Received: (qmail 37121 invoked by uid 99); 23 Nov 2015 21:56:19 -0000
Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 23 Nov 2015 21:56:19 +0000
Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 981CBE0A9B; Mon, 23 Nov 2015 21:56:19 +0000 (UTC)
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
From: tshiran@apache.org
To: commits@drill.apache.org
Date: Mon, 23 Nov 2015 21:56:20 -0000
Message-Id:
In-Reply-To: <5cfdb68ff4f6416fba635468216eb769@git.apache.org>
References: <5cfdb68ff4f6416fba635468216eb769@git.apache.org>
X-Mailer: ASF-Git Admin Mailer
Subject: [02/15] drill-site git commit: 1.3 website update

http://git-wip-us.apache.org/repos/asf/drill-site/blob/9090a50b/docs/sql-window-functions/index.html
----------------------------------------------------------------------
diff --git a/docs/sql-window-functions/index.html b/docs/sql-window-functions/index.html
index 028212f..ba53a11 100644
--- a/docs/sql-window-functions/index.html
+++ b/docs/sql-window-functions/index.html
@@ -361,7 +361,7 @@
-
  • JDBC Storage Plugin
  • +
  • RDBMS Storage Plugin
  • @@ -372,6 +372,10 @@
  • MapR-DB Format
  • + +
  • S3 Storage Plugin
  • + + @@ -486,6 +490,8 @@
  • Querying Directories
  • +
  • Querying Sequence Files
  • + @@ -817,6 +823,10 @@
  • Text Files: CSV, TSV, PSV
  • + +
  • Sequence Files
  • + + @@ -864,6 +874,10 @@
      +
    • REST API
    • + + +
    • Develop Drill
      • @@ -910,6 +924,10 @@
          +
        • Apache Drill 1.3.0 Release Notes
        • + + +
        • Apache Drill 1.2.0 Release Notes
        • @@ -1021,15 +1039,18 @@

          Select Data from Particular Columns

          -

          Converting text files to another format, such as Parquet, using the CTAS command and a SELECT * statement is not recommended. Instead, select data from particular columns using the COLUMN[n] syntax, and then assign meaningful column -names using aliases. For example:

          +

          Converting text files to another format, such as Parquet, using the CTAS command and a SELECT * statement is not recommended. Instead, select data from particular columns. If your text file has no headers, use the COLUMNS[n] syntax, and then assign meaningful column names using aliases. For example:

          CREATE TABLE parquet_users AS SELECT CAST(COLUMNS[0] AS INT) AS user_id,
           COLUMNS[1] AS username, CAST(COLUMNS[2] AS TIMESTAMP) AS registration_date
           FROM `users.csv1`;
           
          -

          You need to select particular columns instead of using SELECT * for performance reasons. Drill reads CSV, TSV, and PSV files into a list of -VARCHARS, rather than individual columns. While parquet supports and Drill reads lists, as of this release of Drill, the read path for complex data is not optimized.

          +

          You need to select particular columns instead of using SELECT * for performance reasons. Drill reads CSV, TSV, and PSV files into a list of VARCHAR values, rather than individual columns. Although Parquet supports lists and Drill can read them, as of this release of Drill, the read path for complex data is not optimized.

          +

          If your text file has headers, you can enable extractHeader and select particular columns by name. For example:

          +
          CREATE TABLE parquet_users AS SELECT CAST(user_id AS INT) AS user_id,
          +username, CAST(registration_date AS TIMESTAMP) AS registration_date
          +FROM `users.csv1`;
          +

          Cast data

          You can also improve performance by casting the VARCHAR data to INT, FLOAT, DATETIME, and so on when you read the data from a text file. Drill performs better reading fixed-width than reading VARCHAR data.

          @@ -1048,6 +1069,7 @@ VARCHARS, rather than individual columns. While parquet supports and Drill reads
        • delimiter
        • quote
        • skipFirstLine
        • +
        • extractHeader

        Set the sys.options property setting exec.storage.enable_new_text_reader to true (the default) before attempting to use these attributes.
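
        A minimal sketch of checking and setting that option from the Drill shell. It assumes the standard ALTER SYSTEM syntax and uses SELECT * so that it does not depend on the exact sys.options column layout of your Drill version:

```sql
-- Inspect the current value of the option named above.
SELECT * FROM sys.options
WHERE name = 'exec.storage.enable_new_text_reader';

-- Enable the new text reader for all sessions (true is already the default).
ALTER SYSTEM SET `exec.storage.enable_new_text_reader` = true;
```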

        @@ -1072,8 +1094,16 @@ VARCHARS, rather than individual columns. While parquet supports and Drill reads

        The examples in this section show the results of querying CSV files that use and do not use a header, include comments, and use an escape character:

        -

        Using a Header in a File

        - +

        Not Using a Header in a File

        +
        "csv": {
        +  "type": "text",
        +  "extensions": [
        +    "csv2"
        +  ],
        +  "skipFirstLine": true,
        +  "delimiter": ","
        +},
        +

        CSV with header

        0: jdbc:drill:zk=local> SELECT * FROM dfs.`/tmp/csv_with_header.csv2`;
         +------------------------+
        @@ -1087,9 +1117,45 @@ VARCHARS, rather than individual columns. While parquet supports and Drill reads
         | ["hello","1","2","3"]  |
         | ["hello","1","2","3"]  |
         +------------------------+
        +7 rows selected (0.112 seconds)
        +
        +

        Using a Header in a File

        +
        "csv": {
        +  "type": "text",
        +  "extensions": [
        +    "csv2"
        +  ],
        +  "skipFirstLine": false,
        +  "extractHeader": true,
        +  "delimiter": ","
        +},
        +
        +

        CSV with header

        +
        0: jdbc:drill:zk=local> SELECT * FROM dfs.`/tmp/csv_with_header.csv2`;
        ++-------+------+------+------+
        +| name  | num1 | num2 | num3 |
        ++-------+------+------+------+
        +| hello |   1  |   2  |   3  |
        +| hello |   1  |   2  |   3  |
        +| hello |   1  |   2  |   3  |
        +| hello |   1  |   2  |   3  |
        +| hello |   1  |   2  |   3  |
        +| hello |   1  |   2  |   3  |
        +| hello |   1  |   2  |   3  |
        ++-------+------+------+------+
        +7 rows selected (0.12 seconds)
        +
        +
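
        With extractHeader enabled as in the configuration above, the header fields become column names, so a query can reference them directly. The following is a sketch only, assuming the same /tmp/csv_with_header.csv2 file and the name, num1, num2, and num3 headers shown in the output above. The alias num_sum is illustrative, and the casts assume the values are read as VARCHAR, as with other text-file data:

```sql
-- Select columns by header name instead of COLUMNS[n].
SELECT name,
       CAST(num1 AS INT) + CAST(num2 AS INT) AS num_sum
FROM dfs.`/tmp/csv_with_header.csv2`
WHERE CAST(num3 AS INT) = 3;
```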

        File with no Header

        +
        "csv": {
        +  "type": "text",
        +  "extensions": [
        +    "csv"
        +  ],
        +  "skipFirstLine": false,
        +  "extractHeader": false,
        +  "delimiter": ","
        +},
         
        -

        Not Using a Header in a File

        -

        CSV no header

        0: jdbc:drill:zk=local> SELECT * FROM dfs.`/tmp/csv_no_header.csv`;
         +------------------------+
        @@ -1166,7 +1232,8 @@ VARCHARS, rather than individual columns. While parquet supports and Drill reads
             "csv"
           ],
           "comment": "&",
        -  "skipFirstLine": true,
        +  "skipFirstLine": false,
        +  "extractHeader": true,
           "delimiter": ","
         },
         
        @@ -1186,7 +1253,8 @@ VARCHARS, rather than individual columns. While parquet supports and Drill reads
             "csv2"
           ],
           "comment": "&",
        -  "skipFirstLine": true,
        +  "skipFirstLine": false,
        +  "extractHeader": true,
           "delimiter": ","
         },
        @@ -1259,7 +1327,7 @@ FROM dfs.tmp.`/stats/airport_data/*`
        @@ -1285,6 +1353,6 @@ m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
         ga('create', 'UA-53379651-1', 'auto');
         ga('send', 'pageview');
        -
        +
http://git-wip-us.apache.org/repos/asf/drill-site/blob/9090a50b/docs/troubleshooting/index.html
----------------------------------------------------------------------
diff --git a/docs/troubleshooting/index.html b/docs/troubleshooting/index.html
index eeaa3fe..53564af 100644
--- a/docs/troubleshooting/index.html
+++ b/docs/troubleshooting/index.html
@@ -361,7 +361,7 @@
-
      • JDBC Storage Plugin
      • +
      • RDBMS Storage Plugin
      • @@ -372,6 +372,10 @@
      • MapR-DB Format
      • + +
      • S3 Storage Plugin
      • + +
      @@ -486,6 +490,8 @@
    • Querying Directories
    • +
    • Querying Sequence Files
    • +
    @@ -817,6 +823,10 @@
  • Text Files: CSV, TSV, PSV
  • + +
  • Sequence Files
  • + + @@ -864,6 +874,10 @@
      +
    • REST API
    • + + +
    • Develop Drill
      • @@ -910,6 +924,10 @@
          +
        • Apache Drill 1.3.0 Release Notes
        • + + +
        • Apache Drill 1.2.0 Release Notes
        • @@ -1167,7 +1185,7 @@

          Even to a seasoned Java developer, the eval() method might look a bit strange because Drill generates the final code on the fly to fulfill a query request. This technique leverages Java’s just-in-time (JIT) compiler for maximum speed.

          -Basic Coding Rules +

          Basic Coding Rules

          To leverage Java’s just-in-time (JIT) compiler for maximum speed, you need to adhere to some basic rules.

          @@ -1201,6 +1219,10 @@ Basic Coding Rules </executions> </plugin> +

          Add a drill-module.conf File to Resources

          + +

          Add a drill-module.conf file in the resources folder of your project. The presence of this file tells Drill that your jar contains a custom function. If you have no specific configuration to set for your function, you can keep this file empty.

          +

          Build and Deploy the Function

          Build the function using mvn package:

          @@ -1218,10 +1240,6 @@ Basic Coding Rules

          <Drill installation directory>/jars/3rdparty

          -

          Add a drill-module.conf File to Resources

          - -

          Add a drill-module.conf file in the resources folder of your project. The presence of this file tells Drill that your jar contains a custom function. If you have no specific configuration to set for your function, you can keep this file empty.

          -

          Test the New Function

          Restart Drill and run the following query on the employee.json file installed with Drill:

          @@ -1267,6 +1285,6 @@ m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
           ga('create', 'UA-53379651-1', 'auto');
           ga('send', 'pageview');
          -
          +
http://git-wip-us.apache.org/repos/asf/drill-site/blob/9090a50b/docs/tutorials-introduction/index.html
----------------------------------------------------------------------
diff --git a/docs/tutorials-introduction/index.html b/docs/tutorials-introduction/index.html
index 995fd9e..99f6ded 100644
--- a/docs/tutorials-introduction/index.html
+++ b/docs/tutorials-introduction/index.html
@@ -361,7 +361,7 @@
-
        • JDBC Storage Plugin
        • +
        • RDBMS Storage Plugin
        • @@ -372,6 +372,10 @@
        • MapR-DB Format
        • + +
        • S3 Storage Plugin
        • + +
        @@ -486,6 +490,8 @@
      • Querying Directories
      • +
      • Querying Sequence Files
      • +
      @@ -817,6 +823,10 @@
    • Text Files: CSV, TSV, PSV
    • + +
    • Sequence Files
    • + +
    @@ -864,6 +874,10 @@