impala-commits mailing list archives

From jbap...@apache.org
Subject [34/51] [partial] incubator-impala git commit: IMPALA-3398: Add docs to main Impala branch.
Date Thu, 17 Nov 2016 23:12:12 GMT
http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_describe.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_describe.xml b/docs/topics/impala_describe.xml
new file mode 100644
index 0000000..c4070c3
--- /dev/null
+++ b/docs/topics/impala_describe.xml
@@ -0,0 +1,689 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="describe">
+
+  <title id="desc">DESCRIBE Statement</title>
+  <titlealts audience="PDF"><navtitle>DESCRIBE</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Data Types"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+      <data name="Category" value="Tables"/>
+      <data name="Category" value="Reports"/>
+      <data name="Category" value="Schemas"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">DESCRIBE statement</indexterm>
+      The <codeph>DESCRIBE</codeph> statement displays metadata about a table, such as the column names and their
+      data types.
+      <ph rev="2.3.0">In <keyword keyref="impala23_full"/> and higher, you can specify the name of a complex type column, which takes
+      the form of a dotted path. The path might include multiple components in the case of a nested type definition.</ph>
+      <ph rev="2.5.0">In <keyword keyref="impala25_full"/> and higher, the <codeph>DESCRIBE DATABASE</codeph> form can display
+      information about a database.</ph>
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+<codeblock rev="2.5.0">DESCRIBE [DATABASE] [FORMATTED|EXTENDED] <varname>object_name</varname>
+
+object_name ::=
+    [<varname>db_name</varname>.]<varname>table_name</varname>[.<varname>complex_col_name</varname> ...]
+  | <varname>db_name</varname>
+</codeblock>
+
+    <p>
+      You can use the abbreviation <codeph>DESC</codeph> for the <codeph>DESCRIBE</codeph> statement.
+    </p>
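+    <p>
+      For example, the following two statements are equivalent (the table name
+      <codeph>my_table</codeph> is illustrative):
+    </p>
+
+<codeblock>describe my_table;
+desc my_table;
+</codeblock>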
+
+    <p rev="1.1">
+      The <codeph>DESCRIBE FORMATTED</codeph> variation displays additional information, in a format familiar to
+      users of Apache Hive. The extra information includes low-level details such as whether the table is internal
+      or external, when it was created, the file format, the location of the data in HDFS, whether the object is a
+      table or a view, and (for views) the text of the query from the view definition.
+    </p>
+
+    <note>
+      The <codeph>Compressed</codeph> field is not a reliable indicator of whether the table contains compressed
+      data. It typically shows <codeph>No</codeph>, because the compression settings apply only during the
+      session that loads data and are not stored persistently with the table metadata.
+    </note>
+
+<p rev="2.5.0 IMPALA-2196">
+  <b>Describing databases:</b>
+</p>
+
+<p rev="2.5.0">
+  By default, the <codeph>DESCRIBE</codeph> output for a database includes the location
+  and the comment, which can be set by the <codeph>LOCATION</codeph> and <codeph>COMMENT</codeph>
+  clauses on the <codeph>CREATE DATABASE</codeph> statement.
+</p>
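+<p rev="2.5.0">
+  For example, a database created with explicit <codeph>COMMENT</codeph> and <codeph>LOCATION</codeph>
+  clauses reports those values in its <codeph>DESCRIBE</codeph> output. (The database name and path
+  below are illustrative.)
+</p>
+
+<codeblock rev="2.5.0">-- Illustrative: the comment and location set here later appear
+-- in the comment and location columns of the DESCRIBE DATABASE output.
+create database sales_db comment 'Sales data' location '/user/impala/sales_db';
+describe database sales_db;
+</codeblock>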
+
+<p rev="2.5.0">
+  The additional information displayed by the <codeph>FORMATTED</codeph> or <codeph>EXTENDED</codeph>
+  keyword includes the HDFS user ID that is considered the owner of the database, and any
+  optional database properties. The properties can be specified by the <codeph>WITH DBPROPERTIES</codeph>
+  clause if the database is created using a Hive <codeph>CREATE DATABASE</codeph> statement.
+  Impala currently does not set these properties or perform any special processing based on them.
+</p>
+
+<p rev="2.5.0">
+The following examples show the variations in syntax and output for
+describing databases. This feature is available in <keyword keyref="impala25_full"/>
+and higher.
+</p>
+
+<codeblock rev="2.5.0">
+describe database default;
++---------+----------------------+-----------------------+
+| name    | location             | comment               |
++---------+----------------------+-----------------------+
+| default | /user/hive/warehouse | Default Hive database |
++---------+----------------------+-----------------------+
+
+describe database formatted default;
++---------+----------------------+-----------------------+
+| name    | location             | comment               |
++---------+----------------------+-----------------------+
+| default | /user/hive/warehouse | Default Hive database |
+| Owner:  |                      |                       |
+|         | public               | ROLE                  |
++---------+----------------------+-----------------------+
+
+describe database extended default;
++---------+----------------------+-----------------------+
+| name    | location             | comment               |
++---------+----------------------+-----------------------+
+| default | /user/hive/warehouse | Default Hive database |
+| Owner:  |                      |                       |
+|         | public               | ROLE                  |
++---------+----------------------+-----------------------+
+</codeblock>
+
+<p>
+  <b>Describing tables:</b>
+</p>
+
+<p>
+  If the <codeph>DATABASE</codeph> keyword is omitted, the default
+  for the <codeph>DESCRIBE</codeph> statement is to refer to a table.
+</p>
+
+<codeblock>
+-- By default, the table is assumed to be in the current database.
+describe my_table;
++------+--------+---------+
+| name | type   | comment |
++------+--------+---------+
+| x    | int    |         |
+| s    | string |         |
++------+--------+---------+
+
+-- Use a fully qualified table name to specify a table in any database.
+describe my_database.my_table;
++------+--------+---------+
+| name | type   | comment |
++------+--------+---------+
+| x    | int    |         |
+| s    | string |         |
++------+--------+---------+
+
+-- The formatted or extended output includes additional useful information.
+-- The LOCATION field is especially useful to know for DDL statements and HDFS commands
+-- during ETL jobs. (The LOCATION includes a full hdfs:// URL, omitted here for readability.)
+describe formatted my_table;
++------------------------------+----------------------------------------------+----------------------+
+| name                         | type                                         | comment              |
++------------------------------+----------------------------------------------+----------------------+
+| # col_name                   | data_type                                    | comment              |
+|                              | NULL                                         | NULL                 |
+| x                            | int                                          | NULL                 |
+| s                            | string                                       | NULL                 |
+|                              | NULL                                         | NULL                 |
+| # Detailed Table Information | NULL                                         | NULL                 |
+| Database:                    | my_database                                  | NULL                 |
+| Owner:                       | jrussell                                     | NULL                 |
+| CreateTime:                  | Fri Mar 18 15:58:00 PDT 2016                 | NULL                 |
+| LastAccessTime:              | UNKNOWN                                      | NULL                 |
+| Protect Mode:                | None                                         | NULL                 |
+| Retention:                   | 0                                            | NULL                 |
+| Location:                    | /user/hive/warehouse/my_database.db/my_table | NULL                 |
+| Table Type:                  | MANAGED_TABLE                                | NULL                 |
+| Table Parameters:            | NULL                                         | NULL                 |
+|                              | transient_lastDdlTime                        | 1458341880           |
+|                              | NULL                                         | NULL                 |
+| # Storage Information        | NULL                                         | NULL                 |
+| SerDe Library:               | org. ... .LazySimpleSerDe                    | NULL                 |
+| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat     | NULL                 |
+| OutputFormat:                | org. ... .HiveIgnoreKeyTextOutputFormat      | NULL                 |
+| Compressed:                  | No                                           | NULL                 |
+| Num Buckets:                 | 0                                            | NULL                 |
+| Bucket Columns:              | []                                           | NULL                 |
+| Sort Columns:                | []                                           | NULL                 |
++------------------------------+----------------------------------------------+----------------------+
+</codeblock>
+
+    <p conref="../shared/impala_common.xml#common/complex_types_blurb"/>
+
+    <p rev="2.3.0">
+      Because the column definitions for complex types can become long, particularly when such types are nested,
+      the <codeph>DESCRIBE</codeph> statement uses special formatting for complex type columns to make the output readable.
+    </p>
+
+    <p rev="2.3.0">
+      For the <codeph>ARRAY</codeph>, <codeph>STRUCT</codeph>, and <codeph>MAP</codeph> types available in
+      <keyword keyref="impala23_full"/> and higher, the <codeph>DESCRIBE</codeph> output is formatted to avoid
+      excessively long lines for multiple fields within a <codeph>STRUCT</codeph>, or a nested sequence of
+      complex types.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/complex_types_describe"/>
+
+    <p rev="2.3.0">
+      For example, here is the <codeph>DESCRIBE</codeph> output for a table containing a single top-level column
+      of each complex type:
+    </p>
+
+<codeblock rev="2.3.0"><![CDATA[create table t1 (x int, a array<int>, s struct<f1: string, f2: bigint>, m map<string,int>) stored as parquet;
+
+describe t1;
++------+-----------------+---------+
+| name | type            | comment |
++------+-----------------+---------+
+| x    | int             |         |
+| a    | array<int>      |         |
+| s    | struct<         |         |
+|      |   f1:string,    |         |
+|      |   f2:bigint     |         |
+|      | >               |         |
+| m    | map<string,int> |         |
++------+-----------------+---------+
+]]>
+</codeblock>
+
+    <p rev="2.3.0">
+      Here are examples showing how to <q>drill down</q> into the layouts of complex types, including
+      using multi-part names to examine the definitions of nested types.
+      The <codeph>&lt; &gt;</codeph> delimiters identify the columns with complex types;
+      these are the columns where you can descend another level to see the parts that make up
+      the complex type.
+      This technique helps you to understand the multi-part names you use as table references in queries
+      involving complex types, and the corresponding column names you refer to in the <codeph>SELECT</codeph> list.
+      These tables are from the <q>nested TPC-H</q> schema, shown in detail in
+      <xref href="impala_complex_types.xml#complex_sample_schema"/>.
+    </p>
+
+    <p>
+      The <codeph>REGION</codeph> table contains an <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph>
+      elements:
+    </p>
+
+    <ul>
+      <li>
+        <p>
+          The first <codeph>DESCRIBE</codeph> specifies the table name, to display the definition
+          of each top-level column.
+        </p>
+      </li>
+      <li>
+        <p>
+          The second <codeph>DESCRIBE</codeph> specifies the name of a complex
+          column, <codeph>REGION.R_NATIONS</codeph>, showing that when you include the name of an <codeph>ARRAY</codeph>
+          column in a <codeph>FROM</codeph> clause, that table reference acts like a two-column table with
+          columns <codeph>ITEM</codeph> and <codeph>POS</codeph>.
+        </p>
+      </li>
+      <li>
+        <p>
+          The final <codeph>DESCRIBE</codeph> specifies the fully qualified name of the <codeph>ITEM</codeph> field,
+          to display the layout of its underlying <codeph>STRUCT</codeph> type in table format, with the fields
+          mapped to column names.
+        </p>
+      </li>
+    </ul>
+
+<codeblock rev="2.3.0"><![CDATA[
+-- #1: The overall layout of the entire table.
+describe region;
++-------------+-------------------------+---------+
+| name        | type                    | comment |
++-------------+-------------------------+---------+
+| r_regionkey | smallint                |         |
+| r_name      | string                  |         |
+| r_comment   | string                  |         |
+| r_nations   | array<struct<           |         |
+|             |   n_nationkey:smallint, |         |
+|             |   n_name:string,        |         |
+|             |   n_comment:string      |         |
+|             | >>                      |         |
++-------------+-------------------------+---------+
+
+-- #2: The ARRAY column within the table.
+describe region.r_nations;
++------+-------------------------+---------+
+| name | type                    | comment |
++------+-------------------------+---------+
+| item | struct<                 |         |
+|      |   n_nationkey:smallint, |         |
+|      |   n_name:string,        |         |
+|      |   n_comment:string      |         |
+|      | >                       |         |
+| pos  | bigint                  |         |
++------+-------------------------+---------+
+
+-- #3: The STRUCT that makes up each ARRAY element.
+--     The fields of the STRUCT act like columns of a table.
+describe region.r_nations.item;
++-------------+----------+---------+
+| name        | type     | comment |
++-------------+----------+---------+
+| n_nationkey | smallint |         |
+| n_name      | string   |         |
+| n_comment   | string   |         |
++-------------+----------+---------+
+]]>
+</codeblock>
+
+    <p>
+      The <codeph>CUSTOMER</codeph> table contains an <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph>
+      elements, where one field in the <codeph>STRUCT</codeph> is another <codeph>ARRAY</codeph> of
+      <codeph>STRUCT</codeph> elements:
+    </p>
+    <ul>
+      <li>
+        <p>
+          Again, the initial <codeph>DESCRIBE</codeph> specifies only the table name.
+        </p>
+      </li>
+      <li>
+        <p>
+          The second <codeph>DESCRIBE</codeph> specifies the qualified name of the complex
+          column, <codeph>CUSTOMER.C_ORDERS</codeph>, showing how an <codeph>ARRAY</codeph>
+          is represented as a two-column table with columns <codeph>ITEM</codeph> and <codeph>POS</codeph>.
+        </p>
+      </li>
+      <li>
+        <p>
+          The third <codeph>DESCRIBE</codeph> specifies the qualified name of the <codeph>ITEM</codeph>
+          of the <codeph>ARRAY</codeph> column, to see the structure of the nested <codeph>ARRAY</codeph>.
+          Again, it has two parts, <codeph>ITEM</codeph> and <codeph>POS</codeph>. Because the
+          <codeph>ARRAY</codeph> contains a <codeph>STRUCT</codeph>, the layout of the <codeph>STRUCT</codeph>
+          is shown.
+        </p>
+      </li>
+      <li>
+        <p>
+          The fourth and fifth <codeph>DESCRIBE</codeph> statements drill down into a <codeph>STRUCT</codeph> field that
+          is itself a complex type, an <codeph>ARRAY</codeph> of <codeph>STRUCT</codeph>.
+          The <codeph>ITEM</codeph> portion of the qualified name is required only when the <codeph>ARRAY</codeph>
+          elements are anonymous. The fields of the <codeph>STRUCT</codeph> give names to any other complex types
+          nested inside the <codeph>STRUCT</codeph>. Therefore, the <codeph>DESCRIBE</codeph> parameters
+          <codeph>CUSTOMER.C_ORDERS.ITEM.O_LINEITEMS</codeph> and <codeph>CUSTOMER.C_ORDERS.O_LINEITEMS</codeph>
+          are equivalent. (For brevity, leave out the <codeph>ITEM</codeph> portion of
+          a qualified name when it is not required.)
+        </p>
+      </li>
+      <li>
+        <p>
+          The final <codeph>DESCRIBE</codeph> shows the layout of the deeply nested <codeph>STRUCT</codeph> type.
+          Because there are no more complex types nested inside this <codeph>STRUCT</codeph>, this is as far
+          as you can drill down into the layout for this table.
+        </p>
+      </li>
+    </ul>
+
+<codeblock rev="2.3.0"><![CDATA[-- #1: The overall layout of the entire table.
+describe customer;
++--------------+------------------------------------+
+| name         | type                               |
++--------------+------------------------------------+
+| c_custkey    | bigint                             |
+... more scalar columns ...
+| c_orders     | array<struct<                      |
+|              |   o_orderkey:bigint,               |
+|              |   o_orderstatus:string,            |
+|              |   o_totalprice:decimal(12,2),      |
+|              |   o_orderdate:string,              |
+|              |   o_orderpriority:string,          |
+|              |   o_clerk:string,                  |
+|              |   o_shippriority:int,              |
+|              |   o_comment:string,                |
+|              |   o_lineitems:array<struct<        |
+|              |     l_partkey:bigint,              |
+|              |     l_suppkey:bigint,              |
+|              |     l_linenumber:int,              |
+|              |     l_quantity:decimal(12,2),      |
+|              |     l_extendedprice:decimal(12,2), |
+|              |     l_discount:decimal(12,2),      |
+|              |     l_tax:decimal(12,2),           |
+|              |     l_returnflag:string,           |
+|              |     l_linestatus:string,           |
+|              |     l_shipdate:string,             |
+|              |     l_commitdate:string,           |
+|              |     l_receiptdate:string,          |
+|              |     l_shipinstruct:string,         |
+|              |     l_shipmode:string,             |
+|              |     l_comment:string               |
+|              |   >>                               |
+|              | >>                                 |
++--------------+------------------------------------+
+
+-- #2: The ARRAY column within the table.
+describe customer.c_orders;
++------+------------------------------------+
+| name | type                               |
++------+------------------------------------+
+| item | struct<                            |
+|      |   o_orderkey:bigint,               |
+|      |   o_orderstatus:string,            |
+... more struct fields ...
+|      |   o_lineitems:array<struct<        |
+|      |     l_partkey:bigint,              |
+|      |     l_suppkey:bigint,              |
+... more nested struct fields ...
+|      |     l_comment:string               |
+|      |   >>                               |
+|      | >                                  |
+| pos  | bigint                             |
++------+------------------------------------+
+
+-- #3: The STRUCT that makes up each ARRAY element.
+--     The fields of the STRUCT act like columns of a table.
+describe customer.c_orders.item;
++-----------------+----------------------------------+
+| name            | type                             |
++-----------------+----------------------------------+
+| o_orderkey      | bigint                           |
+| o_orderstatus   | string                           |
+| o_totalprice    | decimal(12,2)                    |
+| o_orderdate     | string                           |
+| o_orderpriority | string                           |
+| o_clerk         | string                           |
+| o_shippriority  | int                              |
+| o_comment       | string                           |
+| o_lineitems     | array<struct<                    |
+|                 |   l_partkey:bigint,              |
+|                 |   l_suppkey:bigint,              |
+... more struct fields ...
+|                 |   l_comment:string               |
+|                 | >>                               |
++-----------------+----------------------------------+
+
+-- #4: The ARRAY nested inside the STRUCT elements of the first ARRAY.
+describe customer.c_orders.item.o_lineitems;
++------+----------------------------------+
+| name | type                             |
++------+----------------------------------+
+| item | struct<                          |
+|      |   l_partkey:bigint,              |
+|      |   l_suppkey:bigint,              |
+... more struct fields ...
+|      |   l_comment:string               |
+|      | >                                |
+| pos  | bigint                           |
++------+----------------------------------+
+
+-- #5: Shorter form of the previous DESCRIBE. Omits the .ITEM portion of the name
+--     because O_LINEITEMS and other field names provide a way to refer to things
+--     inside the ARRAY element.
+describe customer.c_orders.o_lineitems;
++------+----------------------------------+
+| name | type                             |
++------+----------------------------------+
+| item | struct<                          |
+|      |   l_partkey:bigint,              |
+|      |   l_suppkey:bigint,              |
+... more struct fields ...
+|      |   l_comment:string               |
+|      | >                                |
+| pos  | bigint                           |
++------+----------------------------------+
+
+-- #6: The STRUCT representing ARRAY elements nested inside
+--     another ARRAY of STRUCTs. The lack of any complex types
+--     in this output means this is as far as DESCRIBE can
+--     descend into the table layout.
+describe customer.c_orders.o_lineitems.item;
++-----------------+---------------+
+| name            | type          |
++-----------------+---------------+
+| l_partkey       | bigint        |
+| l_suppkey       | bigint        |
+... more scalar columns ...
+| l_comment       | string        |
++-----------------+---------------+
+]]>
+</codeblock>
+
+<p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+<p>
+  After the <cmdname>impalad</cmdname> daemons are restarted, the first query against a table can take longer
+  than subsequent queries, because the metadata for the table is loaded before the query is processed. This
+  one-time delay for each table can cause misleading results in benchmark tests or cause unnecessary concern.
+  To <q>warm up</q> the Impala metadata cache, you can issue a <codeph>DESCRIBE</codeph> statement in advance
+  for each table you intend to access later.
+</p>
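+<p>
+  For example, a warm-up script might issue a <codeph>DESCRIBE</codeph> for each table before the
+  real workload begins (the table names below are illustrative):
+</p>
+
+<codeblock>-- Run once after the impalad daemons restart, before benchmarking,
+-- so that table metadata is already loaded when the real queries arrive.
+describe sales_fact;
+describe customer_dim;
+describe product_dim;
+</codeblock>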
+
+<p>
+  When you are dealing with data files stored in HDFS, sometimes it is important to know details such as the
+  path of the data files for an Impala table, and the hostname for the namenode. You can get this information
+  from the <codeph>DESCRIBE FORMATTED</codeph> output. You specify HDFS URIs or path specifications with
+  statements such as <codeph>LOAD DATA</codeph> and the <codeph>LOCATION</codeph> clause of <codeph>CREATE
+  TABLE</codeph> or <codeph>ALTER TABLE</codeph>. You might also use HDFS URIs or paths with Linux commands
+  such as <cmdname>hadoop</cmdname> and <cmdname>hdfs</cmdname> to copy, rename, or otherwise manipulate data files in HDFS.
+</p>
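+<p>
+  For example, after finding the <codeph>Location:</codeph> value in the
+  <codeph>DESCRIBE FORMATTED</codeph> output, you might use that path in a
+  <codeph>LOAD DATA</codeph> statement (the path and table name below are illustrative):
+</p>
+
+<codeblock>-- Illustrative: moving staged data files into the HDFS directory
+-- reported by DESCRIBE FORMATTED for the destination table.
+load data inpath '/user/impala/staging/my_table'
+  into table my_database.my_table;
+</codeblock>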
+
+<p conref="../shared/impala_common.xml#common/sync_ddl_blurb"/>
+
+<p rev="1.2.1">
+  Each table can also have associated table statistics and column statistics. To see these categories of
+  information, use the <codeph>SHOW TABLE STATS <varname>table_name</varname></codeph> and <codeph>SHOW COLUMN
+  STATS <varname>table_name</varname></codeph> statements.
+<!--
+For example, the table statistics can often show you the number
+and total size of the files in the table, even if you have not
+run <codeph>COMPUTE STATS</codeph>.
+-->
+  See <xref href="impala_show.xml#show"/> for details.
+</p>
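+<p rev="1.2.1">
+  For example (the table name is illustrative):
+</p>
+
+<codeblock rev="1.2.1">show table stats my_table;
+show column stats my_table;
+</codeblock>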
+
+<note conref="../shared/impala_common.xml#common/compute_stats_next"/>
+
+<p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<p>
+  The following example shows the results of both a standard <codeph>DESCRIBE</codeph> and <codeph>DESCRIBE
+  FORMATTED</codeph> for different kinds of schema objects:
+</p>
+
+  <ul>
+    <li>
+      <codeph>DESCRIBE</codeph> for a table or a view returns the name, type, and comment for each of the
+      columns. For a view, if the column value is computed by an expression, the column name is automatically
+      generated as <codeph>_c0</codeph>, <codeph>_c1</codeph>, and so on depending on the ordinal number of the
+      column.
+    </li>
+
+    <li>
+      A table created with no special format or storage clauses is designated as a <codeph>MANAGED_TABLE</codeph>
+      (an <q>internal table</q> in Impala terminology). Its data files are stored in an HDFS directory under the
+      default Hive data directory. By default, it uses the Text data format.
+    </li>
+
+    <li>
+      A view is designated as <codeph>VIRTUAL_VIEW</codeph> in <codeph>DESCRIBE FORMATTED</codeph> output. Some
+      of its properties are <codeph>NULL</codeph> or blank because they are inherited from the base table. The
+      text of the query that defines the view is part of the <codeph>DESCRIBE FORMATTED</codeph> output.
+    </li>
+
+    <li>
+      A table with additional clauses in the <codeph>CREATE TABLE</codeph> statement has differences in
+      <codeph>DESCRIBE FORMATTED</codeph> output. The output for <codeph>T2</codeph> includes the
+      <codeph>EXTERNAL_TABLE</codeph> keyword because of the <codeph>CREATE EXTERNAL TABLE</codeph> syntax, and
+      different <codeph>InputFormat</codeph> and <codeph>OutputFormat</codeph> fields to reflect the Parquet file
+      format.
+    </li>
+  </ul>
+
+<codeblock>[localhost:21000] &gt; create table t1 (x int, y int, s string);
+Query: create table t1 (x int, y int, s string)
+[localhost:21000] &gt; describe t1;
+Query: describe t1
+Query finished, fetching results ...
++------+--------+---------+
+| name | type   | comment |
++------+--------+---------+
+| x    | int    |         |
+| y    | int    |         |
+| s    | string |         |
++------+--------+---------+
+Returned 3 row(s) in 0.13s
+[localhost:21000] &gt; describe formatted t1;
+Query: describe formatted t1
+Query finished, fetching results ...
++------------------------------+--------------------------------------------+------------+
+| name                         | type                                       | comment    |
++------------------------------+--------------------------------------------+------------+
+| # col_name                   | data_type                                  | comment    |
+|                              | NULL                                       | NULL       |
+| x                            | int                                        | None       |
+| y                            | int                                        | None       |
+| s                            | string                                     | None       |
+|                              | NULL                                       | NULL       |
+| # Detailed Table Information | NULL                                       | NULL       |
+| Database:                    | describe_formatted                         | NULL       |
+| Owner:                       | cloudera                                   | NULL       |
+| CreateTime:                  | Mon Jul 22 17:03:16 EDT 2013               | NULL       |
+| LastAccessTime:              | UNKNOWN                                    | NULL       |
+| Protect Mode:                | None                                       | NULL       |
+| Retention:                   | 0                                          | NULL       |
+| Location:                    | hdfs://127.0.0.1:8020/user/hive/warehouse/ |            |
+|                              |   describe_formatted.db/t1                 | NULL       |
+| Table Type:                  | MANAGED_TABLE                              | NULL       |
+| Table Parameters:            | NULL                                       | NULL       |
+|                              | transient_lastDdlTime                      | 1374526996 |
+|                              | NULL                                       | NULL       |
+| # Storage Information        | NULL                                       | NULL       |
+| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.        |            |
+|                              |   LazySimpleSerDe                          | NULL       |
+| InputFormat:                 | org.apache.hadoop.mapred.TextInputFormat   | NULL       |
+| OutputFormat:                | org.apache.hadoop.hive.ql.io.              |            |
+|                              |   HiveIgnoreKeyTextOutputFormat            | NULL       |
+| Compressed:                  | No                                         | NULL       |
+| Num Buckets:                 | 0                                          | NULL       |
+| Bucket Columns:              | []                                         | NULL       |
+| Sort Columns:                | []                                         | NULL       |
++------------------------------+--------------------------------------------+------------+
+Returned 26 row(s) in 0.03s
+[localhost:21000] &gt; create view v1 as select x, upper(s) from t1;
+Query: create view v1 as select x, upper(s) from t1
+[localhost:21000] &gt; describe v1;
+Query: describe v1
+Query finished, fetching results ...
++------+--------+---------+
+| name | type   | comment |
++------+--------+---------+
+| x    | int    |         |
+| _c1  | string |         |
++------+--------+---------+
+Returned 2 row(s) in 0.10s
+[localhost:21000] &gt; describe formatted v1;
+Query: describe formatted v1
+Query finished, fetching results ...
++------------------------------+------------------------------+----------------------+
+| name                         | type                         | comment              |
++------------------------------+------------------------------+----------------------+
+| # col_name                   | data_type                    | comment              |
+|                              | NULL                         | NULL                 |
+| x                            | int                          | None                 |
+| _c1                          | string                       | None                 |
+|                              | NULL                         | NULL                 |
+| # Detailed Table Information | NULL                         | NULL                 |
+| Database:                    | describe_formatted           | NULL                 |
+| Owner:                       | cloudera                     | NULL                 |
+| CreateTime:                  | Mon Jul 22 16:56:38 EDT 2013 | NULL                 |
+| LastAccessTime:              | UNKNOWN                      | NULL                 |
+| Protect Mode:                | None                         | NULL                 |
+| Retention:                   | 0                            | NULL                 |
+| Table Type:                  | VIRTUAL_VIEW                 | NULL                 |
+| Table Parameters:            | NULL                         | NULL                 |
+|                              | transient_lastDdlTime        | 1374526598           |
+|                              | NULL                         | NULL                 |
+| # Storage Information        | NULL                         | NULL                 |
+| SerDe Library:               | null                         | NULL                 |
+| InputFormat:                 | null                         | NULL                 |
+| OutputFormat:                | null                         | NULL                 |
+| Compressed:                  | No                           | NULL                 |
+| Num Buckets:                 | 0                            | NULL                 |
+| Bucket Columns:              | []                           | NULL                 |
+| Sort Columns:                | []                           | NULL                 |
+|                              | NULL                         | NULL                 |
+| # View Information           | NULL                         | NULL                 |
+| View Original Text:          | SELECT x, upper(s) FROM t1   | NULL                 |
+| View Expanded Text:          | SELECT x, upper(s) FROM t1   | NULL                 |
++------------------------------+------------------------------+----------------------+
+Returned 28 row(s) in 0.03s
+[localhost:21000] &gt; create external table t2 (x int, y int, s string) stored as parquet location '/user/cloudera/sample_data';
+[localhost:21000] &gt; describe formatted t2;
+Query: describe formatted t2
+Query finished, fetching results ...
++------------------------------+----------------------------------------------------+------------+
+| name                         | type                                               | comment    |
++------------------------------+----------------------------------------------------+------------+
+| # col_name                   | data_type                                          | comment    |
+|                              | NULL                                               | NULL       |
+| x                            | int                                                | None       |
+| y                            | int                                                | None       |
+| s                            | string                                             | None       |
+|                              | NULL                                               | NULL       |
+| # Detailed Table Information | NULL                                               | NULL       |
+| Database:                    | describe_formatted                                 | NULL       |
+| Owner:                       | cloudera                                           | NULL       |
+| CreateTime:                  | Mon Jul 22 17:01:47 EDT 2013                       | NULL       |
+| LastAccessTime:              | UNKNOWN                                            | NULL       |
+| Protect Mode:                | None                                               | NULL       |
+| Retention:                   | 0                                                  | NULL       |
+| Location:                    | hdfs://127.0.0.1:8020/user/cloudera/sample_data    | NULL       |
+| Table Type:                  | EXTERNAL_TABLE                                     | NULL       |
+| Table Parameters:            | NULL                                               | NULL       |
+|                              | EXTERNAL                                           | TRUE       |
+|                              | transient_lastDdlTime                              | 1374526907 |
+|                              | NULL                                               | NULL       |
+| # Storage Information        | NULL                                               | NULL       |
+| SerDe Library:               | org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe | NULL       |
+| InputFormat:                 | com.cloudera.impala.hive.serde.ParquetInputFormat  | NULL       |
+| OutputFormat:                | com.cloudera.impala.hive.serde.ParquetOutputFormat | NULL       |
+| Compressed:                  | No                                                 | NULL       |
+| Num Buckets:                 | 0                                                  | NULL       |
+| Bucket Columns:              | []                                                 | NULL       |
+| Sort Columns:                | []                                                 | NULL       |
++------------------------------+----------------------------------------------------+------------+
+Returned 27 row(s) in 0.17s</codeblock>
+
+    <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/>
+
+    <p conref="../shared/impala_common.xml#common/permissions_blurb"/>
+    <p rev="CDH-19187">
+      The user ID that the <cmdname>impalad</cmdname> daemon runs under,
+      typically the <codeph>impala</codeph> user, must have read and execute
+      permissions for all directories that are part of the table.
+      (A table could span multiple different HDFS directories if it is partitioned.
+      The directories could be widely scattered because a partition can reside
+      in an arbitrary HDFS directory based on its <codeph>LOCATION</codeph> attribute.)
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p>
+      <xref href="impala_tables.xml#tables"/>, <xref href="impala_create_table.xml#create_table"/>,
+      <xref href="impala_show.xml#show_tables"/>, <xref href="impala_show.xml#show_create_table"/>
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_development.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_development.xml b/docs/topics/impala_development.xml
new file mode 100644
index 0000000..1b35046
--- /dev/null
+++ b/docs/topics/impala_development.xml
@@ -0,0 +1,231 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="intro_dev">
+
+  <title>Developing Impala Applications</title>
+  <titlealts audience="PDF"><navtitle>Developing Applications</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+      <data name="Category" value="Concepts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      The core development language with Impala is SQL. You can also use Java or other languages to interact with
+      Impala through the standard JDBC and ODBC interfaces used by many business intelligence tools. For
+      specialized kinds of analysis, you can supplement the SQL built-in functions by writing
+      <xref href="impala_udf.xml#udfs">user-defined functions (UDFs)</xref> in C++ or Java.
+    </p>
+
+    <p outputclass="toc inpage"/>
+  </conbody>
+
+  <concept id="intro_sql">
+
+    <title>Overview of the Impala SQL Dialect</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Concepts"/>
+    </metadata>
+  </prolog>
+
+    <conbody>
+
+      <p>
+        The Impala SQL dialect is highly compatible with the SQL syntax used in the Apache Hive component (HiveQL),
+        so it is familiar to users who already run SQL queries on Hadoop
+        infrastructure. Currently, Impala SQL supports a subset of HiveQL statements, data types, and built-in
+        functions. Impala also includes additional built-in functions for common industry features, to simplify
+        porting SQL from non-Hadoop systems.
+      </p>
+
+      <p>
+        For users coming to Impala from traditional database or data warehousing backgrounds, the following aspects of the SQL dialect
+        might seem familiar:
+      </p>
+
+      <ul>
+        <li>
+          <p>
+            The <xref href="impala_select.xml#select">SELECT statement</xref> includes familiar clauses such as <codeph>WHERE</codeph>,
+            <codeph>GROUP BY</codeph>, <codeph>ORDER BY</codeph>, and <codeph>WITH</codeph>.
+            You will find familiar notions such as
+            <xref href="impala_joins.xml#joins">joins</xref>, <xref href="impala_functions.xml#builtins">built-in
+            functions</xref> for processing strings, numbers, and dates,
+            <xref href="impala_aggregate_functions.xml#aggregate_functions">aggregate functions</xref>,
+            <xref href="impala_subqueries.xml#subqueries">subqueries</xref>, and
+            <xref href="impala_operators.xml#comparison_operators">comparison operators</xref>
+            such as <codeph>IN()</codeph> and <codeph>BETWEEN</codeph>.
+            The <codeph>SELECT</codeph> statement is the place where SQL standards compliance is most important.
+          </p>
+        </li>
+
+        <li>
+          <p>
+          From the data warehousing world, you will recognize the notion of
+          <xref href="impala_partitioning.xml#partitioning">partitioned tables</xref>.
+          One or more columns serve as partition keys, and the data is physically arranged so that
+          queries that refer to the partition key columns in the <codeph>WHERE</codeph> clause
+          can skip partitions that do not match the filter conditions. For example, if you have 10
+          years' worth of data and use a clause such as <codeph>WHERE year = 2015</codeph>,
+          <codeph>WHERE year &gt; 2010</codeph>, or <codeph>WHERE year IN (2014, 2015)</codeph>,
+          Impala skips all the data for non-matching years, greatly reducing the amount of I/O
+          for the query.
+          </p>
+        </li>
+
+        <li rev="1.2">
+          <p>
+          In Impala 1.2 and higher, <xref href="impala_udf.xml#udfs">UDFs</xref> let you perform custom comparisons
+          and transformation logic during <codeph>SELECT</codeph> and <codeph>INSERT...SELECT</codeph> statements.
+          </p>
+        </li>
+      </ul>
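+
+      <p>
+        For example, the following hypothetical table (the table and column names are illustrative,
+        not from a real schema) shows how a partition key column enables pruning:
+      </p>
+
+<codeblock>-- Data is physically divided into one directory per year.
+create table sales (id bigint, amount decimal(9,2))
+  partitioned by (year int)
+  stored as parquet;
+
+-- Only the directories for 2014 and 2015 are scanned;
+-- all other years are skipped entirely.
+select count(*) from sales where year in (2014, 2015);</codeblock>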
+
+      <p>
+        For users coming to Impala from traditional database or data warehousing backgrounds, the following aspects of the SQL dialect
+        might require some learning and practice for you to become proficient in the Hadoop environment:
+      </p>
+
+      <ul>
+        <li>
+          <p>
+          Impala SQL is focused on queries and includes relatively little DML. There is no <codeph>UPDATE</codeph>
+          or <codeph>DELETE</codeph> statement. Stale data is typically discarded (by <codeph>DROP TABLE</codeph>
+          or <codeph>ALTER TABLE ... DROP PARTITION</codeph> statements) or replaced (by <codeph>INSERT
+          OVERWRITE</codeph> statements).
+          </p>
+        </li>
+
+        <li>
+          <p>
+          All data creation is done by <codeph>INSERT</codeph> statements, which typically insert data in bulk by
+          querying from other tables. There are two variations, <codeph>INSERT INTO</codeph> which appends to the
+          existing data, and <codeph>INSERT OVERWRITE</codeph> which replaces the entire contents of a table or
+          partition (similar to <codeph>TRUNCATE TABLE</codeph> followed by a new <codeph>INSERT</codeph>).
+          Although there is an <codeph>INSERT ... VALUES</codeph> syntax to insert a small number of rows in
+          a single statement, it is far more efficient to use <codeph>INSERT ... SELECT</codeph> to copy
+          and transform large amounts of data from one table to another in a single operation.
+          </p>
+        </li>
+
+        <li>
+          <p>
+          You often construct Impala table definitions and data files in some other environment, and then point
+          Impala at them so that it can run real-time queries. The same data files and table metadata are shared with other
+          components of the Hadoop ecosystem. In particular, Impala can access tables created by Hive or data
+          inserted by Hive, and Hive can access tables and data produced by Impala. Many other Hadoop components
+          can write files in formats such as Parquet and Avro that Impala can then query.
+          </p>
+        </li>
+
+        <li>
+          <p>
+          Because Hadoop and Impala are focused on data warehouse-style operations on large data sets, Impala SQL
+          includes some idioms that you might find in the import utilities for traditional database systems. For
+          example, you can create a table that reads comma-separated or tab-separated text files, specifying the
+          separator in the <codeph>CREATE TABLE</codeph> statement. You can create <b>external tables</b> that read
+          existing data files but do not move or transform them.
+          </p>
+        </li>
+
+        <li>
+          <p>
+          Because Impala reads large quantities of data that might not be perfectly tidy and predictable, it does
+          not require length constraints on string data types. For example, you can define a database column as
+          <codeph>STRING</codeph> with unlimited length, rather than <codeph>CHAR(1)</codeph> or
+          <codeph>VARCHAR(64)</codeph>. <ph rev="2.0.0">(Although in Impala 2.0 and later, you can also use
+          length-constrained <codeph>CHAR</codeph> and <codeph>VARCHAR</codeph> types.)</ph>
+          </p>
+        </li>
+
+      </ul>
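+
+      <p>
+        The following hypothetical statements (the table names and HDFS path are illustrative)
+        sketch some of these idioms:
+      </p>
+
+<codeblock>-- An external table that reads existing comma-separated text files in place.
+create external table raw_events (event_id bigint, details string)
+  row format delimited fields terminated by ','
+  location '/user/impala/raw_events';
+
+-- Append to the existing data.
+insert into analyzed_events select event_id, upper(details) from raw_events;
+
+-- Replace the entire contents of the table.
+insert overwrite analyzed_events select event_id, upper(details) from raw_events;</codeblock>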
+
+      <p>
+        <b>Related information:</b> <xref href="impala_langref.xml#langref"/>, especially
+        <xref href="impala_langref_sql.xml#langref_sql"/> and <xref href="impala_functions.xml#builtins"/>
+      </p>
+    </conbody>
+  </concept>
+
+<!-- Bunch of potential concept topics for future consideration. Major areas of Impala modelled on areas of discussion for Oracle Database, and distributed databases in general. -->
+
+  <concept id="intro_datatypes" audience="Cloudera">
+
+    <title>Overview of Impala SQL Data Types</title>
+
+    <conbody/>
+  </concept>
+
+  <concept id="intro_network" audience="Cloudera">
+
+    <title>Overview of Impala Network Topology</title>
+
+    <conbody/>
+  </concept>
+
+  <concept id="intro_cluster" audience="Cloudera">
+
+    <title>Overview of Impala Cluster Topology</title>
+
+    <conbody/>
+  </concept>
+
+  <concept id="intro_apis">
+
+    <title>Overview of Impala Programming Interfaces</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="JDBC"/>
+      <data name="Category" value="ODBC"/>
+      <data name="Category" value="Hue"/>
+    </metadata>
+  </prolog>
+
+    <conbody>
+
+      <p>
+        You can connect and submit requests to the Impala daemons through:
+      </p>
+
+      <ul>
+        <li>
+          The <codeph><xref href="impala_impala_shell.xml#impala_shell">impala-shell</xref></codeph> interactive
+          command interpreter.
+        </li>
+
+        <li>
+          The <xref href="http://gethue.com/" scope="external" format="html">Hue</xref> web-based user interface.
+        </li>
+
+        <li>
+          <xref href="impala_jdbc.xml#impala_jdbc">JDBC</xref>.
+        </li>
+
+        <li>
+          <xref href="impala_odbc.xml#impala_odbc">ODBC</xref>.
+        </li>
+      </ul>
+
+      <p>
+        With these options, you can use Impala in heterogeneous environments, with JDBC or ODBC applications
+        running on non-Linux platforms. You can also use Impala in combination with various business intelligence
+        tools that use the JDBC and ODBC interfaces.
+      </p>
+
+      <p>
+        Each <codeph>impalad</codeph> daemon process, running on separate nodes in a cluster, listens on
+        <xref href="impala_ports.xml#ports">several ports</xref> for incoming requests. Requests from
+        <codeph>impala-shell</codeph> and Hue are routed to the <codeph>impalad</codeph> daemons through the same
+        port. The <codeph>impalad</codeph> daemons listen on separate ports for JDBC and ODBC requests.
+      </p>
+    </conbody>
+  </concept>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_cached_reads.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disable_cached_reads.xml b/docs/topics/impala_disable_cached_reads.xml
new file mode 100644
index 0000000..bce6091
--- /dev/null
+++ b/docs/topics/impala_disable_cached_reads.xml
@@ -0,0 +1,36 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="disable_cached_reads" rev="1.4.0">
+
+  <title>DISABLE_CACHED_READS Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="HDFS"/>
+      <data name="Category" value="HDFS Caching"/>
+      <data name="Category" value="Querying"/>
+      <data name="Category" value="Performance"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">DISABLE_CACHED_READS query option</indexterm>
+      Prevents Impala from reading data files that are <q>pinned</q> in memory
+      through the HDFS caching feature. Primarily a debugging option for
+      cases where processing of HDFS cached data is concentrated on a single
+      host, leading to excessive CPU usage on that host.
+    </p>
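+
+    <p>
+      For example, to temporarily bypass HDFS cached replicas while diagnosing such a hotspot,
+      you might issue the following in <cmdname>impala-shell</cmdname> before rerunning the query:
+    </p>
+
+<codeblock>set disable_cached_reads=true;</codeblock>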
+
+    <p conref="../shared/impala_common.xml#common/type_boolean"/>
+
+    <p conref="../shared/impala_common.xml#common/default_false"/>
+
+    <p conref="../shared/impala_common.xml#common/added_in_140"/>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_codegen.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disable_codegen.xml b/docs/topics/impala_disable_codegen.xml
new file mode 100644
index 0000000..ab306e8
--- /dev/null
+++ b/docs/topics/impala_disable_codegen.xml
@@ -0,0 +1,38 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="disable_codegen">
+
+  <title>DISABLE_CODEGEN Query Option</title>
+  <titlealts audience="PDF"><navtitle>DISABLE_CODEGEN</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Troubleshooting"/>
+      <data name="Category" value="Performance"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">DISABLE_CODEGEN query option</indexterm>
+      This is a debug option, intended for diagnosing and working around issues that cause crashes. If a query
+      fails with an <q>illegal instruction</q> or other hardware-specific message, try setting
+      <codeph>DISABLE_CODEGEN=true</codeph> and running the query again. If the query succeeds only when the
+      <codeph>DISABLE_CODEGEN</codeph> option is turned on, submit the problem to <keyword keyref="support_org"/> and include that
+      detail in the problem report. Do not otherwise run with this setting turned on, because it results in lower
+      overall performance.
+    </p>
+
+    <p>
+      Because the code generation phase adds a small amount of overhead for each query, you might turn on the
+      <codeph>DISABLE_CODEGEN</codeph> option to achieve maximum throughput when running many short-lived queries
+      against small tables.
+    </p>
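+
+    <p>
+      For example, a troubleshooting session might enable the option only around the problem query
+      (the query shown here is illustrative):
+    </p>
+
+<codeblock>set disable_codegen=true;
+-- Rerun the query that previously failed with an "illegal instruction" error.
+select c1, count(*) from t1 group by c1;
+set disable_codegen=false;</codeblock>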
+
+    <p conref="../shared/impala_common.xml#common/type_boolean"/>
+    <p conref="../shared/impala_common.xml#common/default_false_0"/>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_outermost_topn.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disable_outermost_topn.xml b/docs/topics/impala_disable_outermost_topn.xml
new file mode 100644
index 0000000..24fcd2a
--- /dev/null
+++ b/docs/topics/impala_disable_outermost_topn.xml
@@ -0,0 +1,29 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="disable_outermost_topn" rev="2.5.0">
+
+  <title>DISABLE_OUTERMOST_TOPN Query Option</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p rev="2.5.0">
+      <indexterm audience="Cloudera">DISABLE_OUTERMOST_TOPN query option</indexterm>
+    </p>
+
+    <p>
+      <b>Type:</b>
+    </p>
+
+    <p>
+      <b>Default:</b>
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_row_runtime_filtering.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disable_row_runtime_filtering.xml b/docs/topics/impala_disable_row_runtime_filtering.xml
new file mode 100644
index 0000000..a8c07cb
--- /dev/null
+++ b/docs/topics/impala_disable_row_runtime_filtering.xml
@@ -0,0 +1,65 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="disable_row_runtime_filtering" rev="2.5.0">
+
+  <title>DISABLE_ROW_RUNTIME_FILTERING Query Option (<keyword keyref="impala25"/> or higher only)</title>
+  <titlealts audience="PDF"><navtitle>DISABLE_ROW_RUNTIME_FILTERING</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Performance"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p rev="2.5.0">
+      <indexterm audience="Cloudera">DISABLE_ROW_RUNTIME_FILTERING query option</indexterm>
+      The <codeph>DISABLE_ROW_RUNTIME_FILTERING</codeph> query option
+      reduces the scope of the runtime filtering feature. Queries still dynamically prune
+      partitions, but do not apply the filtering logic to individual rows within partitions.
+    </p>
+
+    <p>
+      This option only applies to queries against Parquet tables. For other file formats, Impala
+      only prunes at the level of partitions, not individual rows.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/type_boolean"/>
+    <p conref="../shared/impala_common.xml#common/default_false"/>
+
+    <p conref="../shared/impala_common.xml#common/added_in_250"/>
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+    <p>
+      Impala automatically evaluates whether the per-row filters are being
+      effective at reducing the amount of intermediate data. Therefore,
+      this option is typically only needed for the rare case where Impala
+      cannot accurately determine how effective the per-row filtering is
+      for a query.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/runtime_filtering_option_caveat"/>
+
+    <p>
+      Because this setting only improves query performance in very specific
+      circumstances, depending on the query characteristics and data distribution,
+      only use it when you determine through benchmarking that it improves
+      performance of specific expensive queries.
+      Consider setting this query option immediately before the expensive query and
+      unsetting it immediately afterward.
+    </p>
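+
+    <p>
+      For example, to scope the setting to a single expensive query (shown here with a hypothetical join):
+    </p>
+
+<codeblock>set disable_row_runtime_filtering=true;
+-- The expensive query identified through benchmarking.
+select ... from big_partitioned_table join other_table ...;
+set disable_row_runtime_filtering=false;</codeblock>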
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+    <p>
+      <xref href="impala_runtime_filtering.xml"/>,
+      <xref href="impala_runtime_filter_mode.xml#runtime_filter_mode"/>
+      <!-- , <xref href="impala_partitioning.xml#dynamic_partition_pruning"/> -->
+    </p>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_streaming_preaggregations.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disable_streaming_preaggregations.xml b/docs/topics/impala_disable_streaming_preaggregations.xml
new file mode 100644
index 0000000..a3919af
--- /dev/null
+++ b/docs/topics/impala_disable_streaming_preaggregations.xml
@@ -0,0 +1,45 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="disable_streaming_preaggregations" rev="2.5.0 IMPALA-1305">
+
+  <title>DISABLE_STREAMING_PREAGGREGATIONS Query Option (<keyword keyref="impala25"/> or higher only)</title>
+  <titlealts audience="PDF"><navtitle>DISABLE_STREAMING_PREAGGREGATIONS</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Performance"/>
+      <data name="Category" value="Aggregate Functions"/>
+      <data name="Category" value="Troubleshooting"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p rev="2.5.0 IMPALA-1305">
+      <indexterm audience="Cloudera">DISABLE_STREAMING_PREAGGREGATIONS query option</indexterm>
+      Turns off the <q>streaming preaggregation</q> optimization that is available in <keyword keyref="impala25_full"/>
+      and higher. This optimization reduces unnecessary work performed by queries that perform aggregation
+      operations on columns with few or no duplicate values, for example <codeph>DISTINCT <varname>id_column</varname></codeph>
+      or <codeph>GROUP BY <varname>unique_column</varname></codeph>. If the optimization causes regressions in
+      existing queries that use aggregation functions, you can turn it off as needed by setting this query option.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/type_boolean"/>
+    <p conref="../shared/impala_common.xml#common/default_false_0"/>
+
+    <note conref="../shared/impala_common.xml#common/one_but_not_true"/>
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+    <p>
+      Typically, queries that would require enabling this option involve very large numbers of
+      aggregated values, such as a billion or more distinct keys being processed on each
+      worker node.
+    </p>
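+
+    <p>
+      For example, if an aggregation query on a high-cardinality column regressed after an upgrade,
+      you might turn the optimization off for that query (the table and column names are illustrative):
+    </p>
+
+<codeblock>set disable_streaming_preaggregations=true;
+select unique_column, count(*) from huge_table group by unique_column;</codeblock>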
+
+    <p conref="../shared/impala_common.xml#common/added_in_250"/>
+
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disable_unsafe_spills.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disable_unsafe_spills.xml b/docs/topics/impala_disable_unsafe_spills.xml
new file mode 100644
index 0000000..14b687c
--- /dev/null
+++ b/docs/topics/impala_disable_unsafe_spills.xml
@@ -0,0 +1,53 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept rev="2.0.0" id="disable_unsafe_spills">
+
+  <title>DISABLE_UNSAFE_SPILLS Query Option (<keyword keyref="impala20"/> or higher only)</title>
+  <titlealts audience="PDF"><navtitle>DISABLE_UNSAFE_SPILLS</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Query Options"/>
+      <data name="Category" value="Performance"/>
+      <data name="Category" value="Scalability"/>
+      <data name="Category" value="Memory"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p rev="2.0.0">
+      <indexterm audience="Cloudera">DISABLE_UNSAFE_SPILLS query option</indexterm>
+      Enable this option if you prefer to have queries fail when they exceed the Impala memory limit, rather than
+      write temporary data to disk.
+    </p>
+
+    <p>
+      Queries that <q>spill</q> to disk typically complete successfully, where in earlier Impala releases they would have failed.
+      However, queries with exorbitant memory requirements due to missing statistics or inefficient join clauses could
+      become so slow as a result that you would rather have them cancelled automatically, and then reduce the memory
+      usage through standard Impala tuning techniques.
+    </p>
+
+    <p>
+      This option prevents only <q>unsafe</q> spill operations, meaning spills that occur when one or more tables
+      are missing statistics or the query does not include a hint to set the most efficient mechanism for a join or
+      <codeph>INSERT ... SELECT</codeph> into a partitioned table. These are the queries most likely to have
+      suboptimal execution plans that could cause unnecessary spilling. Therefore, leaving this option enabled is a
+      good way to find tables on which to run the <codeph>COMPUTE STATS</codeph> statement.
+    </p>
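+
+    <p>
+      For example, enabling the option and watching for failed queries can reveal tables with missing
+      statistics (the table name here is illustrative):
+    </p>
+
+<codeblock>set disable_unsafe_spills=1;
+-- If a join against t1 now fails, check whether its statistics are missing...
+show table stats t1;
+-- ...and compute them so the query can be planned efficiently.
+compute stats t1;</codeblock>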
+
+    <p>
+      See <xref href="impala_scalability.xml#spill_to_disk"/> for information about the <q>spill to disk</q>
+      feature for queries processing large result sets with joins, <codeph>ORDER BY</codeph>, <codeph>GROUP
+      BY</codeph>, <codeph>DISTINCT</codeph>, aggregation functions, or analytic functions.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/type_boolean"/>
+    <p conref="../shared/impala_common.xml#common/default_false_0"/>
+
+    <p conref="../shared/impala_common.xml#common/added_in_20"/>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_disk_space.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_disk_space.xml b/docs/topics/impala_disk_space.xml
new file mode 100644
index 0000000..b6daaeb
--- /dev/null
+++ b/docs/topics/impala_disk_space.xml
@@ -0,0 +1,129 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="disk_space">
+
+  <title>Managing Disk Space for Impala Data</title>
+  <titlealts audience="PDF"><navtitle>Managing Disk Space</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Disk Storage"/>
+      <data name="Category" value="Administrators"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+      <data name="Category" value="Tables"/>
+      <data name="Category" value="Compression"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      Although Impala typically works with many large files in an HDFS storage system with plenty of capacity,
+      there are times when you might perform some file cleanup to reclaim space, or advise developers on techniques
+      to minimize space consumption and file duplication.
+    </p>
+
+    <ul>
+      <li>
+        <p>
+          Use compact binary file formats where practical. Numeric and time-based data in particular can be stored
+          in more compact form in binary data files. Depending on the file format, various compression and encoding
+          features can reduce file size even further. You can specify the <codeph>STORED AS</codeph> clause as part
+          of the <codeph>CREATE TABLE</codeph> statement, or <codeph>ALTER TABLE</codeph> with the <codeph>SET
+          FILEFORMAT</codeph> clause for an existing table or partition within a partitioned table. See
+          <xref href="impala_file_formats.xml#file_formats"/> for details about file formats, especially
+          <xref href="impala_parquet.xml#parquet"/>. See <xref href="impala_create_table.xml#create_table"/> and
+          <xref href="impala_alter_table.xml#alter_table"/> for syntax details.
+        </p>
+      </li>
+
+      <li>
+        <p>
+          You manage underlying data files differently depending on whether the corresponding Impala table is
+          defined as an <xref href="impala_tables.xml#internal_tables">internal</xref> or
+          <xref href="impala_tables.xml#external_tables">external</xref> table:
+        </p>
+        <ul>
+          <li>
+            Use the <codeph>DESCRIBE FORMATTED</codeph> statement to check if a particular table is internal
+            (managed by Impala) or external, and to see the physical location of the data files in HDFS. See
+            <xref href="impala_describe.xml#describe"/> for details.
+          </li>
+
+          <li>
+            For Impala-managed (<q>internal</q>) tables, use <codeph>DROP TABLE</codeph> statements to remove
+            data files. See <xref href="impala_drop_table.xml#drop_table"/> for details.
+          </li>
+
+          <li>
+            For tables not managed by Impala (<q>external</q> tables), use appropriate HDFS-related commands such
+            as <codeph>hadoop fs</codeph>, <codeph>hdfs dfs</codeph>, or <codeph>distcp</codeph>, to create, move,
+            copy, or delete files within HDFS directories that are accessible by the <codeph>impala</codeph> user.
+            Issue a <codeph>REFRESH <varname>table_name</varname></codeph> statement after adding or removing any
+            files from the data directory of an external table. See <xref href="impala_refresh.xml#refresh"/> for
+            details.
+          </li>
+
+          <li>
+            Use external tables to reference HDFS data files in their original location. With this technique, you
+            avoid copying the files, and you can map more than one Impala table to the same set of data files. When
+            you drop the Impala table, the data files are left undisturbed. See
+            <xref href="impala_tables.xml#external_tables"/> for details.
+          </li>
+
+          <li>
+            Use the <codeph>LOAD DATA</codeph> statement to move HDFS files into the data directory for an Impala
+            table from inside Impala, without the need to specify the HDFS path of the destination directory. This
+            technique works for both internal and external tables. See
+            <xref href="impala_load_data.xml#load_data"/> for details.
+          </li>
+        </ul>
+      </li>
+
+      <li>
+        <p>
+          Make sure that the HDFS trashcan is configured correctly. When you remove files from HDFS, the space
+          might not be reclaimed for use by other files until sometime later, when the trashcan is emptied. See
+          <xref href="impala_drop_table.xml#drop_table"/> and the FAQ entry
+          <xref href="impala_faq.xml#faq_sql/faq_drop_table_space"/> for details. See
+          <xref href="impala_prereqs.xml#prereqs_account"/> for permissions needed for the HDFS trashcan to operate
+          correctly.
+        </p>
+      </li>
+
+      <li>
+        <p>
+          Drop all tables in a database before dropping the database itself. See
+          <xref href="impala_drop_database.xml#drop_database"/> for details.
+        </p>
+      </li>
+
+      <li>
+        <p>
+          Clean up temporary files after failed <codeph>INSERT</codeph> statements. If an <codeph>INSERT</codeph>
+          statement encounters an error, and you see a directory named <filepath>.impala_insert_staging</filepath>
+          or <filepath>_impala_insert_staging</filepath> left behind in the data directory for the table, it might
+          contain temporary data files taking up space in HDFS. You might be able to salvage these data files, for
+          example if they are complete but could not be moved into place due to a permission error. Or, you might
+          delete those files through commands such as <codeph>hadoop fs</codeph> or <codeph>hdfs dfs</codeph>, to
+          reclaim space before retrying the <codeph>INSERT</codeph>. Issue <codeph>DESCRIBE FORMATTED
+          <varname>table_name</varname></codeph> to see the HDFS path where you can check for temporary files.
+        </p>
+      </li>
+
+      <li rev="1.4.0">
+        <p rev="obwl" conref="../shared/impala_common.xml#common/order_by_scratch_dir"/>
+      </li>
+
+      <li rev="2.2.0">
+        <p>
+          If you use the Amazon Simple Storage Service (S3) as a place to offload
+          data to reduce the volume of local storage, Impala 2.2.0 and higher
+          can query the data directly from S3.
+          See <xref href="impala_s3.xml#s3"/> for details.
+        </p>
+      </li>
+    </ul>
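+
+    <p>
+      As a brief sketch of the cleanup techniques above (the table and HDFS path shown are hypothetical),
+      you might locate a table's data directory, remove leftover staging files from the shell, and then
+      refresh the table:
+    </p>
+
+<codeblock>-- Find the HDFS location of the table; look for the Location: field in the output.
+DESCRIBE FORMATTED sample_table;
+
+-- From the OS shell, remove any leftover staging directory (path is hypothetical):
+--   hdfs dfs -rm -r /user/hive/warehouse/sample_db.db/sample_table/_impala_insert_staging
+
+-- Afterward, make Impala aware of the changed files.
+REFRESH sample_table;</codeblock>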
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_distinct.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_distinct.xml b/docs/topics/impala_distinct.xml
new file mode 100644
index 0000000..8661831
--- /dev/null
+++ b/docs/topics/impala_distinct.xml
@@ -0,0 +1,61 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="distinct">
+
+  <title>DISTINCT Operator</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Querying"/>
+      <data name="Category" value="Aggregate Functions"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">DISTINCT operator</indexterm>
+      The <codeph>DISTINCT</codeph> operator in a <codeph>SELECT</codeph> statement filters the result set to
+      remove duplicates:
+    </p>
+
+<codeblock>-- Returns the unique values from one column.
+-- NULL is included in the set of values if any rows have a NULL in this column.
+select distinct c_birth_country from customer;
+-- Returns the unique combinations of values from multiple columns.
+select distinct c_salutation, c_last_name from customer;</codeblock>
+
+    <p>
+      You can use <codeph>DISTINCT</codeph> in combination with an aggregation function, typically
+      <codeph>COUNT()</codeph>, to find how many different values a column contains:
+    </p>
+
+<codeblock>-- Counts the unique values from one column.
+-- NULL is not included as a distinct value in the count.
+select count(distinct c_birth_country) from customer;
+-- Counts the unique combinations of values from multiple columns.
+select count(distinct c_salutation, c_last_name) from customer;</codeblock>
+
+    <p>
+      One construct that Impala SQL does <i>not</i> support is using <codeph>DISTINCT</codeph> in more than one
+      aggregation function in the same query. For example, you could not have a single query with both
+      <codeph>COUNT(DISTINCT c_first_name)</codeph> and <codeph>COUNT(DISTINCT c_last_name)</codeph> in the
+      <codeph>SELECT</codeph> list.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/zero_length_strings"/>
+
+    <note conref="../shared/impala_common.xml#common/multiple_count_distinct"/>
+
+    <note>
+      <p>
+        In contrast with some database systems that always return <codeph>DISTINCT</codeph> values in sorted order,
+        Impala does not do any ordering of <codeph>DISTINCT</codeph> values. Always include an <codeph>ORDER
+        BY</codeph> clause if you need the values in alphabetical or numeric sorted order.
+      </p>
+    </note>
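+
+    <p>
+      For example, to retrieve the distinct values from the <codeph>customer</codeph> table used in the
+      examples above in a predictable order, add an explicit <codeph>ORDER BY</codeph> clause:
+    </p>
+
+<codeblock>-- Returns the unique values in alphabetical order.
+select distinct c_birth_country from customer order by c_birth_country;</codeblock>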
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_dml.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_dml.xml b/docs/topics/impala_dml.xml
new file mode 100644
index 0000000..b6c6399
--- /dev/null
+++ b/docs/topics/impala_dml.xml
@@ -0,0 +1,91 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="dml">
+
+  <title>DML Statements</title>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="DML"/>
+      <data name="Category" value="Data Analysts"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Tables"/>
+      <data name="Category" value="ETL"/>
+      <data name="Category" value="Ingest"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      DML refers to <q>Data Manipulation Language</q>, a subset of SQL statements that modify the data stored in
+      tables. Because Impala focuses on query performance and leverages the append-only nature of HDFS storage,
+      currently Impala only supports a small set of DML statements:
+    </p>
+
+    <ul>
+      <li>
+        <xref keyref="delete"/>. Works for Kudu tables only.
+      </li>
+
+      <li>
+        <xref keyref="insert"/>.
+      </li>
+
+      <li>
+        <xref keyref="load_data"/>. Does not apply for HBase or Kudu tables.
+      </li>
+
+      <li>
+        <xref keyref="update"/>. Works for Kudu tables only.
+      </li>
+
+      <li>
+        <xref keyref="upsert"/>. Works for Kudu tables only.
+      </li>
+    </ul>
+
+    <p>
+      <codeph>INSERT</codeph> in Impala is primarily optimized for inserting large volumes of data in a single
+      statement, to make effective use of the multi-megabyte HDFS blocks. This is the way in Impala to create new
+      data files. If you intend to insert one or a few rows at a time, for example through the <codeph>INSERT ...
+      VALUES</codeph> syntax, that technique is much more efficient for Impala tables stored in HBase than for
+      tables backed by HDFS data files. See
+      <xref href="impala_hbase.xml#impala_hbase"/> for details.
+    </p>
+
+    <p>
+      <codeph>LOAD DATA</codeph> moves existing data files into the directory for an Impala table, making them
+      immediately available for Impala queries. This is one way in Impala to work with data files produced by other
+      Hadoop components. (<codeph>CREATE EXTERNAL TABLE</codeph> is the other alternative; with external tables,
+      you can query existing data files, while the files remain in their original location.)
+    </p>
+
+    <p>
+      In <keyword keyref="impala28_full"/> and higher, Impala does support the <codeph>UPDATE</codeph>, <codeph>DELETE</codeph>,
+      and <codeph>UPSERT</codeph> statements for Kudu tables.
+      For HDFS or S3 tables, to simulate the effects of an <codeph>UPDATE</codeph> or <codeph>DELETE</codeph> statement
+      in other database systems, typically you use <codeph>INSERT</codeph> or <codeph>CREATE TABLE AS SELECT</codeph> to copy data
+      from one table to another, filtering out or changing the appropriate rows during the copy operation.
+    </p>
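+
+    <p>
+      For example, the following sketch (table and column names are hypothetical) simulates a
+      <codeph>DELETE</codeph> on an HDFS-backed table by copying only the rows to keep into a new table:
+    </p>
+
+<codeblock>-- Keep all rows except the ones that would be "deleted".
+CREATE TABLE sales_current AS
+  SELECT * FROM sales WHERE year >= 2015;</codeblock>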
+
+    <p>
+      You can also achieve a result similar to <codeph>UPDATE</codeph> by using Impala tables stored in HBase.
+      When you insert a row into an HBase table and the table already contains a row with the same value
+      for the key column, the older row is hidden, effectively the same as a single-row <codeph>UPDATE</codeph>.
+    </p>
+
+    <p rev="2.6.0">
+      Impala can perform DML operations for tables or partitions stored in the Amazon S3 filesystem
+      with <keyword keyref="impala26_full"/> and higher. See <xref href="impala_s3.xml#s3"/> for details.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p>
+      The other major classifications of SQL statements are data definition language (see
+      <xref href="impala_ddl.xml#ddl"/>) and queries (see <xref href="impala_select.xml#select"/>).
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_double.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_double.xml b/docs/topics/impala_double.xml
new file mode 100644
index 0000000..c69eae2
--- /dev/null
+++ b/docs/topics/impala_double.xml
@@ -0,0 +1,100 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="double">
+
+  <title>DOUBLE Data Type</title>
+  <titlealts audience="PDF"><navtitle>DOUBLE</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="Impala Data Types"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Data Analysts"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Schemas"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      A double-precision floating-point data type used in <codeph>CREATE TABLE</codeph> and <codeph>ALTER
+      TABLE</codeph> statements.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+    <p>
+      In the column definition of a <codeph>CREATE TABLE</codeph> statement:
+    </p>
+
+<codeblock><varname>column_name</varname> DOUBLE</codeblock>
+
+    <p>
+      <b>Range:</b> 4.94065645841246544e-324d .. 1.79769313486231570e+308, positive or negative
+    </p>
+
+    <p>
+      <b>Precision:</b> 15 to 17 significant digits, depending on usage. The number of significant digits does
+      not depend on the position of the decimal point.
+    </p>
+
+    <p>
+      <b>Representation:</b> The values are stored in 8 bytes, using
+      <xref href="https://en.wikipedia.org/wiki/Double-precision_floating-point_format" scope="external" format="html">IEEE 754 Double Precision Binary Floating Point</xref> format.
+    </p>
+
+    <p>
+      <b>Conversions:</b> Impala does not automatically convert <codeph>DOUBLE</codeph> to any other type. You can
+      use <codeph>CAST()</codeph> to convert <codeph>DOUBLE</codeph> values to <codeph>FLOAT</codeph>,
+      <codeph>TINYINT</codeph>, <codeph>SMALLINT</codeph>, <codeph>INT</codeph>, <codeph>BIGINT</codeph>,
+      <codeph>STRING</codeph>, <codeph>TIMESTAMP</codeph>, or <codeph>BOOLEAN</codeph>. You can use exponential
+      notation in <codeph>DOUBLE</codeph> literals or when casting from <codeph>STRING</codeph>, for example
+      <codeph>1.0e6</codeph> to represent one million.
+      <ph conref="../shared/impala_common.xml#common/cast_int_to_timestamp"/>
+    </p>
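+
+    <p>
+      For example, exponential notation can appear in a cast from <codeph>STRING</codeph>, and a
+      <codeph>DOUBLE</codeph> value can be cast to other types:
+    </p>
+
+<codeblock>SELECT CAST('1.0e6' AS DOUBLE);  -- One million, converted from a STRING.
+SELECT CAST(1.0e6 AS STRING);    -- The same value, converted back to a STRING.</codeblock>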
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+    <p>
+      The data type <codeph>REAL</codeph> is an alias for <codeph>DOUBLE</codeph>.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<codeblock>CREATE TABLE t1 (x DOUBLE);
+SELECT CAST(1000.5 AS DOUBLE);
+</codeblock>
+
+    <p conref="../shared/impala_common.xml#common/partitioning_imprecise"/>
+
+    <p conref="../shared/impala_common.xml#common/hbase_ok"/>
+
+    <p conref="../shared/impala_common.xml#common/parquet_ok"/>
+
+    <p conref="../shared/impala_common.xml#common/text_bulky"/>
+
+<!-- <p conref="../shared/impala_common.xml#common/compatibility_blurb"/> -->
+
+    <p conref="../shared/impala_common.xml#common/internals_8_bytes"/>
+
+<!-- <p conref="../shared/impala_common.xml#common/added_in_20"/> -->
+
+    <p conref="../shared/impala_common.xml#common/column_stats_constant"/>
+
+    <p conref="../shared/impala_common.xml#common/restrictions_blurb"/>
+
+<!-- This conref appears under SUM(), AVG(), FLOAT, and DOUBLE topics. -->
+
+    <p conref="../shared/impala_common.xml#common/sum_double"/>
+
+    <p conref="../shared/impala_common.xml#common/float_double_decimal_caveat"/>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p>
+      <xref href="impala_literals.xml#numeric_literals"/>, <xref href="impala_math_functions.xml#math_functions"/>,
+      <xref href="impala_float.xml#float"/>
+    </p>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_drop_data_source.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_drop_data_source.xml b/docs/topics/impala_drop_data_source.xml
new file mode 100644
index 0000000..be6167c
--- /dev/null
+++ b/docs/topics/impala_drop_data_source.xml
@@ -0,0 +1,35 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept audience="Cloudera" rev="1.4.0" id="drop_data_source">
+
+  <title>DROP DATA SOURCE Statement</title>
+  <titlealts audience="PDF"><navtitle>DROP DATA SOURCE</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="DDL"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">DROP DATA SOURCE statement</indexterm>
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+    <p conref="../shared/impala_common.xml#common/ddl_blurb"/>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+    <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/>
+  </conbody>
+</concept>

http://git-wip-us.apache.org/repos/asf/incubator-impala/blob/3be0f122/docs/topics/impala_drop_database.xml
----------------------------------------------------------------------
diff --git a/docs/topics/impala_drop_database.xml b/docs/topics/impala_drop_database.xml
new file mode 100644
index 0000000..fb36dca
--- /dev/null
+++ b/docs/topics/impala_drop_database.xml
@@ -0,0 +1,130 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
+<concept id="drop_database">
+
+  <title>DROP DATABASE Statement</title>
+  <titlealts audience="PDF"><navtitle>DROP DATABASE</navtitle></titlealts>
+  <prolog>
+    <metadata>
+      <data name="Category" value="Impala"/>
+      <data name="Category" value="SQL"/>
+      <data name="Category" value="Databases"/>
+      <data name="Category" value="DDL"/>
+      <data name="Category" value="Schemas"/>
+      <data name="Category" value="Developers"/>
+      <data name="Category" value="Data Analysts"/>
+    </metadata>
+  </prolog>
+
+  <conbody>
+
+    <p>
+      <indexterm audience="Cloudera">DROP DATABASE statement</indexterm>
+      Removes a database from the system. The physical operations involve removing the metadata for the database
+      from the metastore, and deleting the corresponding <codeph>*.db</codeph> directory from HDFS.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/syntax_blurb"/>
+
+<codeblock>DROP (DATABASE|SCHEMA) [IF EXISTS] <varname>database_name</varname> <ph rev="2.3.0">[RESTRICT | CASCADE]</ph>;</codeblock>
+
+    <p conref="../shared/impala_common.xml#common/ddl_blurb"/>
+
+    <p conref="../shared/impala_common.xml#common/usage_notes_blurb"/>
+
+    <p>
+      By default, the database must be empty before it can be dropped, to avoid losing any data.
+    </p>
+
+    <p rev="2.3.0">
+      In <keyword keyref="impala23_full"/> and higher, you can include the <codeph>CASCADE</codeph>
+      clause to make Impala drop all tables and other objects in the database before dropping the database itself.
+      The <codeph>RESTRICT</codeph> clause enforces the original requirement that the database be empty
+      before being dropped. Because the <codeph>RESTRICT</codeph> behavior is still the default, this
+      clause is optional.
+    </p>
+
+    <p rev="2.3.0">
+      The automatic dropping resulting from the <codeph>CASCADE</codeph> clause follows the same rules as the
+      corresponding <codeph>DROP TABLE</codeph>, <codeph>DROP VIEW</codeph>, and <codeph>DROP FUNCTION</codeph> statements.
+      In particular, the HDFS directories and data files for any external tables are left behind when the
+      tables are removed.
+    </p>
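+
+    <p>
+      For example, with a hypothetical database <codeph>temp_db</codeph> that still contains tables:
+    </p>
+
+<codeblock>-- Fails by default (and with RESTRICT) because the database is not empty.
+DROP DATABASE temp_db;
+-- Drops the tables and other objects first, then the database itself.
+DROP DATABASE temp_db CASCADE;</codeblock>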
+
+    <p>
+      When you do not use the <codeph>CASCADE</codeph> clause, drop or move all the objects inside the database manually
+      before dropping the database itself:
+    </p>
+
+    <ul>
+      <li>
+        <p>
+          Use the <codeph>SHOW TABLES</codeph> statement to locate all tables and views in the database,
+          and issue <codeph>DROP TABLE</codeph> and <codeph>DROP VIEW</codeph> statements to remove them all.
+        </p>
+      </li>
+      <li>
+        <p>
+          Use the <codeph>SHOW FUNCTIONS</codeph> and <codeph>SHOW AGGREGATE FUNCTIONS</codeph> statements
+          to locate all user-defined functions in the database, and issue <codeph>DROP FUNCTION</codeph>
+          and <codeph>DROP AGGREGATE FUNCTION</codeph> statements to remove them all.
+        </p>
+      </li>
+      <li>
+        <p>
+          To keep tables or views contained by a database while removing the database itself, use
+          <codeph>ALTER TABLE</codeph> and <codeph>ALTER VIEW</codeph> to move the relevant
+          objects to a different database before dropping the original database.
+        </p>
+      </li>
+    </ul>
+
+    <p>
+      You cannot drop the current database, that is, the database to which your session is connected,
+      either through the <codeph>USE</codeph> statement or the <codeph>-d</codeph> option of <cmdname>impala-shell</cmdname>.
+      Issue a <codeph>USE</codeph> statement to switch to a different database first.
+      Because the <codeph>default</codeph> database is always available, issuing
+      <codeph>USE default</codeph> is a convenient way to leave the current database
+      before dropping it.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/hive_blurb"/>
+
+    <p>
+      When you drop a database in Impala, the database can no longer be used by Hive.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+<!-- Better to conref the same examples in both places. -->
+
+    <p>
+      See <xref href="impala_create_database.xml#create_database"/> for examples covering <codeph>CREATE
+      DATABASE</codeph>, <codeph>USE</codeph>, and <codeph>DROP DATABASE</codeph>.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/s3_blurb"/>
+
+    <p conref="../shared/impala_common.xml#common/s3_ddl"/>
+
+    <p conref="../shared/impala_common.xml#common/cancel_blurb_no"/>
+
+    <p conref="../shared/impala_common.xml#common/permissions_blurb"/>
+    <p rev="CDH-19187">
+      The user ID that the <cmdname>impalad</cmdname> daemon runs under,
+      typically the <codeph>impala</codeph> user, must have write
+      permission for the directory associated with the database.
+    </p>
+
+    <p conref="../shared/impala_common.xml#common/example_blurb"/>
+
+    <codeblock conref="../shared/impala_common.xml#common/create_drop_db_example"/>
+
+    <p conref="../shared/impala_common.xml#common/related_info"/>
+
+    <p>
+      <xref href="impala_databases.xml#databases"/>, <xref href="impala_create_database.xml#create_database"/>,
+      <xref href="impala_use.xml#use"/>, <xref href="impala_show.xml#show_databases"/>, <xref href="impala_drop_table.xml#drop_table"/>
+    </p>
+  </conbody>
+</concept>

