drill-commits mailing list archives

From tshi...@apache.org
Subject [2/3] drill git commit: revise Install Drill, fix links
Date Mon, 04 May 2015 22:11:28 GMT
http://git-wip-us.apache.org/repos/asf/drill/blob/56894cd0/_docs/manage-drill/011-configuring-drill-in-a-dedicated-cluster.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/011-configuring-drill-in-a-dedicated-cluster.md b/_docs/manage-drill/011-configuring-drill-in-a-dedicated-cluster.md
index 086f207..446f75c 100644
--- a/_docs/manage-drill/011-configuring-drill-in-a-dedicated-cluster.md
+++ b/_docs/manage-drill/011-configuring-drill-in-a-dedicated-cluster.md
@@ -27,4 +27,4 @@ env.sh`.
 
 {% include startnote.html %}If this parameter is not set, the limit depends on the amount of available system memory.{% include endnote.html %}
 
-After you edit `<drill_installation_directory>/conf/drill-env.sh`, [restart the Drillbit]({{ site.baseurl }}/docs/starting-stopping-drill#starting-a-drillbit) onthe node.
\ No newline at end of file
+After you edit `<drill_installation_directory>/conf/drill-env.sh`, [restart the Drillbit]({{ site.baseurl }}/docs/starting-drill-in-distributed-mode) on the node.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/56894cd0/_docs/manage-drill/030-start-stop.md
----------------------------------------------------------------------
diff --git a/_docs/manage-drill/030-start-stop.md b/_docs/manage-drill/030-start-stop.md
index 561995a..591b6ab 100644
--- a/_docs/manage-drill/030-start-stop.md
+++ b/_docs/manage-drill/030-start-stop.md
@@ -2,56 +2,17 @@
 title: "Starting/Stopping Drill"
 parent: "Manage Drill"
 ---
-How you start Drill depends on the installation method you followed. If you installed Drill in embedded mode, invoking SQLLine automatically starts a Drillbit locally.
-
-On a MapR cluster, Drill runs as a service and the installation process starts the Drillbit service automatically. If you installed Drill in distributed mode, and the Drillbit on a node did not start, start the Drillbit before attempting to run queries.
-
-
-## Controlling a Drillbit
-
-The Drillbit service accepts requests from the client, processing the queries, and returning results to the client. You install Drill as a service and run the Drillbit on all of the required nodes in a Hadoop cluster to form a distributed cluster environment. When a Drillbit runs on each data node in the cluster, Drill maximizes data locality during query execution. Movement of data over the network or between nodes is minimized or eliminated when possible.
-
-If you use Drill in distributed mode, you need to understand how to control a Drillbit. If you use Drill in embedded mode, you do not use the **drillbit** command. Windows and Mac OS X run Drill only in embedded mode, and therefore do not use the Drillbit command.
-
-Using the **drillbit command**, located in the `bin` directory, you check the status of the Drillbit, start, stop, and restart a DrillBit. You can use a configuration file to start Drill. Using such a file is handy for controlling Drillbits on multiple nodes.
-
-### drillbit Command Syntax
-
-    drillbit.sh [--config <conf-dir>] (start|stop|status|restart|autorestart)
-
-For example, to restart a Drillbit, navigate to the Drill installation directory, and issue the following command:
-
-    bin/drillbit.sh restart
-
-## Invoking SQLLine
-SQLLine is used as the Drill shell. SQLLine connects to relational databases and executes SQL commands. You invoke SQLLine for Drill in embedded or distributed mode. If you want to use a particular storage plugin, you specify the plugin as a schema when you invoke SQLLine.
-
-### SQLLine Command Syntax on Linux and Mac OS X
-To start SQLLine, use the following **sqlline command** syntax:
-
-
-    sqlline –u jdbc:drill:[schema=<storage plugin>;]zk=<zk name>[:<port>][,<zk name2>[:<port>]... ]
-
-#### sqlline Arguments 
-
-* `-u` is the option that precedes a connection string. Required.  
-* `jdbc` is the connection protocol. Required.  
-* `schema` is the name of a [storage plugin]({{site.baseurl}}/docs/storage-plugin-registration) to use for queries. Optional.  
-* `Zk=zkname` is one or more zookeeper host names or IP addresses, or the keyword `local`, which is an alias for localhost. Required.  
-* `port` is the zookeeper port number. Optional. Port 2181 is the default.  
-
-### SQLLine Command Syntax on Windows
-To start SQLLine on Windows, use the same syntax as Linux and Mac OS X, except enter the default user name and password (admin/admin) when prompted to do so.
+How you start Drill depends on the installation method you followed. If you installed Drill in embedded mode, invoking SQLLine automatically starts a Drillbit locally. If you installed Drill in distributed mode, and the Drillbit on a node did not start, start the Drillbit before attempting to run queries. How to start Drill is covered in detail the section, ["Install Drill"]({{site.baseurl}}/docs/install-drill/).
 
 ## Examples of Starting Drill
 Issue the **sqlline** command from the Drill installation directory. The simplest example of how to start SQLLine is to identify the protocol, JDBC, and zookeeper node or nodes in the **sqlline** command. This example starts SQLLine on a node in an embedded, single-node cluster:
 
     sqlline -u jdbc:drill:zk=local
 
-This example also starts SQLLine in embedded mode using the `dfs` storage plugin. Specifying the storage plugin when you start up eliminates the need to specify the storage plugin in the query:
+This example also starts SQLLine using the `dfs` storage plugin. Specifying the storage plugin when you start up eliminates the need to specify the storage plugin in the query:
 
 
-    bin/sqlline –u jdbc:drill:schema=dfs;zk=localhost
+    bin/sqlline –u jdbc:drill:schema=dfs;zk=centos26
 
 This command starts SQLLine in distributed, (multi-node) mode in a cluster configured to run zookeeper on three nodes:
 

http://git-wip-us.apache.org/repos/asf/drill/blob/56894cd0/_docs/query-data/030-querying-hbase.md
----------------------------------------------------------------------
diff --git a/_docs/query-data/030-querying-hbase.md b/_docs/query-data/030-querying-hbase.md
index fd0230c..eecdce4 100644
--- a/_docs/query-data/030-querying-hbase.md
+++ b/_docs/query-data/030-querying-hbase.md
@@ -2,7 +2,7 @@
 title: "Querying HBase"
 parent: "Query Data"
 ---
-This exercise creates two tables in HBase, students and clicks, that you can query with Drill. As an HBase user, you most likely are running Drill in  distributed mode. In this case, [Warden](http://doc.mapr.com/display/MapR/Apache+Drill+Installation+Overview) starts Drill as a service. If you are not an HBase user and just kicking the tires, you might use the Drill Sandbox on a single-node cluster (embedded mode). In this case, you need to [start Drill]({{ site.baseurl }}/docs/starting-stopping-drill/) before performing step 5 of this exercise. On the Drill Sandbox, HBase tables you create will be located in: /mapr/demo.mapr.com/tables
+This exercise creates two tables in HBase, students and clicks, that you can query with Drill. As an HBase user, you most likely are running Drill in  distributed mode. In this case, [Warden](http://doc.mapr.com/display/MapR/Apache+Drill+Installation+Overview) starts Drill as a service. If you are not an HBase user and just kicking the tires, you might use the Drill Sandbox on a single-node cluster (embedded mode). In this case, you need to [start Drill]({{ site.baseurl }}/docs/install-drill/) before performing step 5 of this exercise. On the Drill Sandbox, HBase tables you create will be located in: /mapr/demo.mapr.com/tables
 
 You use the CONVERT_TO and CONVERT_FROM functions to convert binary text to readable output. You use the CAST function to convert the binary INT to readable output in step 4 of [Query HBase Tables]({{site.baseurl}}/docs/querying-hbase/#query-hbase-tables). When converting an INT or BIGINT number, having a byte count in the destination/source that does not match the byte count of the number in the VARBINARY source/destination, use CAST.
 

http://git-wip-us.apache.org/repos/asf/drill/blob/56894cd0/_docs/query-data/050-querying-hive.md
----------------------------------------------------------------------
diff --git a/_docs/query-data/050-querying-hive.md b/_docs/query-data/050-querying-hive.md
index 1099f31..080492f 100644
--- a/_docs/query-data/050-querying-hive.md
+++ b/_docs/query-data/050-querying-hive.md
@@ -18,7 +18,7 @@ To create a Hive table and query it with Drill, complete the following steps:
 
         hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;`
   4. Issue `quit` or `exit` to leave the Hive shell.
-  5. Start Drill. Refer to [/docs/starting-stopping-drill) for instructions.
+  5. Start Drill. Refer to [/docs/install-drill) for instructions.
   6. Issue the following query to Drill to get the first and last names of the first ten customers in the Hive table:  
 
         0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM hiveremote.`customers` limit 10;`

http://git-wip-us.apache.org/repos/asf/drill/blob/56894cd0/_docs/tutorials/020-drill-in-10-minutes.md
----------------------------------------------------------------------
diff --git a/_docs/tutorials/020-drill-in-10-minutes.md b/_docs/tutorials/020-drill-in-10-minutes.md
index fe22d5f..01f8bca 100755
--- a/_docs/tutorials/020-drill-in-10-minutes.md
+++ b/_docs/tutorials/020-drill-in-10-minutes.md
@@ -9,59 +9,6 @@ Use Apache Drill to query sample data in 10 minutes. For simplicity, you’ll
 run Drill in _embedded_ mode rather than _distributed_ mode to try out Drill
 without having to perform any setup tasks.
 
-## A Few Bits About Apache Drill
-
-Drill is a clustered, powerful MPP (Massively Parallel Processing) query
-engine for Hadoop that can process petabytes of data, fast. Drill is useful
-for short, interactive ad-hoc queries on large-scale data sets. Drill is
-capable of querying nested data in formats like JSON and Parquet and
-performing dynamic schema discovery. Drill does not require a centralized
-metadata repository.
-
-### **_Dynamic schema discovery_**
-
-Drill does not require schema or type specification for data in order to start
-the query execution process. Drill starts data processing in record-batches
-and discovers the schema during processing. Self-describing data formats such
-as Parquet, JSON, AVRO, and NoSQL databases have schema specified as part of
-the data itself, which Drill leverages dynamically at query time. Because
-schema can change over the course of a Drill query, all Drill operators are
-designed to reconfigure themselves when schemas change.
-
-### **_Flexible data model_**
-
-Drill allows access to nested data attributes, just like SQL columns, and
-provides intuitive extensions to easily operate on them. From an architectural
-point of view, Drill provides a flexible hierarchical columnar data model that
-can represent complex, highly dynamic and evolving data models. Drill allows
-for efficient processing of these models without the need to flatten or
-materialize them at design time or at execution time. Relational data in Drill
-is treated as a special or simplified case of complex/multi-structured data.
-
-### **_De-centralized metadata_**
-
-Drill does not have a centralized metadata requirement. You do not need to
-create and manage tables and views in a metadata repository, or rely on a
-database administrator group for such a function. Drill metadata is derived
-from the storage plugins that correspond to data sources. Storage plugins
-provide a spectrum of metadata ranging from full metadata (Hive), partial
-metadata (HBase), or no central metadata (files). De-centralized metadata
-means that Drill is NOT tied to a single Hive repository. You can query
-multiple Hive repositories at once and then combine the data with information
-from HBase tables or with a file in a distributed file system. You can also
-use SQL DDL syntax to create metadata within Drill, which gets organized just
-like a traditional database. Drill metadata is accessible through the ANSI
-standard INFORMATION_SCHEMA database.
-
-### **_Extensibility_**
-
-Drill provides an extensible architecture at all layers, including the storage
-plugin, query, query optimization/execution, and client API layers. You can
-customize any layer for the specific needs of an organization or you can
-extend the layer to a broader array of use cases. Drill provides a built in
-classpath scanning and plugin concept to add additional storage plugins,
-functions, and operators with minimal configuration.
-
 ## Installation Overview
 
 You can install Drill in embedded mode on a machine running Linux, Mac OS X, or Windows. For information about running Drill in distributed mode, see  [Deploying Drill in a Cluster]({{ site.baseurl }}/docs/deploying-drill-in-a-cluster).

http://git-wip-us.apache.org/repos/asf/drill/blob/56894cd0/_docs/tutorials/learn-drill-with-the-mapr-sandbox/020-getting-to-know-the-drill-sandbox.md
----------------------------------------------------------------------
diff --git a/_docs/tutorials/learn-drill-with-the-mapr-sandbox/020-getting-to-know-the-drill-sandbox.md b/_docs/tutorials/learn-drill-with-the-mapr-sandbox/020-getting-to-know-the-drill-sandbox.md
index a9118d7..ec9c53b 100644
--- a/_docs/tutorials/learn-drill-with-the-mapr-sandbox/020-getting-to-know-the-drill-sandbox.md
+++ b/_docs/tutorials/learn-drill-with-the-mapr-sandbox/020-getting-to-know-the-drill-sandbox.md
@@ -20,7 +20,7 @@ Drill includes SQLLine, a JDBC utility for connecting to relational databases an
 
 In distributed mode, [Warden](http://doc.mapr.com/display/MapR/Apache+Drill+Installation+Overview) attempts to start Drill automatically when Drill is defined as service.
 
-[Starting SQLLine outside the sandbox]({{ site.baseurl }}/docs/starting-stopping-drill) for use with Drill requires entering more options than are shown here. When you type sqlline on the Sandbox command line, a script runs that includes startup options shown in the section, ["Starting/Stopping Drill"](http://apache.github.io/drill/docs/starting-stopping-drill/#invoking-sqlline/connecting-to-a-schema).
+[Starting SQLLine outside the sandbox]({{ site.baseurl }}/docs/install-drill) for use with Drill requires entering a few options, covered in the section, ["Install Drill"](docs/install-drill/).

 
 In this tutorial you query a number of data sets, including Hive and HBase, and files on the file system, such as CSV, JSON, and Parquet files. To access these diverse data sources, you connect Drill to storage plugins.
 

