drill-commits mailing list archives

From tshi...@apache.org
Subject [26/30] drill git commit: Create 110-s3-storage-plugin.md
Date Mon, 23 Nov 2015 21:54:09 GMT
Create 110-s3-storage-plugin.md


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/42505c9d
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/42505c9d
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/42505c9d

Branch: refs/heads/gh-pages
Commit: 42505c9d785ab4fdddcc5f6f4768777793a57c7c
Parents: df1072b
Author: Abhi <abhipol@users.noreply.github.com>
Authored: Sun Nov 22 17:19:05 2015 -0800
Committer: Tomer Shiran <tshiran@gmail.com>
Committed: Mon Nov 23 10:11:53 2015 -0800

----------------------------------------------------------------------
 .../plugins/110-s3-storage-plugin.md            | 84 ++++++++++++++++++++
 1 file changed, 84 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/42505c9d/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md
----------------------------------------------------------------------
diff --git a/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md b/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md
new file mode 100644
index 0000000..022a6dc
--- /dev/null
+++ b/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md
@@ -0,0 +1,84 @@
+---
+title: "S3 Storage Plugin"
+parent: "Connect a Data Source"
+---
+Drill works with data stored in the cloud. With a few simple steps, you can configure the
S3 storage plugin for Drill and be off to the races running queries.
+
+## Connecting Drill to S3
+
+Starting with version 1.3.0, Drill can query files stored on Amazon's S3 cloud storage using the S3a library. This matters because S3a adds support for files larger than 5 gigabytes, which Drill's previous S3n interface did not support.
+
+There are two simple steps to follow: (1) provide your AWS credentials and (2) configure the S3 storage plugin with your S3 bucket.
+
+#### (1) AWS credentials
+
+To enable Drill's S3a support, edit the file conf/core-site.xml in your Drill install directory, replacing the text ENTER_YOUR_ACCESSKEY and ENTER_YOUR_SECRETKEY with your AWS credentials.
+
+```
+<configuration>
+
+  <property>
+    <name>fs.s3a.access.key</name>
+    <value>ENTER_YOUR_ACCESSKEY</value>
+  </property>
+
+  <property>
+    <name>fs.s3a.secret.key</name>
+    <value>ENTER_YOUR_SECRETKEY</value>
+  </property>
+
+</configuration>
+```
+
+#### (2) Configure S3 Storage Plugin
+
+If you already have an S3 storage plugin configured, enable it. Otherwise, add a new plugin by following these steps:
+
+1. Point your browser to http://<host>:8047 and select the 'Storage' tab. (Note: on a single-machine system, you need to run drill-embedded before you can access the web console.)
+2. Duplicate the 'dfs' plugin. To do this, click 'Update' next to 'dfs', and then copy the JSON text that appears.
+3. Create a new storage plugin, and paste in the 'dfs' text.
+4. Replace file:/// with s3a://your.bucketname.
+5. Name your new plugin, for example s3-\<bucketname\>.
+
+You should now be able to talk to data stored on S3 using the S3a library.
+
+## S3 Example
+
+```
+{
+  "type": "file",
+  "enabled": true,
+  "connection": "s3a://apache.drill.cloud.bigdata/",
+  "workspaces": {
+    "root": {
+      "location": "/",
+      "writable": false,
+      "defaultInputFormat": null
+    },
+    "tmp": {
+      "location": "/tmp",
+      "writable": true,
+      "defaultInputFormat": null
+    }
+  },
+  "formats": {
+    "psv": {
+      "type": "text",
+      "extensions": [
+        "tbl"
+      ],
+      "delimiter": "|"
+    },
+    "csv": {
+      "type": "text",
+      "extensions": [
+        "csv"
+      ],
+      "delimiter": ","
+    },
+    ....
+    
+  }
+}
+```
+
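The credentials step in the page added above amounts to writing two properties into conf/core-site.xml. As a minimal sketch, assuming you would rather generate that file than edit it by hand, the Python below builds the same `<configuration>` structure. The function name `build_core_site` is hypothetical and not part of Drill; only the two property names come from the page itself.

```python
import xml.etree.ElementTree as ET


def build_core_site(access_key: str, secret_key: str) -> str:
    """Return core-site.xml text holding the two S3a credential properties.

    Hypothetical helper for illustration; not part of Drill.
    """
    root = ET.Element("configuration")
    for name, value in (
        ("fs.s3a.access.key", access_key),
        ("fs.s3a.secret.key", secret_key),
    ):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, the same ones used in the documented file.
    print(build_core_site("ENTER_YOUR_ACCESSKEY", "ENTER_YOUR_SECRETKEY"))
```

The output is unindented XML; writing it to conf/core-site.xml in the Drill install directory is left to the caller, since the install path is not given here.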


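Steps 2 through 4 of the plugin setup in the page above (copy the 'dfs' JSON, then swap file:/// for an s3a:// bucket URL) can be sketched as a small transformation. This is an illustration under assumptions: `make_s3_plugin` is a hypothetical helper, the `dfs` dictionary is a trimmed stand-in for the JSON shown in the web console, and the trailing slash follows the "S3 Example" in the page.

```python
import copy
import json


def make_s3_plugin(dfs_config: dict, bucket: str) -> dict:
    """Copy a 'dfs' plugin config and point its connection at an S3 bucket.

    Hypothetical helper for illustration; not part of Drill.
    """
    s3_config = copy.deepcopy(dfs_config)  # leave the original 'dfs' dict untouched
    s3_config["connection"] = f"s3a://{bucket}/"
    s3_config["enabled"] = True
    return s3_config


if __name__ == "__main__":
    # Trimmed-down stand-in for the JSON copied from the 'dfs' plugin page.
    dfs = {
        "type": "file",
        "enabled": True,
        "connection": "file:///",
        "workspaces": {
            "root": {"location": "/", "writable": False, "defaultInputFormat": None}
        },
        "formats": {"csv": {"type": "text", "extensions": ["csv"], "delimiter": ","}},
    }
    print(json.dumps(make_s3_plugin(dfs, "your.bucketname"), indent=2))
```

The printed JSON is what you would paste into the new storage plugin form in step 3; workspaces and formats carry over from 'dfs' unchanged.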