drill-issues mailing list archives

From "Sudheesh Katkam (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4892) Swift Documentation
Date Fri, 16 Sep 2016 18:32:20 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15497034#comment-15497034 ]

Sudheesh Katkam commented on DRILL-4892:
----------------------------------------

From email:

{quote}
AFAIK, there is no documentation, and I am not sure anyone has tried it before. That said, the
Hadoop OpenStack module described in [1] enables Apache Hadoop applications, including MapReduce
jobs, to read and write data to and from instances of the OpenStack Swift object store. Drill uses
the HDFS client library, so using Swift through Drill should be possible.

My guess: create a storage plugin named “swift” and copy the contents from the “dfs” plugin.
I am not sure exactly what the contents of “swift” should be; see [1] and [2]. The
parameters and values listed in the “Configuring” section of [1] should be provided
through the “config” map in the storage plugin (or maybe through conf/core-site.xml in
the Drill installation directory).

Something like:
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "workspaces": {
    ...
  },
  "formats": {
    ...
  },
  "config": {
    ...
  }
}

A roundabout way would be to access Swift through its S3-compatible API middleware [3]. Again,
I do not know the exact configuration details.

Once you get things to work, you can also add a section to the Drill docs based on your experience!

Thank you,
Sudheesh

[1] https://hadoop.apache.org/docs/stable2/hadoop-openstack/index.html
[2] http://drill.apache.org/docs/s3-storage-plugin/
[3] https://github.com/openstack/swift3
{quote}
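
As a rough, untested illustration of the “config” map suggested in the quoted email, the fragment
below uses property names from the “Configuring” section of [1]. The service name “privatecloud”
is taken from the example connection URL (swift://container.service/path), the values in angle
brackets are placeholders to be filled in, and the “workspaces” and “formats” sections copied from
“dfs” are left out. It also assumes the hadoop-openstack JAR has been added to Drill's classpath,
which has not been verified here.

```json
{
  "type": "file",
  "enabled": true,
  "connection": "swift://dmitry.privatecloud/out/results",
  "config": {
    "fs.swift.impl": "org.apache.hadoop.fs.swift.snative.SwiftNativeFileSystem",
    "fs.swift.service.privatecloud.auth.url": "<AUTH_URL>",
    "fs.swift.service.privatecloud.tenant": "<TENANT>",
    "fs.swift.service.privatecloud.username": "<USERNAME>",
    "fs.swift.service.privatecloud.password": "<PASSWORD>",
    "fs.swift.service.privatecloud.public": "true"
  }
}
```

The same properties could presumably be set in conf/core-site.xml instead, as the email notes;
which of the two Drill actually honors for Swift would need to be confirmed by testing.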

> Swift Documentation
> -------------------
>
>                 Key: DRILL-4892
>                 URL: https://issues.apache.org/jira/browse/DRILL-4892
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Documentation
>    Affects Versions: 1.6.0, 1.8.0
>            Reporter: Matt Keranen
>
> The Drill FAQ (https://drill.apache.org/faq/) suggests Swift is a data source:
> "Cloud storage: Amazon S3, Google Cloud Storage, Azure Blog Storage, Swift"
> However, there appears to be no documentation.
> Swift-specific docs would be very useful. We have a large Swift installation and using
> Drill over files in it would be a valuable feature.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
