asterixdb-notifications mailing list archives

From "abdullah alamoudi (Code Review)" <do-not-re...@asterixdb.incubator.apache.org>
Subject Change in asterixdb[master]: Add List of Supported Adapters to Doc
Date Thu, 14 Apr 2016 09:09:13 GMT
abdullah alamoudi has uploaded a new change for review.

  https://asterix-gerrit.ics.uci.edu/802

Change subject: Add List of Supported Adapters to Doc
......................................................................

Add List of Supported Adapters to Doc

Change-Id: I2bb98477e144e78e9983d33f9dd2f89a547aeccf
---
M asterixdb/asterix-doc/src/site/markdown/aql/externaldata.md
M asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/DatasourceFactoryProvider.java
2 files changed, 59 insertions(+), 1 deletion(-)


  git pull ssh://asterix-gerrit.ics.uci.edu:29418/asterixdb refs/changes/02/802/1

diff --git a/asterixdb/asterix-doc/src/site/markdown/aql/externaldata.md b/asterixdb/asterix-doc/src/site/markdown/aql/externaldata.md
index d5281cb..e919bd0 100644
--- a/asterixdb/asterix-doc/src/site/markdown/aql/externaldata.md
+++ b/asterixdb/asterix-doc/src/site/markdown/aql/externaldata.md
@@ -23,6 +23,7 @@
 
 * [Introduction](#Introduction)
 * [Adapter for an External Dataset](#IntroductionAdapterForAnExternalDataset)
+* [Builtin Adapters](#BuiltinAdapters)
 * [Creating an External Dataset](#IntroductionCreatingAnExternalDataset)
 * [Writing Queries against an External Dataset](#WritingQueriesAgainstAnExternalDataset)
 * [Building Indexes over External Datasets](#BuildingIndexesOverExternalDatasets)
@@ -35,8 +36,62 @@
 ### <a id="IntroductionAdapterForAnExternalDataset">Adapter for an External Dataset</a> <font size="4"><a href="#toc">[Back to TOC]</a></font> ###
 External data is accessed using wrappers (adapters in AsterixDB) that abstract away the mechanism of connecting with an external service, receiving its data and transforming the data into ADM records that are understood by AsterixDB. AsterixDB comes with built-in adapters for common storage systems such as HDFS or the local file system.
 
-### <a id="IntroductionCreatingAnExternalDataset">Creating an External Dataset</a> <font size="4"><a href="#toc">[Back to TOC]</a></font> ###
+### <a id="BuiltinAdapters">Builtin Adapters</a> <font size="4"><a href="#toc">[Back to TOC]</a></font> ###
+AsterixDB offers a set of builtin adapters that can be used to query external data or to load data into an internal dataset using a load statement or a data feed. Each adapter requires the format of the data to be specified so that records can be parsed correctly. When an adapter is used with a feed, the parameter output-type must also be specified.
 
+The following is a list of the existing built-in adapters and their configuration parameters:
+<ol>
+  <li>localfs: used for reading data stored in a local filesystem in one or more of the node controllers
+    <ul>
+      <li>path: A fully qualified path of the form host://&lt;absolute_path&gt;. Comma separated list if there are multiple directories or files</li>
+      <li>expression: A regular expression to match and filter against file names</li>
+    </ul>
+  </li>
+  <li>hdfs: used for reading data stored in an HDFS instance.
+    <ul>
+      <li>path: A fully qualified path of the form host://&lt;absolute_path&gt;. Comma separated list if there are multiple directories or files</li>
+      <li>expression: A regular expression to match and filter against file names</li>
+      <li>input-format: A fully qualified name or an alias for a class of HDFS input format</li>
+      <li>hdfs: The HDFS name node URL</li>
+    </ul>
+  </li>
+  <li>socket: used for listening for connections that send data streams through one or more sockets.
+    <ul>
+      <li>sockets: comma separated list of sockets to listen to</li> 
+      <li>address-type: either IP if the list uses IP addresses, or NC if the list uses NC names</li>
+    </ul>
+  </li>
+  <li>socket_client: used for connecting to one or more sockets and reading data streams.
+    <ul>
+      <li>sockets: comma separated list of sockets to connect to</li>
+    </ul>
+  </li>
+  <li>twitter_push: used for establishing a connection and subscribing to a twitter feed.
+    <ul>
+      <li>consumer.key: access parameter provided by twitter OAuth</li>
+      <li>consumer.secret: access parameter provided by twitter OAuth</li>
+      <li>access.token: access parameter provided by twitter OAuth</li>
+      <li>access.token.secret: access parameter provided by twitter OAuth</li>
+    </ul>
+  </li>
+  <li>twitter_pull: used for polling a twitter feed for tweets based on a configurable frequency
+    <ul>
+      <li>consumer.key: access parameter provided by twitter OAuth</li>
+      <li>consumer.secret: access parameter provided by twitter OAuth</li>
+      <li>access.token: access parameter provided by twitter OAuth</li>
+      <li>access.token.secret: access parameter provided by twitter OAuth</li>
+      <li>query: twitter query string</li>
+      <li>interval: poll interval in seconds</li>
+    </ul>
+  </li>
+  <li>rss: used for reading RSS feeds
+    <ul>
+      <li>url: a comma separated list of RSS URLs</li>
+    </ul>
+  </li>
+</ol>
+
+### <a id="IntroductionCreatingAnExternalDataset">Creating an External Dataset</a> <font size="4"><a href="#toc">[Back to TOC]</a></font> ###
 As an example we consider the Lineitem dataset from the [TPCH schema](http://www.openlinksw.com/dataspace/doc/dav/wiki/Main/VOSTPCHLinkedData/tpch.sql).
 We assume that you have successfully created an AsterixDB instance following the instructions at [Installing AsterixDB Using Managix](../install.html). _For constructing an example, we assume a single machine setup.._
 
diff --git a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/DatasourceFactoryProvider.java b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/DatasourceFactoryProvider.java
index 0f24f91..bd50c39 100644
--- a/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/DatasourceFactoryProvider.java
+++ b/asterixdb/asterix-external-data/src/main/java/org/apache/asterix/external/provider/DatasourceFactoryProvider.java
@@ -29,6 +29,7 @@
 import org.apache.asterix.external.input.record.reader.RecordWithPKTestReaderFactory;
 import org.apache.asterix.external.input.record.reader.kv.KVReaderFactory;
 import org.apache.asterix.external.input.record.reader.kv.KVTestReaderFactory;
+import org.apache.asterix.external.input.record.reader.rss.RSSRecordReaderFactory;
 import org.apache.asterix.external.input.record.reader.stream.StreamRecordReaderFactory;
 import org.apache.asterix.external.input.record.reader.twitter.TwitterRecordReaderFactory;
 import org.apache.asterix.external.input.stream.factory.LocalFSInputStreamFactory;
@@ -108,6 +109,8 @@
                 return new StreamRecordReaderFactory(new SocketServerInputStreamFactory());
             case ExternalDataConstants.STREAM_SOCKET_CLIENT:
                 return new StreamRecordReaderFactory(new SocketClientInputStreamFactory());
+            case ExternalDataConstants.READER_RSS:
+                return new RSSRecordReaderFactory();
             default:
                 throw new AsterixException("unknown record reader factory: " + reader);
         }
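
For readers unfamiliar with the dispatch pattern in the hunk above, here is a minimal, self-contained sketch of mapping a reader name to a factory and rejecting unknown names. It is not the AsterixDB implementation: `ReaderFactorySketch` and `ReaderRegistrySketch` are hypothetical stand-ins, a map lookup is swapped in for the `switch`, and the name strings are assumed from the adapter list in the documentation change rather than taken from `ExternalDataConstants`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a record reader factory; not the AsterixDB interface.
interface ReaderFactorySketch {
    String name();
}

final class ReaderRegistrySketch {
    private final Map<String, Supplier<ReaderFactorySketch>> suppliers = new LinkedHashMap<>();

    // Associates a reader name with a supplier that builds its factory.
    void register(String reader, Supplier<ReaderFactorySketch> supplier) {
        suppliers.put(reader, supplier);
    }

    // Mirrors the default branch of the switch: unknown names are rejected.
    ReaderFactorySketch create(String reader) {
        Supplier<ReaderFactorySketch> supplier = suppliers.get(reader);
        if (supplier == null) {
            throw new IllegalArgumentException("unknown record reader factory: " + reader);
        }
        return supplier.get();
    }

    // Registry preloaded with three adapter names assumed from the documentation list.
    static ReaderRegistrySketch withDefaults() {
        ReaderRegistrySketch registry = new ReaderRegistrySketch();
        for (String reader : new String[] {"socket", "socket_client", "rss"}) {
            // Each supplier returns a factory that simply reports its own name.
            registry.register(reader, () -> () -> reader);
        }
        return registry;
    }

    public static void main(String[] args) {
        ReaderRegistrySketch registry = withDefaults();
        System.out.println(registry.create("rss").name());
        try {
            registry.create("no_such_reader");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a map, supporting a new reader is a single `register` call, whereas the `switch` in the change needs a new `case` plus an import. Either form fails fast on a name that was never wired in.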

-- 
To view, visit https://asterix-gerrit.ics.uci.edu/802
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I2bb98477e144e78e9983d33f9dd2f89a547aeccf
Gerrit-PatchSet: 1
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: abdullah alamoudi <bamousaa@gmail.com>
