atlas-dev mailing list archives

From "Suma Shivaprasad (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ATLAS-182) Add data model for Storm topology elements
Date Mon, 14 Dec 2015 12:14:46 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15055902#comment-15055902 ]

Suma Shivaprasad commented on ATLAS-182:
----------------------------------------

Initial review comments

1. pom.xml - These dependencies can be removed from the storm hook pom since the parent pom
already adds them:

+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>
+        </dependency>
+
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-log4j12</artifactId>
+        </dependency>


2. pom.xml - The httpConnector port needs to be changed to 31000 and the stop port to 31001.
Please refer to https://github.com/apache/incubator-atlas/blob/master/webapp/pom.xml

3. What is the use of the "endTime" attribute in "Topology"? Should we remove "endTime"? I didn't
see it being used anywhere.

4. Is the Topology "id" used to indicate a run id or an instance id? I didn't understand why we
need to capture lineage between two "DataSet"s across different runs of the same Topology. We
could just capture it at the Topology level and leave out the "instance" id.

5. Is there anything in the Topology conf that is of interest or searchable? "conf" ~ (map(string,
string), optional) could be huge for a Storm topology.

6. The "name" attribute could be removed from the Kafka, HBase, HDFS and JMS data sets since it
is already part of "DataSet".

7. We need to document that the Hive Data Model must be created before the Storm Data Model.

8. Can JMS_TOPIC be removed, since it is not used in the Hook?

9. HBASE_TABLE and HDFS_DATA_SET could be renamed to STORM_SINK_HBASE_TABLE and STORM_SINK_HDFS_PATH,
since we are planning a generic model for these anyway and the current names (and maybe the
model too) would conflict with it. This will also need a migration story once we have the
generic models.

10. We should also add "clusterName" to the Kafka topic, HDFS path and HBase table types.
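
For reference on comment 2, a rough sketch of what the Jetty plugin configuration might look
like with those ports. This is not copied from the patch or from webapp/pom.xml: it assumes the
org.eclipse.jetty jetty-maven-plugin, a ${jetty.version} property defined elsewhere, and a
made-up stopKey value. It also assumes the stop port is meant to be 31001, since a port number
cannot exceed 65535.

```xml
<!-- Sketch only: the version property and the stopKey value are assumptions -->
<plugin>
    <groupId>org.eclipse.jetty</groupId>
    <artifactId>jetty-maven-plugin</artifactId>
    <version>${jetty.version}</version>
    <configuration>
        <!-- HTTP port the embedded Jetty listens on -->
        <httpConnector>
            <port>31000</port>
        </httpConnector>
        <!-- Port and key Jetty listens on for the stop command -->
        <stopPort>31001</stopPort>
        <stopKey>atlas-stop</stopKey>
    </configuration>
</plugin>
```

Keeping these values in line with webapp/pom.xml should avoid the two modules using different
ports for the same kind of embedded server.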


> Add data model for Storm topology elements
> ------------------------------------------
>
>                 Key: ATLAS-182
>                 URL: https://issues.apache.org/jira/browse/ATLAS-182
>             Project: Atlas
>          Issue Type: Sub-task
>    Affects Versions: 0.6-incubating
>            Reporter: Venkatesh Seetharam
>            Assignee: Venkatesh Seetharam
>             Fix For: 0.6-incubating
>
>         Attachments: ATLAS-182-v1.patch, ATLAS-182.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
