atlas-dev mailing list archives

From "Suma Shivaprasad (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (ATLAS-122) Support for Deletion of Entities
Date Tue, 25 Aug 2015 14:42:46 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14711368#comment-14711368
] 

Suma Shivaprasad edited comment on ATLAS-122 at 8/25/15 2:42 PM:
-----------------------------------------------------------------

Deletion of entities raises some interesting scenarios:

1. If a hive_database is requested to be deleted, should we support deletion when there are
still tables in the model referring to it? Or should we require the user to delete the tables
first and then the database? To generalize: if an entity has incoming edges, we should throw
an error saying that other entities depend on it and hence it cannot be deleted. If we don't
throw an error, it leads to questions like: should we delete the database recursively along
with the tables that refer to it? To what level/depth of nesting should we go? What if other
entities, e.g. a hive_process, refer to those tables; should we delete that process as well?
We might lose history/version info if we delete it.

2. If an entity has outgoing edges, e.g. a hive_table has outgoing edges to a list of columns,
can we generalize that these referred entities will also be deleted if they have no incoming
edges other than from the entity being deleted? However, this fails when there are outgoing
lineage relationship edges that point to other tables. For example, a hive_process has outgoing
edges to its input and output tables. So when a delete is requested for a hive_process/query,
deleting the tables it refers to doesn't make much sense, even if no other processes reference
those tables.


[~svenkat] Thoughts?







> Support for Deletion of Entities
> --------------------------------
>
>                 Key: ATLAS-122
>                 URL: https://issues.apache.org/jira/browse/ATLAS-122
>             Project: Atlas
>          Issue Type: New Feature
>            Reporter: Suma Shivaprasad
>            Assignee: Suma Shivaprasad
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
