atlas-dev mailing list archives

From "David Radley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ATLAS-1690) Introduce top level relationships
Date Tue, 09 May 2017 08:13:04 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002261#comment-16002261
] 

David Radley commented on ATLAS-1690:
-------------------------------------

Hi [~madhan.neethiraj] 
Great feedback - thanks for your thoughtful and open response.
I will change the array to endpoint1 and endpoint2. I think that is clearer.

I am keen that we propagate tags; this is very powerful.
I thought I would explain how we could do classifications and then see how this option fits
in.
The classifications our customers are working with include confidentiality. The confidentiality
scheme might have C1, C2, C3 and C4 levels. C1 might be public and C3 top secret. Different
companies name these Cn levels differently, but in all these cases there is an order, C4 being
the highest classification level. Though more complex classification schemes are possible,
most cases can work with this ordered-list sort of classification scheme. A particular
glossary term or column might be associated with one of these classifications.
We were thinking that the classification levels (C1, C2 etc.) would be new system types (I
suggest an EntityDef). A classification use is the relationship between the classification
level and the thing it is classifying. By default the classification will be that from the
level, but a rule can run and increase the classification; this would be calculated (and
stored?) in the relationship instance.
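
To make the idea concrete, here is a minimal sketch in Python (not Atlas code) of ordered
levels and a "classification use" whose level a rule can raise but never lower. The names
Confidentiality, ClassificationUse, raise_to and effective are hypothetical, chosen only for
illustration; whether the raised level is stored or recalculated is still an open question.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Confidentiality(IntEnum):
    """Ordered levels; a higher value means a higher classification."""
    C1 = 1
    C2 = 2
    C3 = 3
    C4 = 4


@dataclass
class ClassificationUse:
    """Hypothetical relationship instance between a level and the thing it classifies."""
    classified_name: str                         # e.g. a glossary term or column
    level: Confidentiality                       # default, taken from the level itself
    raised_to: Optional[Confidentiality] = None  # set only when a rule raises the level

    def effective(self) -> Confidentiality:
        # The raised level wins when present, otherwise the default applies.
        return self.raised_to if self.raised_to is not None else self.level

    def raise_to(self, new_level: Confidentiality) -> None:
        # A rule may increase the classification but must not decrease it.
        if new_level > self.effective():
            self.raised_to = new_level


use = ClassificationUse("salary", Confidentiality.C1)
use.raise_to(Confidentiality.C3)   # a rule raises the classification
use.raise_to(Confidentiality.C2)   # ignored: lower than the current effective level
```

Keeping the default level and the raised level as separate fields means the original
assignment is not lost when a rule fires.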

So to address your proposal:
- I think the propagated classifications would be derived at query time and could be useful.
Do we need an effective classification? I am not convinced by the proposed mechanism.
- Your example around tables and columns and PII assumes that PII is a binary flag (or one
tag). I am suggesting that this is not the way that classifications are normally implemented;
these should be an ordered list of levels. I see in some of your recent demos you use v1
terms to implement these classification levels and their ordering. If a table
is classified as public and has a PII column, we would not want the public classification
to override the PII column. As a query brings together 2 public fields, like name and salary,
the combination becomes PII; in this case we need the rule to drive this.
This implementation would encourage bidirectional relationships to be implemented
purely to propagate tags. I suggest many propagations would not stop at one relationship, but
could flow much further, for example to all has-a terms by following all the has-a links.
I am also concerned that the role that authors the relationship is not the right role to make
classification propagation decisions.
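
The table/column and name-plus-salary cases above can be sketched as follows. This is an
illustration only, assuming a simplified two-level ordering (public below PII) and a
hypothetical rule table COMBINATION_RULES; none of these names come from Atlas or Ranger.

```python
from typing import Dict, FrozenSet, Iterable

# Assumed ordering for illustration: later entries are higher classifications.
LEVELS = ["public", "PII"]
RANK = {name: position for position, name in enumerate(LEVELS)}

# Hypothetical rule table: fields that, taken together, need at least this level.
COMBINATION_RULES: Dict[FrozenSet[str], str] = {
    frozenset({"name", "salary"}): "PII",
}


def highest(levels: Iterable[str]) -> str:
    """Return the highest level in the ordering (lowest level if none are given)."""
    return max(levels, key=RANK.__getitem__, default=LEVELS[0])


def column_level(table_level: str, column_level_: str) -> str:
    """A public table must not override a PII column: the higher level wins."""
    return highest([table_level, column_level_])


def classify_query(field_levels: Dict[str, str]) -> str:
    """Classify a query result from its fields, then let combination rules raise it."""
    result = highest(field_levels.values())
    selected = set(field_levels)
    for fields, level in COMBINATION_RULES.items():
        if fields <= selected:          # every field named in the rule is present
            result = highest([result, level])
    return result
```

With these definitions, classify_query({"name": "public", "salary": "public"}) is raised to
"PII" by the rule, while either field alone stays "public".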

I wonder whether a smarter approach would be to tag the relationship as "propagate-1-to-2"
(hopefully something more meaningful like "propagate-table-to-column") and Ranger picks up
this hint. Ranger could decide to run a simple rule of propagating all the tags from 1 to
2 or a more complex rule taking other conditions into account.
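
As a rough sketch of the simple rule, a consumer of the hint could copy tags from end 1 to
end 2 and keep following hinted relationships for as long as new tags arrive. The
Relationship class, the hint string handling and the propagate function below are
assumptions made for illustration; this is not Ranger's or Atlas's API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set

PROPAGATE_HINT = "propagate-1-to-2"


@dataclass(frozen=True)
class Relationship:
    """Hypothetical relationship with two ends and an optional propagation hint."""
    end1: str
    end2: str
    hint: str = ""


def propagate(tags: Dict[str, Set[str]],
              relationships: Iterable[Relationship]) -> Dict[str, Set[str]]:
    """Copy tags from end1 to end2 along hinted relationships, transitively."""
    outgoing: Dict[str, List[str]] = {}
    for rel in relationships:
        if rel.hint == PROPAGATE_HINT:   # relationships without the hint are skipped
            outgoing.setdefault(rel.end1, []).append(rel.end2)

    result = {entity: set(entity_tags) for entity, entity_tags in tags.items()}
    queue = deque(result)                # start from every entity that has tags
    while queue:
        source = queue.popleft()
        for target in outgoing.get(source, []):
            target_tags = result.setdefault(target, set())
            missing = result[source] - target_tags
            if missing:                  # re-queue only on change, so cycles terminate
                target_tags |= missing
                queue.append(target)
    return result


relationships = [
    Relationship("customer_table", "name_column", PROPAGATE_HINT),
    Relationship("name_column", "name_term", PROPAGATE_HINT),
    Relationship("customer_table", "audit_log"),   # no hint, so nothing propagates
]
propagated = propagate({"customer_table": {"PII"}}, relationships)
```

A more complex rule would replace the unconditional copy inside the loop with a check on
other conditions before adding the tags.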

I suggest that we explicitly implement these classification levels and uses. I hope there
is a simple case where some classifications should be propagated for all consumer
cases, rules can run to override the classifications, and we can find a way of doing this
using a governance role. Maybe Atlas could use some supplied Ranger rules and
tags out of the box. GDPR rules and tags would be a good use case here.



  


> Introduce top level relationships
> ---------------------------------
>
>                 Key: ATLAS-1690
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1690
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>              Labels: VirtualDataConnector
>         Attachments: Atlas_RelationDef_Json_Structure_v1.pdf, Atlas Relationships proposal
v1.0.pdf, Atlas Relationships proposal v1.1.pdf, Atlas Relationships proposal v1.2.pdf, Atlas
Relationships proposal v1.3.pdf, Atlas Relationships proposal v1.4.pdf, Atlas Relationships
proposal v1.5.pdf, Atlas Relationships proposal v1.6.pdf, Atlas Relationships proposal v1.7.pdf
>
>
> Introduce top level relationships including support for 
> - many to many relationships
> - relationship names including the name for both ends and the relationship.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
