Mailing-List: contact dev-help@atlas.incubator.apache.org; run by ezmlm
Date: Tue, 9 May 2017 08:13:04 +0000 (UTC)
From: "David Radley (JIRA)"
To: dev@atlas.incubator.apache.org
Reply-To: dev@atlas.incubator.apache.org
Subject: [jira] [Commented] (ATLAS-1690) Introduce top level relationships

    [ https://issues.apache.org/jira/browse/ATLAS-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16002261#comment-16002261 ]

David Radley commented on ATLAS-1690:
-------------------------------------

Hi [~madhan.neethiraj]

Great feedback - thanks for your thoughtful and open response. I will change the array to endpoint1 and endpoint2. I think that is clearer.

I am keen that we propagate tags; this is very powerful. I thought I would explain how we could do classifications and then see how this option fits in.

The classifications our customers are working with include confidentiality. The confidentiality scheme might have C1, C2, C3 and C4 levels. C1 might be public and C3 top secret. Different companies name these Cn levels differently, but in all these cases there is an order, C4 being the highest classification level. Though it is possible to have more complex classification schemes, many, if not most, cases can work with this list style of classification scheme. A particular glossary term or column might be associated with one of these classifications.
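
As a rough illustration of the ordered scheme described above, the levels can be modelled as an ordered enumeration. This is only a sketch: the names `Confidentiality` and `strictest` are hypothetical and not part of Atlas, and picking the highest level when several apply is an assumption made for the example.

```python
from enum import IntEnum


class Confidentiality(IntEnum):
    """Hypothetical ordered confidentiality levels; C4 is the highest."""
    C1 = 1
    C2 = 2
    C3 = 3
    C4 = 4


def strictest(levels):
    """Return the highest level among those given (an assumed policy)."""
    return max(levels)


# The ordering is what matters: any two levels can be compared.
print(Confidentiality.C4 > Confidentiality.C1)
print(strictest([Confidentiality.C1, Confidentiality.C3]).name)
```

A company that names its levels differently would only change the member names; the ordering is what the rest of the discussion relies on.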

We were thinking that the classification levels (C1, C2 etc.) would be new system types (I suggest an EntityDef). A classification use is the relationship between the classification level and the thing it is classifying. By default the classification will be the one from the level, but a rule can run and increase the classification; this would be calculated (and stored?) in the relationship instance.

So to address your proposal:
- I think the propagated classifications would be derived at query time and could be useful - do we need an effective classification? I am not convinced by the proposed mechanism.
- Your example around tables and columns and PII assumes that PII is a binary flag (or one tag). I am suggesting that this is not the way that classifications are normally implemented - these should be an ordered list of levels. I see in some of your recent demos you use v1 terms to implement these classification levels for this classification ordering.

If a table is classified as public and has a PII column, we would not want the public classification to override the column's PII classification. As a query brings together 2 public fields, like name and salary, the combination becomes PII; in this case we need the rule to drive this.

This implementation would encourage bidirectional relationships to be implemented purely to propagate tags. I suggest many propagations would not be on one relationship, but could flow much further, for example to all has-a terms, following all the has-a links. I am also concerned that the role who authors the relationship is not the right role to make classification propagation decisions.

I wonder whether a smarter approach would be to tag the relationship as "propagate-1-to-2" (hopefully something more meaningful like "propagate-table-to-column") and Ranger picks up this hint. Ranger could decide to run a simple rule of propagating all the tags from 1 to 2, or a more complex rule taking other conditions into account.
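
The "simple rule" of propagating from endpoint 1 to endpoint 2 over hinted relationships could look roughly like the sketch below. Everything here is hypothetical: `Level`, `Relationship` and `propagate` are illustrative names, not Atlas or Ranger APIs, and the policy that a propagated level never lowers an existing one is an assumption drawn from the public-table and PII-column example.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical ordered classification levels; C4 is the highest."""
    C1 = 1
    C2 = 2
    C3 = 3
    C4 = 4


@dataclass
class Relationship:
    """A relationship instance with two ends and optional hint tags."""
    endpoint1: str
    endpoint2: str
    hints: set = field(default_factory=set)


def propagate(levels, relationships, hint="propagate-1-to-2"):
    """Push levels from endpoint1 to endpoint2 along hinted relationships.

    Repeats until nothing changes, so levels flow across several links.
    An existing level is only ever raised, never lowered.
    """
    result = dict(levels)
    changed = True
    while changed:
        changed = False
        for rel in relationships:
            if hint not in rel.hints or rel.endpoint1 not in result:
                continue
            source = result[rel.endpoint1]
            current = result.get(rel.endpoint2)
            if current is None or source > current:
                result[rel.endpoint2] = source
                changed = True
    return result


# A public (C1) table with one column already classified C3.
levels = {"employees": Level.C1, "employees.salary": Level.C3}
relationships = [
    Relationship("employees", "employees.salary", {"propagate-1-to-2"}),
    Relationship("employees", "employees.name", {"propagate-1-to-2"}),
    Relationship("employees", "audit_log"),  # no hint, so no propagation
]
result = propagate(levels, relationships)
```

In this example the unclassified column inherits C1 from the table, the C3 column keeps its higher level, and the unhinted relationship is ignored. A more complex rule, such as raising the level when name and salary are combined, would be a separate step layered on top.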

I suggest that we explicitly implement these classification levels and uses. I hope there is a simple case where some classifications should be propagated for all consumer cases, rules can run to override the classifications, and we can find a way of doing this using a governance role. Maybe there could be some supplied Ranger rules and tags that Atlas uses out of the box. GDPR rules and tags would be a good use case here.

> Introduce top level relationships
> ---------------------------------
>
>                 Key: ATLAS-1690
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1690
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: David Radley
>            Assignee: David Radley
>              Labels: VirtualDataConnector
>         Attachments: Atlas_RelationDef_Json_Structure_v1.pdf, Atlas Relationships proposal v1.0.pdf, Atlas Relationships proposal v1.1.pdf, Atlas Relationships proposal v1.2.pdf, Atlas Relationships proposal v1.3.pdf, Atlas Relationships proposal v1.4.pdf, Atlas Relationships proposal v1.5.pdf, Atlas Relationships proposal v1.6.pdf, Atlas Relationships proposal v1.7.pdf
>
>
> Introduce top level relationships including support for
> - many to many relationships
> - relationship names including the name for both ends and the relationship.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)