Mailing-List: contact dev-help@atlas.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@atlas.apache.org
Delivered-To: mailing list dev@atlas.apache.org
MIME-Version: 1.0
From: Sarath Subramanian
Date: Mon, 18 Sep 2017 22:14:40 -0700
Subject: Re: Tag propagation and classification entityTypes
To: dev@atlas.apache.org
Cc: Madhan Neethiraj
Content-Type: multipart/alternative; boundary="001a1143d50cb7adbb055983efac"
archived-at: Tue, 19 Sep 2017 05:14:45 -0000

--001a1143d50cb7adbb055983efac
Content-Type: text/plain; charset="UTF-8"

David,

The tag propagation code currently in review does not account for inherited tags (from super types). This was intentional, because inheritance is handled by the entity notification listener. When a tag is added, deleted, or updated on an entity, we send notification messages that include the tag's super type information and a list of the entities impacted by tag propagation. The inherited tag information is computed from typeRegistry (the cache of the Atlas type system). I feel this will be more efficient than traversing the graph to get the inherited information.

Atlas relies on the typeRegistry cache to resolve all super type and sub type information, and on each type addition or deletion the cache is refreshed to contain the latest type information. The effort you mentioned in *step 4* might duplicate what we currently have in the type registry.

Regarding entity-classification restrictions on propagated tags, I think this should not be part of the propagation query. Instead, we should restrict propagation using a relationship property on the edge:

1. To allow only certain tags for propagation.
2. To allow only tags of certain parent tags.
3. Or any other tag propagation overrides/restrictions on entities.

This offers more flexibility to add constraints on tag propagation than including them in the graph query. Once we move to JanusGraph we can tweak the query using TP3 syntax.

Thanks,
Sarath Subramanian

On Fri, Sep 15, 2017 at 1:25 AM, David Radley wrote:

> Hi Madhan and Sarath,
> It occurs to me that we are introducing 2 new definitions around
> classifications that require the code to traverse around the graph.
> - classificationDefs now have entityTypes to restrict the entities that
> they can be applied to. This requires us to check entity and
> classification hierarchies to ensure that inherited entities and
> classifications abide by these restrictions.
> This is currently done in code in the AtlasClassificationType. One set of
> checks at classification add / update time and another when we try to add
> a classification to an entity.
> - tag propagation implementation is currently in review and looks to work
> out where tags should be propagated to using Gremlin TP2 queries. The
> current proposed query is neat around 10 lines long, but does not account
> for inheritance or entityType restrictions.
>
> If we carry on with the current approach, we potentially need to
> implement checking down the graph in the type code and also in the Gremlin
> query. I wonder if we can have a consistent approach so we use gremlin
> queries in both scenarios or use code in both scenarios. I see a few
> options
>
> 1) Carry on as is, code for Classification entityTypes, TP2 query for
> tag propagation. The TP2 query may become much more complex as it will
> need to recurse around the classification types in the graph and the
> entity types in the graph as well as the instance graph. The entityTypes
> gremlin logic will need to match the entityTypes checking code logic.
> 2) Move all the logic to code, this should mean we work at TP3, may give
> us more flexibility to handle tag propagation overrides we will need at a
> later date
> 3) Move all navigation logic to gremlin queries, this is appealing as the
> graph engine then can optimize the queries.
> 4) Extend 3) to store (cache) some of the inherited states in the instance
> graph so a simpler query can be made. We could also extend this approach
> to store when a user overrides the default propagation. I know we have
> concerns with duplicating metadata.
> I wonder if we could split the
> properties in the vertices so there is a defined section and a derived /
> cached section, so it is obvious which properties might need
> re-calculating.
>
> Thoughts?
> all the best, David.
>
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales with number
> 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>

--001a1143d50cb7adbb055983efac--
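
For readers of the archive, the type-registry argument in the reply can be illustrated with a minimal sketch. It is not Atlas code: the `TypeRegistry` class, its methods, `build_notification`, and the message field names are all hypothetical, chosen only to show how inherited (super type) tag information could be resolved from an in-memory cache that is refreshed on type changes, and then attached to a notification together with the impacted entities, without traversing the graph.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class TypeRegistry:
    """Hypothetical in-memory cache: classification name -> direct super types."""

    super_types: Dict[str, List[str]] = field(default_factory=dict)
    _resolved: Dict[str, List[str]] = field(default_factory=dict)

    def add_type(self, name: str, supers: Iterable[str] = ()) -> None:
        self.super_types[name] = list(supers)
        self._resolved.clear()  # refresh derived data on every type addition

    def remove_type(self, name: str) -> None:
        self.super_types.pop(name, None)
        self._resolved.clear()  # refresh derived data on every type deletion

    def all_super_types(self, name: str) -> List[str]:
        """Return direct and indirect super types, nearest first."""
        if name not in self._resolved:
            seen: List[str] = []
            queue = list(self.super_types.get(name, []))
            while queue:
                current = queue.pop(0)
                if current not in seen:
                    seen.append(current)
                    queue.extend(self.super_types.get(current, []))
            self._resolved[name] = seen
        return list(self._resolved[name])


def build_notification(registry: TypeRegistry, operation: str, tag: str,
                       impacted_entities: Iterable[str]) -> dict:
    """Assemble an illustrative message; the field names are assumptions."""
    return {
        "operation": operation,
        "classification": tag,
        "superTypes": registry.all_super_types(tag),
        "impactedEntities": list(impacted_entities),
    }


if __name__ == "__main__":
    registry = TypeRegistry()
    registry.add_type("Sensitive")
    registry.add_type("PII", ["Sensitive"])
    registry.add_type("Email", ["PII"])
    print(build_notification(registry, "ADD", "Email", ["table-1", "column-7"]))
```

The super type lookup here touches only the cached type definitions, which is the efficiency point made in the reply; how the list of impacted entities is obtained is left outside the sketch.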