atlas-dev mailing list archives

From David Radley <david_rad...@uk.ibm.com>
Subject Re: Loops in V2 API
Date Fri, 27 Jan 2017 14:05:50 GMT
Hi Cassio,
Thank you for your feedback.

I have searched the code base around the V2 API and cannot see a base 
attribute in Atlas that can play the role of business identity. What were 
you thinking of?

It is a good call to bring the DSL to my attention. I notice the search 
support for V2 went in this month in Jira 1308. I will look into this 
further as part of this code change.

    all the best, David. 






From:   "Cassio Dos Santos" <scdos@us.ibm.com>
To:     dev@atlas.incubator.apache.org
Date:   26/01/2017 15:49
Subject:        Re: Loops in V2 API




Hi David,

A couple of questions: 

Don't we already have a base attribute in Atlas that can play the role of 
business identity?
As you promote references to entities, would that require changes to DSL, 
or, as the Atlas model gets closer to the underlying graph model, would we 
expect users to rely more on Gremlin?

Cassio



From: Srikanth Venkat <svenkat@hortonworks.com>
To: "dev@atlas.incubator.apache.org" <dev@atlas.incubator.apache.org>
Date: 01/26/2017 09:47 AM
Subject: Re: Loops in V2 API



Hi David,

Thanks for your input. IMHO, I think this would be a good addition so that 
the community can leverage the relationship aspects.

Srikanth Venkat 
Senior Director, Product Management 
Hortonworks Inc.
svenkat@hortonworks.com |  +1 510 394 0497 (O) 

On 1/26/17, 3:18 AM, "David Radley" <david_radley@uk.ibm.com> wrote:

   Dear all, 
   I would like to address these loops. Is the community supportive of me 
   coding relationships as AtlasObjectId (Madhan's idea) and adding in an 
   optional refLabel? 
 
   I will raise Jiras to track the displayText enhancement around 
   globalization and a separate enhancement to introduce explicit 
   identifiers. 
 
      many thanks, David. 
 
 
   ----- Forwarded by David Radley/UK/IBM on 26/01/2017 11:06 -----
 
   From:   David Radley/UK/IBM
   To:     dev@atlas.incubator.apache.org
   Date:   24/01/2017 10:36
   Subject:        Fw: Loops in V2 API.
 
 
   Hi,
   Responding to Madhan:
   - I think there is a need for a new piece of text to specify the
   relationship label - the UML association label that Mandy talks of. The
   label I am thinking of here is part of the logical model.
   - In terms of displayText for an entity:
           - This appears to me to be a view rather than a model concept.
           - displayText normally is globalized.
           - This brings us on to thinking about how we identify an entity.
             I suspect what we want to display would be a useful entity
             identifier.
                   - If there is an obvious attribute that is the
                     identifier, then your approach would be useful. For
                     example, the primary key of an RDB asset.
                   - We have found, while working with master data
                     management, that it is useful to have business keys.
                     These are one or more concatenated attributes that are
                     useful to a business user to identify an entity. I
                     think that introducing business keys into terms and
                     entities is a flexible way of dealing with this issue.
                     For example, identifying a person with the national
                     insurance number and first and last name could produce
                     a meaningful label like this: "NM111333444555-David
                     Radley".
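
As a rough illustration of the business key idea, the Java sketch below concatenates the values of a chosen list of attributes into a label. The class `BusinessKeySketch`, the method `businessKey`, the attribute names and the "-" separator are all assumptions made for this example; none of them are part of Atlas. For brevity the first and last name are held in a single `fullName` attribute.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an Atlas class: builds a display label from the
// attributes that have been designated as the business key.
class BusinessKeySketch {

    // Joins the values of the named key attributes, in the order given, with
    // the separator. Attributes that the entity does not have are skipped.
    static String businessKey(Map<String, String> attributes,
                              List<String> keyAttributeNames,
                              String separator) {
        List<String> parts = new ArrayList<>();
        for (String name : keyAttributeNames) {
            String value = attributes.get(name);
            if (value != null) {
                parts.add(value);
            }
        }
        return String.join(separator, parts);
    }

    public static void main(String[] args) {
        Map<String, String> person = new LinkedHashMap<>();
        person.put("nationalInsuranceNumber", "NM111333444555");
        person.put("fullName", "David Radley");
        person.put("favouriteColour", "green"); // not part of the business key

        // Prints: NM111333444555-David Radley
        System.out.println(businessKey(
                person, Arrays.asList("nationalInsuranceNumber", "fullName"), "-"));
    }
}
```

Keeping the list of key attribute names separate from the attribute values mirrors the suggestion that an identifier could be composed of a list of other attributes.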
 
   So in summary I like the idea. I would split it into:
           1) Introduce business identifiers into the logical model for
              entities and terms. I think this means an attribute could be
              flagged as an identifier (isIdentifier), and an identifier
              attribute could be composed of a list of other attributes.
           2) Introduce globalization of displayText for terms, entities and
              attributes.
           3) Enhance the relationships to include an identifier field (or
              fields) rather than an arbitrary field.
 
   It will become quite difficult to add identifier support as more people
   use the V2 API, so it would be good to add it early.
 
   Am I making sense? 
 
   Responding to Mandy:
   I was thinking about how we could combine the need for reverse pointers
   always being there with only sometimes needing to name a relationship.
   In many cases we want to specify a direction for a relationship, but also
   be able to navigate it backwards.
 
   I think having relationships as top level objects in the type system
   would work; it would allow us to manage relationships with properties in
   a standard entity manner.
   I guess we would need to prevent relationships from having relationships.
 
   I wonder what you think of embedding the relationship definition in the
   source object (as the current Atlas does in the constraintDef) and
   allowing it to be found from the target object.
 
   I think having the constraintDefs along the lines of what Madhan and I
   suggested would be a way to optionally specify the reverse attribute name
   and the association name.
   In order to see the reverse relationship, I wonder if we could have a
   section in the entity called "inbound relationships". We could easily
   list the inbound relationships inside the API by looking for IN edges in
   the graph.
 
   For example, TypeA could have constraintDefs on the relationship while
   TypeB does not. This allows us to navigate back along any inbound
   relationship, but not have to model this unless we need to add labels.
   Something like:
 
   "inboundRelationships": [
       {
           "type": "TypeA",
           "refAttribute": "children",
           "attributeName": "parent",
           "refguid": "101010104444",
           "label": "cares for"
       },
       {
           "type": "TypeB",
           "refAttribute": "friend",
           "refguid": "101010104444"
       }
   ]
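
To make the "look for IN edges" idea concrete, here is a minimal Java sketch that derives an entity's inbound relationships from a list of directed edges. It uses a tiny in-memory edge list as a stand-in for the graph; `InboundRelationshipSketch`, `Edge`, `addEdge` and `inboundRelationships` are hypothetical names and do not correspond to the Atlas or graph database APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the graph; not the Atlas or graph API.
class InboundRelationshipSketch {

    // A directed edge created by a reference attribute on the source entity,
    // for example the "children" attribute on an entity of TypeA.
    static final class Edge {
        final String sourceType;
        final String sourceGuid;
        final String refAttribute;
        final String targetGuid;

        Edge(String sourceType, String sourceGuid, String refAttribute, String targetGuid) {
            this.sourceType = sourceType;
            this.sourceGuid = sourceGuid;
            this.refAttribute = refAttribute;
            this.targetGuid = targetGuid;
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    void addEdge(String sourceType, String sourceGuid, String refAttribute, String targetGuid) {
        edges.add(new Edge(sourceType, sourceGuid, refAttribute, targetGuid));
    }

    // Returns every edge that points at the given entity (its IN edges).
    // Nothing has to be modelled on the target type for this to work.
    List<Edge> inboundRelationships(String targetGuid) {
        List<Edge> inbound = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.targetGuid.equals(targetGuid)) {
                inbound.add(edge);
            }
        }
        return inbound;
    }

    public static void main(String[] args) {
        InboundRelationshipSketch graph = new InboundRelationshipSketch();
        graph.addEdge("TypeA", "guid-a", "children", "guid-x");
        graph.addEdge("TypeB", "guid-b", "friend", "guid-x");

        for (Edge edge : graph.inboundRelationships("guid-x")) {
            System.out.println(edge.sourceType + "." + edge.refAttribute
                    + " <- " + edge.sourceGuid);
        }
    }
}
```

The linear scan here is only for illustration; a graph store would normally answer the same question by following the incoming edges of the target vertex directly.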
 
   Extra properties could be added by pointing to an entity with the new
   properties. This would be the association class in UML.
 
   I think I may have missed some of the additional advantages you see of 
   having a top level relationship object. 
 
   How does this sound? 
 
   all the best, David. 
 
 
   ----- Forwarded by David Radley/UK/IBM on 24/01/2017 09:31 -----
 
   From:   Mandy Chessell/UK/IBM@IBMGB
   To:     dev@atlas.incubator.apache.org
   Date:   24/01/2017 08:36
   Subject:        Re: Loops in V2 API.
 
 
 
   Dear All,
   When I was talking to David last week about this, we could not think of
   an example where the relationship between metadata entities was one way.
   The reason is that metadata relationships are connecting different
   perspectives.  Value comes from understanding how each perspective
   connects to the rest of the world.  So for example, if we take the idea
   of a business term entity linked to a data field entity contained in a
   data set entity, then the data set owner wants to understand the meaning
   of the data field (navigation from data field to business term) and the
   subject area owner wants to know where data of a particular meaning is
   stored (navigation from business term to data field).
 
   So perhaps the default in the type language - and through to the Atlas
   implementation - should be that all relationships between entities are
   two-way - with 3 labels:
   - The relationship name
   - The names of the relationship as viewed from each end of the
     relationship
 
   We see all three labels in UML.  UML relationships are also two-way by
   default and you add constraints to make them one-way.
 
   This would suggest the type language should define relationships as well
   as entities as top level objects.  The relationship would have a name
   and declare which entities it connects to.  The entities would reference
   the relationship and provide their local name for the relationship.
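
A minimal Java sketch of what such a top level relationship definition might carry is shown here: one relationship name plus a local name at each end. `RelationshipDefSketch`, `End` and `nameSeenFrom`, as well as the example label values, are assumptions for illustration only and are not part of the Atlas type system.

```java
// Hypothetical type-language sketch; these are not Atlas classes.
class RelationshipDefSketch {

    // One end of the relationship: the entity type it attaches to and the
    // name by which that entity type refers to the relationship.
    static final class End {
        final String entityType;
        final String localName;

        End(String entityType, String localName) {
            this.entityType = entityType;
            this.localName = localName;
        }
    }

    final String name; // the relationship name itself
    final End end1;
    final End end2;

    RelationshipDefSketch(String name, End end1, End end2) {
        this.name = name;
        this.end1 = end1;
        this.end2 = end2;
    }

    // Returns the relationship's name as viewed from the given entity type,
    // or null if that type is not one of the two ends. If both ends have the
    // same entity type, this simple sketch returns end1's name.
    String nameSeenFrom(String entityType) {
        if (end1.entityType.equals(entityType)) {
            return end1.localName;
        }
        if (end2.entityType.equals(entityType)) {
            return end2.localName;
        }
        return null;
    }

    public static void main(String[] args) {
        // The business term / data field example, with made-up label values.
        RelationshipDefSketch def = new RelationshipDefSketch(
                "semanticAssignment",
                new End("BusinessTerm", "assignedFields"),
                new End("DataField", "meaning"));

        System.out.println(def.name + " seen from DataField: "
                + def.nameSeenFrom("DataField"));
    }
}
```

Because the definition is its own object rather than an attribute of either entity, it is also a natural place to hang relationship properties later.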
 
   This way, the type language will then allow for an extension where
   relationships have properties.  This capability is supported natively in
   the graph and would enable richer information gathering on the
   relationships between entities (one of the key values of an integrated
   metadata repository).
 
   All the best
   Mandy
   ___________________________________________
   Mandy Chessell CBE FREng CEng FBCS
   IBM Distinguished Engineer
   IBM Analytics Group CTO Office
 
   Master Inventor
   Member of the IBM Academy of Technology
   Visiting Professor, Department of Computer Science, University of 
   Sheffield
 
   Email: mandy_chessell@uk.ibm.com
   LinkedIn: http://www.linkedin.com/pub/mandy-chessell/22/897/a49
 
   Assistant: Janet Brooks - jsbrooks12@uk.ibm.com
 
 
 
   From:   Madhan Neethiraj <madhan@apache.org>
   To:     "dev@atlas.incubator.apache.org" <dev@atlas.incubator.apache.org>
   Date:   24/01/2017 06:57
   Subject:        Re: Loops in V2 API.
   Sent by:        Madhan Neethiraj <mneethiraj@hortonworks.com>
 
 
 
   David,
 
   The idea of ‘refLabel’ sounds good; it will be helpful to include such an
   attribute value in addition to typeName and guid. I think this idea will
   be useful to the type in general, not just in constraints. How about the
   ability to designate one of the entity attributes as ‘displayText’? This
   attribute value can be used in many places, like search results. What do
   you think?
 
   Thanks,
   Madhan
 
 
 
   On 1/18/17, 4:41 AM, "David Radley" <david_radley@uk.ibm.com> wrote:
 
       Hi Madhan,
       On further reflection, a more generic mechanism to get the name might
       be to enhance the constraint by adding a refLabel to the params, which
       could specify the label field as being "name", like this:
 
       "constraintDefs": [
           {
               "type": "mappedFromRef",
               "params": {
                   "refAttribute": "TestType2",
                   "refLabel": "name"
               }
           }
       ]
 
       What do you think?
 
       Thanks David.
 
 
       ----- Forwarded by David Radley/UK/IBM on 18/01/2017 12:36 -----
 
       From:   David Radley/UK/IBM
       To:     Madhan Neethiraj <madhan@apache.org>
       Cc:     <dev@atlas.incubator.apache.org>
       Date:   18/01/2017 10:22
       Subject:        Fw: [jira] David Radley mentioned you (JIRA)
 
 
       Hi Madhan,
       I like the idea of using AtlasObjectId. One concern: if we only have
       a single hashcode of it in the serialized form, then we cannot easily
       see that a reference points to a particular object's guid. Maybe the
       Java API reference should serialise to a nested structure containing
       the type and guid. For readability / debugging I suggest we also add
       name into AtlasObjectId (this approach proved very useful in
       ATLAS-1186). In this way a reference contains the type and name
       (understandable by the human reader, but not unique) and the guid (not
       understandable by the human reader but useful for unique
       identification of objects and construction of unambiguous object
       references).
 
       So the children and parent references in JSON would look something
       like:
 
       "children": [
           {
               "type": "TestType2",
               "name": "child1",
               "guid": "1234-5678-90123"
           },
           {
               "type": "TestType2",
               "name": "child2",
               "guid": "1234-5678-90124"
           }
       ],
       "parent": {
           "type": "TestType1",
           "name": "parent1",
           "guid": "1234-5678-90123"
       }
 
 
       We would need something equivalent in toString(). 
 
 
       What do you think?
 
          Thanks, David. 
       ----- Forwarded by David Radley/UK/IBM on 18/01/2017 09:45 -----
 
       From:   Madhan Neethiraj <madhan@apache.org>
       To:     David Radley/UK/IBM@IBMGB
       Date:   18/01/2017 01:47
       Subject:        Re: [jira] David Radley mentioned you (JIRA)
       Sent by:        Madhan Neethiraj <mneethiraj@hortonworks.com>
 
 
 
       David,
 
       Thanks for the type-def JSONs and the steps to reproduce the issue.
       The implementation should be updated to not get into such infinite
       loops. I guess one approach would be to treat references to other
       entities as an AtlasObjectId, even when the reference points to a full
       entity. For example:
       - toString() should print the AtlasObjectId equivalent of the
         referenced object, i.e. { "typeName": "TestType1", "guid":
         "1234-5678-90123" }
       - hashCode() should use (new AtlasObjectId("TestType1",
         "1234-5678-90123")).hashCode(), instead of calling child.hashCode()
         or parent.hashCode()
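
The approach described above can be sketched in Java roughly as follows. `ObjectIdSketch` is a simplified stand-in for AtlasObjectId and `EntitySketch` is a hypothetical entity class; neither is the real Atlas implementation. The point of the sketch is only that hashCode(), equals() and toString() reduce every referenced entity to its id, so a parent/child cycle cannot recurse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Simplified stand-in for AtlasObjectId (an assumption, not the Atlas class).
class ObjectIdSketch {
    final String typeName;
    final String guid;

    ObjectIdSketch(String typeName, String guid) {
        this.typeName = typeName;
        this.guid = guid;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ObjectIdSketch)) {
            return false;
        }
        ObjectIdSketch other = (ObjectIdSketch) o;
        return typeName.equals(other.typeName) && guid.equals(other.guid);
    }

    @Override
    public int hashCode() {
        return Objects.hash(typeName, guid);
    }

    @Override
    public String toString() {
        return "{ \"typeName\": \"" + typeName + "\", \"guid\": \"" + guid + "\" }";
    }
}

// Hypothetical entity that holds full object references to its relatives.
class EntitySketch {
    final ObjectIdSketch id;
    EntitySketch parent;
    final List<EntitySketch> children = new ArrayList<>();

    EntitySketch(String typeName, String guid) {
        this.id = new ObjectIdSketch(typeName, guid);
    }

    // Reduce each referenced entity to its id so that nothing below ever
    // calls hashCode(), equals() or toString() on another full entity.
    private ObjectIdSketch parentId() {
        return parent == null ? null : parent.id;
    }

    private List<ObjectIdSketch> childIds() {
        List<ObjectIdSketch> ids = new ArrayList<>();
        for (EntitySketch child : children) {
            ids.add(child.id);
        }
        return ids;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof EntitySketch)) {
            return false;
        }
        EntitySketch other = (EntitySketch) o;
        return id.equals(other.id)
                && Objects.equals(parentId(), other.parentId())
                && childIds().equals(other.childIds());
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, parentId(), childIds());
    }

    @Override
    public String toString() {
        return "EntitySketch{id=" + id + ", parent=" + parentId()
                + ", children=" + childIds() + "}";
    }

    public static void main(String[] args) {
        EntitySketch a = new EntitySketch("TestType1", "1234-5678-90123");
        EntitySketch b = new EntitySketch("TestType2", "1234-5678-90124");
        a.children.add(b);
        b.parent = a; // closes the parent <-> child cycle

        // Both calls return normally despite the cycle.
        System.out.println(a);
        System.out.println(b.hashCode());
    }
}
```

Had hashCode() called parent.hashCode() and child.hashCode() directly, the cycle built in main would recurse until the stack overflowed.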
 
       What do you think?
 
       Thanks,
       Madhan
 
 
 
 
       On 1/17/17, 7:11 AM, "David Radley (JIRA)" <jira@apache.org> wrote:
 
 
                [ https://issues.apache.org/jira/browse/ATLAS-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
 
           David Radley mentioned you on ATLAS-1458
           --------------------------------
 
           [~madhan.neethiraj]
           As requested on the dev list, here are JSON files defining the
           types that I used to recreate the loop.
 
           The types create OK. To get the loop using 01-test.json, I do
           the following:
           - create an entity of TestType1, EntityA
           - create an entity of TestType2, EntityB
           - update EntityA to have EntityB as a child
           - update EntityB to have EntityA as a parent. It loops during
             this update.
 
           I get the same loop for 02-test.json and 03-test.json.
 
           >                 Key: ATLAS-1458
           >         View Online: https://issues.apache.org/jira/browse/ATLAS-1458
           >         Add Comment: https://issues.apache.org/jira/browse/ATLAS-1458#add-comment
 
 
 
 
 
 
 
       Unless stated otherwise above:
       IBM United Kingdom Limited - Registered in England and Wales with
       number 741598.
       Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
       PO6 3AU
 
 
