atlas-dev mailing list archives

From "Jiaqi Shan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ATLAS-2916) Unnecessary entity update causes by different AtlasObjectId
Date Tue, 09 Oct 2018 12:27:00 GMT

     [ https://issues.apache.org/jira/browse/ATLAS-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jiaqi Shan updated ATLAS-2916:
------------------------------
    Description: 
    When a Hive process is created, the ColumnEntity is updated because the value of the TableEntity's
AtlasObjectId differs.  The following example shows the differences between CurrentEntity and EntityInStore.
{panel:title=CurrentEntity}
AtlasEntity{AtlasStruct{typeName='hive_column', attributes=[owner:bi_sh, qualifiedName:bi.yingbang.t@bdyf,
name:t, comment:t, position:1, type:int, table:{color:#d04437}AtlasObjectId\{guid='-17020754238878791',
typeName='hive_table', uniqueAttributes={qualifiedName:b.yingbang@bdyf}}{color}]}guid='431c8847-8fd2-454d-b77a-19aeef0d6b9b',
status=null, createdBy='null', updatedBy='null', createTime=null, updateTime=null, version=0,
relationshipAttributes=[], classifications=[], meanings=[]}
{panel}
{panel:title=EntityInStore}
AtlasEntity{AtlasStruct{typeName='hive_column', attributes=[owner:bi_sh, qualifiedName:bi.yingbang.t@bdyf,
name:t, description:null, comment:t, position:1, type:int, {color:#d04437}table:AtlasObjectId\{guid='da35aff2-9851-499d-99cf-f1fbafb6e92b',
typeName='hive_table', uniqueAttributes={}}{color}]}guid='431c8847-8fd2-454d-b77a-19aeef0d6b9b',
status=ACTIVE, createdBy='bi_sh', updatedBy='bi_sh', createTime=2018-10-09T11:26:51.685Z,
updateTime=2018-10-09T11:26:51.685Z, version=0, relationshipAttributes=[], classifications=[],
meanings=[]}
{panel}
     In fact, no metadata has changed in ColumnEntity; the difference in the table's
AtlasObjectId is caused by the Hive Hook setting a new guid for TableEntity. So it is probably not
necessary to update the Hive column entity in this case.
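
The comparison described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Atlas code: `ObjectRef` is a hypothetical stand-in for an object reference, `ReferenceAwareDiff` is a hypothetical helper, and `resolvedGuids` is assumed to map a hook-assigned placeholder guid to the guid already in the store. Attributes present only in the stored entity (such as `description:null`) are ignored here, which is itself an assumption.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for an object reference; not the Atlas class itself. */
final class ObjectRef {
    final String guid;
    final String typeName;

    ObjectRef(String guid, String typeName) {
        this.guid = guid;
        this.typeName = typeName;
    }
}

/** Hypothetical helper that compares attributes while resolving references first. */
final class ReferenceAwareDiff {
    /**
     * Returns true when any incoming attribute differs from the stored one.
     * References are compared by the guid they resolve to, so a placeholder
     * guid that maps to the stored guid does not count as a change.
     */
    static boolean hasChanges(Map<String, Object> incoming,
                              Map<String, Object> stored,
                              Map<String, String> resolvedGuids) {
        for (Map.Entry<String, Object> e : incoming.entrySet()) {
            Object newVal = e.getValue();
            Object oldVal = stored.get(e.getKey());
            if (newVal instanceof ObjectRef && oldVal instanceof ObjectRef) {
                ObjectRef n = (ObjectRef) newVal;
                ObjectRef o = (ObjectRef) oldVal;
                // Fall back to the incoming guid when no mapping is known.
                String resolved = resolvedGuids.getOrDefault(n.guid, n.guid);
                if (!Objects.equals(n.typeName, o.typeName)
                        || !Objects.equals(resolved, o.guid)) {
                    return true;
                }
            } else if (!Objects.equals(newVal, oldVal)) {
                return true;
            }
        }
        return false;
    }
}
```

With the two entities shown in the panels, such a check would report no change once the placeholder guid is resolved to the stored table guid, so the column update could be skipped.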

     We propose adding an LRU cache to skip updating an entity that is identical to one sent in
an earlier notification. However, when an entity is deleted and re-created with the same uniqueAttributes,
this solution goes wrong.
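
A minimal sketch of such an LRU cache, built on the JDK's `LinkedHashMap` in access order. The class name `RecentEntityCache`, the key format, and the `contentHash` parameter are assumptions made for illustration; they are not part of Atlas. The `invalidate` method is one possible way to handle the delete-and-re-create case, assuming a delete notification is actually seen; it is not something the issue proposes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache of recently processed entities, keyed by unique attributes. */
final class RecentEntityCache {
    private final Map<String, Integer> seen;

    RecentEntityCache(final int capacity) {
        // accessOrder=true keeps the least recently used entry first.
        this.seen = new LinkedHashMap<String, Integer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true if an identical entity was seen recently, so the update can be skipped. */
    synchronized boolean shouldSkip(String typeName, String qualifiedName, int contentHash) {
        String key = typeName + ":" + qualifiedName;
        Integer previous = seen.put(key, contentHash);
        return previous != null && previous == contentHash;
    }

    /** Forgets an entity; without this, a re-created entity would be skipped by mistake. */
    synchronized void invalidate(String typeName, String qualifiedName) {
        seen.remove(typeName + ":" + qualifiedName);
    }
}
```

If `invalidate` is never called, a re-created entity with the same uniqueAttributes and the same content hash is skipped, which is the failure described above.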

     Is there any other good solution to avoid this problem?

 

 

  was:
    When a Hive process is created, the Hive column entity is updated with the value of the table's
AtlasObjectId.  The following example shows the differences between CurrentEntity and EntityInStore.
{panel:title=CurrentEntity}
AtlasEntity{AtlasStruct{typeName='hive_column', attributes=[owner:bi_sh, qualifiedName:bi.yingbang.t@bdyf,
name:t, comment:t, position:1, type:int, table:{color:#d04437}AtlasObjectId\{guid='-17020754238878791',
typeName='hive_table', uniqueAttributes={qualifiedName:b.yingbang@bdyf}}{color}]}guid='431c8847-8fd2-454d-b77a-19aeef0d6b9b',
status=null, createdBy='null', updatedBy='null', createTime=null, updateTime=null, version=0,
relationshipAttributes=[], classifications=[], meanings=[]}
{panel}
{panel:title=EntityInStore}
AtlasEntity{AtlasStruct{typeName='hive_column', attributes=[owner:bi_sh, qualifiedName:bi.yingbang.t@bdyf,
name:t, description:null, comment:t, position:1, type:int, {color:#d04437}table:AtlasObjectId\{guid='da35aff2-9851-499d-99cf-f1fbafb6e92b',
typeName='hive_table', uniqueAttributes={}}{color}]}guid='431c8847-8fd2-454d-b77a-19aeef0d6b9b',
status=ACTIVE, createdBy='bi_sh', updatedBy='bi_sh', createTime=2018-10-09T11:26:51.685Z,
updateTime=2018-10-09T11:26:51.685Z, version=0, relationshipAttributes=[], classifications=[],
meanings=[]}
{panel}
{{  In fact, no metadata has changed in columnEntity, so I think it may be unnecessary
to update columnEntity when only the AtlasObjectId needs updating.}}

 {{  We propose adding an LRU cache to skip updating an entity that is identical to one sent in an
earlier notification, and it works.}} {{Is there any other good solution to avoid this problem?}}

 


> Unnecessary entity update causes by different AtlasObjectId
> -----------------------------------------------------------
>
>                 Key: ATLAS-2916
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2916
>             Project: Atlas
>          Issue Type: Improvement
>          Components:  atlas-core
>    Affects Versions: 1.0.0
>            Reporter: Jiaqi Shan
>            Priority: Minor
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
