atlas-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ATLAS-415) Hive import fails when importing a table that is already imported without StorageDescriptor information
Date Mon, 04 Jan 2016 12:33:39 GMT

     [ https://issues.apache.org/jira/browse/ATLAS-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hemanth Yamijala updated ATLAS-415:
-----------------------------------
    Attachment: ATLAS-415.patch

Attaching a patch for quick review.

The main fix is to call {{AtlasClient.updateEntity}} when we find that a table is already
registered with Atlas. The rest of the changes support a unit test I wrote, {{HiveMetaStoreBridgeTest}},
plus some refactoring.

With this patch, hive-import works for the case I described in the bug and updates the created
table properly.
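
The create-or-update decision described above can be sketched roughly as follows. This is only an illustration and not the code in the patch: {{MetadataClient}}, {{InMemoryClient}}, {{registerTable}} and all method signatures here are hypothetical stand-ins and do not reflect the actual {{AtlasClient}} API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a metadata client; NOT the real AtlasClient API.
interface MetadataClient {
    String findTableId(String qualifiedName); // null if the table is not registered
    String createEntity(String qualifiedName, Map<String, Object> attributes);
    void updateEntity(String id, Map<String, Object> attributes);
}

// In-memory implementation so the sketch can run without a server.
class InMemoryClient implements MetadataClient {
    final Map<String, String> idsByName = new HashMap<>();
    final Map<String, Map<String, Object>> attrsById = new HashMap<>();
    int creates = 0;
    int updates = 0;

    public String findTableId(String qualifiedName) {
        return idsByName.get(qualifiedName);
    }

    public String createEntity(String qualifiedName, Map<String, Object> attributes) {
        String id = "id-" + (idsByName.size() + 1);
        idsByName.put(qualifiedName, id);
        attrsById.put(id, new HashMap<>(attributes));
        creates++;
        return id;
    }

    public void updateEntity(String id, Map<String, Object> attributes) {
        // Merge the new attributes into the already registered entity.
        attrsById.get(id).putAll(attributes);
        updates++;
    }
}

class TableUpsertSketch {
    // Create the table if it is unknown; otherwise update the existing entity
    // instead of reading optional attributes (such as the SD) back from it.
    static String registerTable(MetadataClient client, String qualifiedName,
                                Map<String, Object> attributes) {
        String id = client.findTableId(qualifiedName);
        if (id == null) {
            return client.createEntity(qualifiedName, attributes);
        }
        client.updateEntity(id, attributes);
        return id;
    }

    public static void main(String[] args) {
        InMemoryClient client = new InMemoryClient();

        // First registration carries only the required attributes (no SD).
        Map<String, Object> minimal = new HashMap<>();
        minimal.put("name", "t1");
        registerTable(client, "default.t1", minimal);

        // A later import carries the full set and must update, not fail.
        Map<String, Object> full = new HashMap<>(minimal);
        full.put("sd", "sd-ref");
        registerTable(client, "default.t1", full);

        System.out.println(client.creates + " create(s), " + client.updates + " update(s)");
    }
}
```

In this sketch the second call never inspects the StorageDescriptor of the previously registered entity, which is the step that failed in the reported stack trace.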

A couple of points I want to call out for discussion in review:

* This might add extra calls to the server even when the entity has not changed at all.
I guess this will have a performance impact, but I am unsure how we can detect on the client
whether anything has changed.
* Currently, I am doing the update only for tables. Is this needed for DBs and partitions as
well? (I guess yes.)
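
On the first point, one possible approach, offered only as a sketch, is to compare the attributes about to be sent with those already registered and skip the update when nothing differs. Note that fetching the registered attributes is itself a server call, so this would save only the write. The class and method names below are hypothetical and are not part of the patch or of the Atlas client.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ChangeDetectionSketch {
    // Returns true if any incoming attribute differs from the registered value.
    // Attributes present only on the registered entity are ignored.
    // An incoming null value is treated the same as an absent registered value.
    static boolean needsUpdate(Map<String, Object> registered,
                               Map<String, Object> incoming) {
        if (registered == null) {
            return true; // nothing known about the entity, so update
        }
        for (Map.Entry<String, Object> e : incoming.entrySet()) {
            if (!Objects.equals(registered.get(e.getKey()), e.getValue())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> registered = new HashMap<>();
        registered.put("name", "t1");

        Map<String, Object> incoming = new HashMap<>(registered);
        System.out.println("unchanged: " + needsUpdate(registered, incoming));

        incoming.put("sd", "sd-ref"); // SD was missing on the registered table
        System.out.println("with SD:   " + needsUpdate(registered, incoming));
    }
}
```

This does a shallow comparison only; nested structures would need either reliable {{equals}} implementations or a deeper comparison.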

> Hive import fails when importing a table that is already imported without StorageDescriptor information
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-415
>                 URL: https://issues.apache.org/jira/browse/ATLAS-415
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>         Attachments: ATLAS-415.patch
>
>
> I found this when testing patches that integrate Storm with Atlas, but guess this may occur in other scenarios as well.
> To reproduce:
> * Run a storm topology with Atlas Hook enabled that has a HiveBolt (requires patches for ATLAS-181 and friends).
> * Run hive-import following the above.
> The first step creates a Hive DB and table setting just the required attributes. Note that the StorageDescriptor is an optional attribute as per the Hive DataModel now.
> The second step fails with this exception:
> {code}
> Exception in thread "main" java.lang.NullPointerException
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.getSDForTable(HiveMetaStoreBridge.java:345)
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importTables(HiveMetaStoreBridge.java:219)
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(HiveMetaStoreBridge.java:104)
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(HiveMetaStoreBridge.java:96)
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(HiveMetaStoreBridge.java:503)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
