atlas-dev mailing list archives

From "Hemanth Yamijala (JIRA)" <>
Subject [jira] [Updated] (ATLAS-415) Hive import fails when importing a table that is already imported without StorageDescriptor information
Date Mon, 04 Jan 2016 12:33:39 GMT


Hemanth Yamijala updated ATLAS-415:
    Attachment: ATLAS-415.patch

Attaching a patch for quick review.

The main fix is to use the API {{AtlasClient.updateEntity}} when we find a table is already
registered with Atlas. The rest of the changes support a unit test I wrote, {{HiveMetaStoreBridgeTest}},
plus some refactoring.

With this patch, hive-import works for the case I described in the bug and updates the created
table properly.

A couple of points I want to call out for discussion in review:

* This might add extra calls to the server even when there is absolutely no change to
the entity. I guess this will have a performance impact, but I am unsure how we can detect
on the client whether anything has changed.
* Currently, I am doing the update only for tables. Is this needed for DBs and partitions as
well? (I guess yes.)
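The create-or-update decision described above can be sketched as follows. This is a hedged illustration only: the {{Client}} interface below is a hypothetical stand-in, since of the real API only {{AtlasClient.updateEntity}} is named in this patch description; {{findTable}} and {{createEntity}} are assumed names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the create-or-update logic. The Client interface is a
// stand-in for AtlasClient; only updateEntity is named in the patch description.
public class RegisterOrUpdateSketch {

    /** Minimal stand-in client interface (hypothetical, not the real AtlasClient). */
    interface Client {
        String findTable(String qualifiedName);            // returns guid, or null if unknown
        String createEntity(String entityJson);            // registers a new entity, returns guid
        void updateEntity(String guid, String entityJson); // updates an existing entity in place
    }

    /** Register a table, updating it if Atlas already knows about it. */
    static String registerTable(Client client, String qualifiedName, String entityJson) {
        String guid = client.findTable(qualifiedName);
        if (guid == null) {
            return client.createEntity(entityJson);        // first import: create
        }
        client.updateEntity(guid, entityJson);             // re-import: update, don't fail
        return guid;
    }

    public static void main(String[] args) {
        // In-memory fake client backed by a map, to exercise the decision logic.
        final Map<String, String> store = new HashMap<>();
        Client client = new Client() {
            public String findTable(String name) { return store.containsKey(name) ? name : null; }
            public String createEntity(String json) { store.put("default.t1@cl1", json); return "default.t1@cl1"; }
            public void updateEntity(String guid, String json) { store.put(guid, json); }
        };
        String first = registerTable(client, "default.t1@cl1", "{\"sd\":null}");
        String second = registerTable(client, "default.t1@cl1", "{\"sd\":\"set\"}");
        System.out.println(first.equals(second)); // same entity updated, not duplicated
    }
}
```

The point of the sketch is the branch: a second import of the same table takes the {{updateEntity}} path instead of attempting a fresh registration and failing.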

> Hive import fails when importing a table that is already imported without StorageDescriptor information
> -------------------------------------------------------------------------------------------------------
>                 Key: ATLAS-415
>                 URL:
>             Project: Atlas
>          Issue Type: Bug
>            Reporter: Hemanth Yamijala
>            Assignee: Hemanth Yamijala
>         Attachments: ATLAS-415.patch
> I found this when testing patches that integrate Storm with Atlas, but I guess this may
> occur in other scenarios as well.
> To reproduce:
> * Run a storm topology with Atlas Hook enabled that has a HiveBolt (requires patches
> for ATLAS-181 and friends).
> * Run hive-import following the above.
> The first step creates a Hive DB and table, setting just the required attributes. Note
> that the StorageDescriptor is now an optional attribute as per the Hive DataModel.
> The second step fails with this exception:
> {code}
> Exception in thread "main" java.lang.NullPointerException
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.getSDForTable(
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importTables(
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importDatabases(
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.importHiveMetadata(
> 	at org.apache.atlas.hive.bridge.HiveMetaStoreBridge.main(
> {code}
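The stack trace points at {{getSDForTable}} dereferencing a StorageDescriptor that was never set, which is now legal since the attribute is optional. A minimal null-guard sketch, with the caveat that the {{Table}}/{{StorageDescriptor}} classes below are simplified stand-ins for the real Hive metastore types, not the actual Atlas code:

```java
// Hypothetical null-guard for the failing lookup; Table and StorageDescriptor
// are simplified stand-ins for the Hive metastore types.
public class SdGuardSketch {

    static class StorageDescriptor {
        String location = "hdfs://warehouse/t1";
    }

    static class Table {
        StorageDescriptor sd; // optional per the Hive data model, so it may be null
        Table(StorageDescriptor sd) { this.sd = sd; }
    }

    /** Return the table's SD, or null instead of throwing when it was never set. */
    static StorageDescriptor getSDForTable(Table table) {
        if (table.sd == null) {
            return null; // caller must tolerate a missing StorageDescriptor
        }
        return table.sd;
    }

    public static void main(String[] args) {
        System.out.println(getSDForTable(new Table(null)) == null);                    // missing SD handled
        System.out.println(getSDForTable(new Table(new StorageDescriptor())) != null); // present SD returned
    }
}
```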

This message was sent by Atlassian JIRA
