atlas-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jeff Hagelberg <jnhagelb...@us.ibm.com>
Subject Re: Review Request 47810: ATLAS-694: Update Atlas to use Graph DB abstraction layer
Date Thu, 22 Sep 2016 19:02:56 GMT


> On Sept. 22, 2016, 11:12 a.m., Shwetha GS wrote:
> > catalog/pom.xml, line 67
> > <https://reviews.apache.org/r/47810/diff/12/?file=1507166#file1507166line67>
> >
> >     for now, catalog can only be used with Titan 0.5.4 - why?

A good portion of the catalog project, especially the ones in the query package, make direct
use of TP2-specific classes such as GremlinPipeline.  TP3 uses a different set of packages.
 It would take a good deal of work to refactor the catalog project to generate gremlin that
is compatible with both TP2 and TP3.  To reduce the scope, I did not tackle that in this initial
implementation.  The query generation logic here will be need to use some abstraction that
is independent of the TP version.  If done right, this could also be used to help clean up
the DSL translation logic as well (See my response to Suma's comment about that).  At this
point, I would really like to focus on getting the general framework in place, which these
changes covers.  I think things making catalog work with TP3 should be treated as a separate
JIRA.


> On Sept. 22, 2016, 11:12 a.m., Shwetha GS wrote:
> > graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasPropertyKey.java,
line 33
> > <https://reviews.apache.org/r/47810/diff/12/?file=1507169#file1507169line33>
> >
> >     There is AtlasPropertyKey.Cardinality and AtlasCardinality. Whats the difference?

There isn't a difference.  I'll consolidate them.


> On Sept. 22, 2016, 11:12 a.m., Shwetha GS wrote:
> > repository/src/main/java/org/apache/atlas/discovery/graph/DefaultGraphPersistenceStrategy.java,
line 148
> > <https://reviews.apache.org/r/47810/diff/12/?file=1507184#file1507184line148>
> >
> >

Fixed to use GraphHelper.


> On Sept. 22, 2016, 11:12 a.m., Shwetha GS wrote:
> > repository/src/main/java/org/apache/atlas/repository/graph/AtlasGraphProvider.java,
line 57
> > <https://reviews.apache.org/r/47810/diff/12/?file=1507186#file1507186line57>
> >
> >     add visible for testing annotation. Will be good if you can move to some test
class instead

Annotations have been added.  I found it useful to have the methods in this class because
they are used by many different test projects.


> On Sept. 22, 2016, 11:12 a.m., Shwetha GS wrote:
> > pom.xml, line 457
> > <https://reviews.apache.org/r/47810/diff/12/?file=1507179#file1507179line457>
> >
> >     where is this used? git grep returns just this result

I guess that at this point it is not needed. since Titan0Database (now Titan0GraphDatabase)
is the default.  Once support for other implementations is merged in, that will be needed
to control which implementation class ends up in the test atlas-application.properties file
in typesystem.  I did include the changes to those property files though.  I guess I'll go
ahead and add them in.


On Sept. 22, 2016, 11:12 a.m., Jeff Hagelberg wrote:
> > Enable checkstyle on new modules with this property in pom.xml - <checkstyle.failOnViolation>true</checkstyle.failOnViolation>
> > 
> > With patch:
> > Running org.apache.atlas.repository.graph.GraphRepoMapperScaleTest
> > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 113.27 sec - in
org.apache.atlas.repository.graph.GraphRepoMapperScaleTest
> > 
> > Without patch:
> > Running org.apache.atlas.repository.graph.GraphRepoMapperScaleTest
> > Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 89.008 sec - in
org.apache.atlas.repository.graph.GraphRepoMapperScaleTest
> > 
> > With the patch, the test takes longer to run. This implies some performance issue
with the patch
> > 
> > Whats the plan on including titan 1.x implementation - will it be a profile at compile
time such that only one implementation(titan 0.x or titan 1.x) will be part of the package.
Or are both implementations part of the package and the implementation is chosen from the
runtime configuration in application properties? If its compile time binding, then we shouldn't
expose the conf atlas.graphdb.backend?

The plan is for all of the Atlas code to just depend on the abstraction layer.  There will
be profiles in incubator-atlas\pom.xml for each graph backend.  The ones we have in our fork
look like this:

        <profile>
            <id>titan1</id>
            <properties>
                <plugin.class>org.apache.atlas.repository.graphdb.titan1.Titan1Database</plugin.class>
            </properties>
            <dependencyManagement>
                <dependencies>
                    <dependency>
                        <groupId>org.apache.atlas</groupId>
                        <artifactId>atlas-graphdb-impls</artifactId>
                        <version>${project.version}</version>
                        <type>pom</type>
                        <exclusions>
                            <exclusion>
                                <groupId>org.apache.atlas</groupId>
                                <artifactId>atlas-graphdb-titan0</artifactId>
                            </exclusion>
                            <exclusion>
                                <groupId>com.ibm.analytics</groupId>
                                <artifactId>atlas-graphdb-ibm-graph</artifactId>
                            </exclusion>
                        </exclusions>
                    </dependency>
                </dependencies>
            </dependencyManagement>
        </profile>

        <profile>
            <id>ibm-graph</id>
            <properties>
                <plugin.class>com.ibm.analytics.atlas.ibmgraph.IBMGraphDatabase</plugin.class>
                <atlas.TypeCache.impl>org.apache.atlas.repository.typestore.StoreBackedTenantTypeCache</atlas.TypeCache.impl>
                <ibm.graph.graph.id>4000</ibm.graph.graph.id>
            </properties>
            <dependencyManagement>
                <dependencies>
                    <dependency>
                        <groupId>org.apache.atlas</groupId>
                        <artifactId>atlas-graphdb-impls</artifactId>
                        <version>${project.version}</version>
                        <type>pom</type>
                        <exclusions>
                            <exclusion>
                                <groupId>org.apache.atlas</groupId>
                                <artifactId>atlas-graphdb-titan0</artifactId>
                            </exclusion>
                            <exclusion>
                                <groupId>org.apache.atlas</groupId>
                                <artifactId>atlas-graphdb-titan1</artifactId>
                            </exclusion>
                        </exclusions>
                    </dependency>
                </dependencies>
            </dependencyManagement>
        </profile>
    </profiles>

These configure the atlas-graphdb-impls project to just pull in the jar from the implementation
that is being used.  Right now, webapp has a dependency on this project, and will bundle the
graph database implementation uber jar file corresponding to the profile that was selected.
 Unfortunately, it is not possible for the titan0 and titan1 implementations to co-exist in
the war without the use of a custom classloader.  titan0 and titan1 define many classes with
the same name.  In addition, the two require different contents in META-INF/services/javax.script.ScriptEngineFactory.
 Our plan has been to have the implementations available in the filesystem and have the correct
loaded on startup, most likely using a custom classloader.  We have not really investigated
that very much. It's possible that the plugin-classloader that was recently added might be
able to help with this.  Right now, there is no way to switch between implementations without
modifying that atlas war file.  This is 
 probably ok for now, since we have not contributed the titan1 implementation yet, but it
is definitely something we will need to look at.  I am pretty sure that we created a JIRA
for that.

I will do profiling of GraphRepoMapperScaleTest with and without the changes to see if I can
identify and fix the performance regression in GraphRepoMapperScaleTest.


- Jeff


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47810/#review149947
-----------------------------------------------------------


On Sept. 21, 2016, 1:49 p.m., Jeff Hagelberg wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/47810/
> -----------------------------------------------------------
> 
> (Updated Sept. 21, 2016, 1:49 p.m.)
> 
> 
> Review request for atlas, David Kantor and Neeru Gupta.
> 
> 
> Bugs: ATLAS-694
>     https://issues.apache.org/jira/browse/ATLAS-694
> 
> 
> Repository: atlas
> 
> 
> Description
> -------
> 
> ATLAS-694: Update Atlas to use abstraction layer.  All of the Atlas code (with the exception
of the catalog, which was only updated minimally) has been updated to use the graph database
abstraction layer.  In addition, there is now an optional Atlas configuration property that
specifies the class of the abstraction layer database to use.  I basically put all of the
changes in here with the exception of the actual Titan 1 implementation of code.  This includes
the changes to support Tinkerpop 3 syntax.  This is mostly to expedite getting the changes
into Atlas.  Originally the TP3 changes were going to be brought in as part of the Titan 1
implementation task.
> 
> Change Summary:
> 
>    - change Atlas classes to use AtlasGraph,AtlasVertex,AtlasEdge, etc instead of TitanGraph/Vertex/Edge,
etc
>    - compile time dependency on titan 0.5.4/TP 2 removed (except in Catalog, which was
only changed to use AtlasGraphProvider/AtlasGraph) - see repository\pom.xml, other pom.xmls
>    - updated DSL translation to generate Gremlin that is compliant with TP3 when TP3
is being used.  See GremlinQuery.scala, GraphPersistenceStrategies.scala
>    - TitanGraphProvider replaced by AtlasGraphProvider.  Graph database implementation
is determined from a new optional configuration property
>    - HiveTitanSample is no longer used by tests.  It has been replaced by hive-instances.json
(which uses normal Atlas json syntax).  The data is saved with a new JSONImporter class. 
This was needed because the graphson syntax used by HiveTitanSample is not compatible with
TP3.  
> 
> Last rebase: 9/21/2016
> 
> 
> Diffs
> -----
> 
>   .gitignore e10adbc4457f6297600f0feb01eb54718b8ec406 
>   addons/falcon-bridge/pom.xml 1365bd05a388dc92f7a56c7f7427b5b85f97c7da 
>   addons/hdfs-model/pom.xml 492f39cea085c6e69781e17bcbdbc3a231806df3 
>   addons/hdfs-model/src/test/java/org/apache/atlas/fs/model/HDFSModelTest.java ac60294e328835ba0340e150799ddfb348ccdb52

>   addons/hive-bridge/pom.xml 6993bdb938a6095ca24482e290393eeeb3911bcb 
>   addons/hive-bridge/src/main/java/org/apache/atlas/hive/bridge/HiveMetaStoreBridge.java
ad7a4a5d09d8542a841701dfe04981f65f767c14 
>   addons/sqoop-bridge/pom.xml 8c9d278d43b5979ea1743d10845905c13249f8a6 
>   addons/storm-bridge/pom.xml 12c1208b448d456a923bd7309601174ddb561ba5 
>   catalog/pom.xml 2f58a8f0748de65ab78eab35df6abd2fe7c336af 
>   catalog/src/main/java/org/apache/atlas/catalog/query/BaseQuery.java e7bb505075983371ca12d9bc1d8c6eb240c3d134

>   graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasGraphManagement.java
c8cd2842ca3090b6bbd384c773b4eb45aff149ce 
>   graphdb/api/src/main/java/org/apache/atlas/repository/graphdb/AtlasPropertyKey.java
315ecddb861e1a1be6e0ab9b36fe4c0a52486ae8 
>   graphdb/graphdb-impls/pom.xml PRE-CREATION 
>   graphdb/pom.xml ad3461741d42ae83b9d3306bfa4f632ecfc06f3b 
>   graphdb/titan0/src/main/java/org/apache/atlas/repository/graphdb/titan0/GraphDbObjectFactory.java
89de23d225c738bcb7f4f502315525af8fa26188 
>   graphdb/titan0/src/main/java/org/apache/atlas/repository/graphdb/titan0/Titan0DatabaseManager.java
b4234d7321d43c8ff7fc247e895226848b6e256d 
>   graphdb/titan0/src/main/java/org/apache/atlas/repository/graphdb/titan0/Titan0PropertyKey.java
1f9f6ef786e38a66528189c47d5b147b13461859 
>   graphdb/titan0/src/main/java/org/apache/atlas/repository/graphdb/titan0/Titan0Vertex.java
b26ff040886c777f1892beb15f09a177f54ea279 
>   graphdb/titan0/src/main/java/org/apache/atlas/repository/graphdb/titan0/TitanObjectFactory.java
ab0e798d24a2890e60109193f86944b069ff0046 
>   graphdb/titan0/src/test/java/org/apache/atlas/repository/graphdb/titan0/AbstractGraphDatabaseTest.java
35735e335fd8f8d47d211d813817e19d9f6a9552 
>   graphdb/titan0/src/test/java/org/apache/atlas/repository/graphdb/titan0/Titan0DatabaseTest.java
6c2ea263a309c6f71c9eb39fdffed39c8e4895a2 
>   pom.xml ac5b0425bc7816261e7c35caad99131c79e75872 
>   repository/pom.xml 663ac874518f0942c6f647ea9ee982503410492d 
>   repository/src/main/java/org/apache/atlas/GraphTransactionInterceptor.java fff8925ebf3b4a7498565f4b9e33885dbb5bc61a

>   repository/src/main/java/org/apache/atlas/RepositoryMetadataModule.java f1ef1404a45b10ee4c1c229417565e711624957f

>   repository/src/main/java/org/apache/atlas/discovery/DataSetLineageService.java c216469ceb1d89b5f6a578f9bd96f01c09cccd06

>   repository/src/main/java/org/apache/atlas/discovery/graph/DefaultGraphPersistenceStrategy.java
b17eec7e55f478bf4403a61cef7585cf06a2d9d9 
>   repository/src/main/java/org/apache/atlas/discovery/graph/GraphBackedDiscoveryService.java
0c029bbb489c048d6ea19b4f4f31555c0d22f924 
>   repository/src/main/java/org/apache/atlas/repository/graph/AtlasGraphProvider.java
PRE-CREATION 
>   repository/src/main/java/org/apache/atlas/repository/graph/DeleteHandler.java 92f98c63081af335e48fc5ff076e277ba6185f60

>   repository/src/main/java/org/apache/atlas/repository/graph/FullTextMapper.java b342e2700d454b0d6fba595b5cc01cd0e06bbdac

>   repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedMetadataRepository.java
263ea465fda0b445a952943def9a6f7c49834f25 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexer.java
f2e40f9145eb87747430ca98337c53fc7e4864f1 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphHelper.java 1ce87c9d306faa43fb9d3fdc491c4bcbdd7b2bdb

>   repository/src/main/java/org/apache/atlas/repository/graph/GraphProvider.java f89bdf5c7e8596b11dafced6700d0a4245fd32cf

>   repository/src/main/java/org/apache/atlas/repository/graph/GraphProvider.java f89bdf5c7e8596b11dafced6700d0a4245fd32cf

>   repository/src/main/java/org/apache/atlas/repository/graph/GraphSchemaInitializer.java
6141927eb92bc7e681b43118fbaa399ada6c81f8 
>   repository/src/main/java/org/apache/atlas/repository/graph/GraphToTypedInstanceMapper.java
5c7cb2e8fa32b540f80beed40fb4f25a89d39c56 
>   repository/src/main/java/org/apache/atlas/repository/graph/HardDeleteHandler.java 36367913252b59f7adf7f42924a886365e60819f

>   repository/src/main/java/org/apache/atlas/repository/graph/SoftDeleteHandler.java 25aa7c5844cd3fc39bd7cc9ee23a6c336ca1bfac

>   repository/src/main/java/org/apache/atlas/repository/graph/TitanGraphProvider.java
7a5e6a9c8e0967ad8af2192158f455dd676d20ed 
>   repository/src/main/java/org/apache/atlas/repository/graph/TypedInstanceToGraphMapper.java
2e0414e2cee7ca3d5958650ac6abc8a290473545 
>   repository/src/main/java/org/apache/atlas/repository/typestore/GraphBackedTypeStore.java
a94d1575365656ed5d7e025b2d71635f4623a424 
>   repository/src/main/java/org/apache/atlas/services/DefaultMetadataService.java 35504923166e619baee526d76de89d505ca61377

>   repository/src/main/java/org/apache/atlas/util/AtlasRepositoryConfiguration.java PRE-CREATION

>   repository/src/main/scala/org/apache/atlas/query/ClosureQuery.scala c4621cd509ae049b30a08a9e5b3ace03f888a10c

>   repository/src/main/scala/org/apache/atlas/query/GraphPersistenceStrategies.scala f774d97f878b937d38237989e437b5d826622cbd

>   repository/src/main/scala/org/apache/atlas/query/GremlinEvaluator.scala 10d66a94f95df677d3157d9b31e2b75f522645aa

>   repository/src/main/scala/org/apache/atlas/query/GremlinQuery.scala d336f1ec6d7637ed6d05cd781c66ed5407632e88

>   repository/src/main/scala/org/apache/atlas/query/QueryProcessor.scala 0d2a908201f526aa487dd4cad926d308f53a04ba

>   repository/src/test/java/org/apache/atlas/BaseRepositoryTest.java 500a305d3d1e271cd316fbae4e6790d6e5b066c6

>   repository/src/test/java/org/apache/atlas/RepositoryServiceLoadingTest.java 4195955051d842af973d3ad29602fb97c1e54ecd

>   repository/src/test/java/org/apache/atlas/TestUtils.java bd9df62c81d325f128fe051fd23f5f253661647b

>   repository/src/test/java/org/apache/atlas/discovery/GraphBackedDiscoveryServiceTest.java
40dc861c37a5b1a7d2e3ab4b640e03650c7f22a3 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphBackedMetadataRepositoryDeleteTestBase.java
550a98e274c80334ad2f1359f80c2557602d9f3f 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphBackedMetadataRepositoryTest.java
25415418d7b81b6c43816adbc95d55ba2f7445ec 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphBackedRepositoryHardDeleteTest.java
79b48b5c45fe0c1dc046dde372623b5acc68d39c 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphBackedRepositorySoftDeleteTest.java
a0af487df06575e746d13ea4cb46153f6ad5b31b 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexerMockTest.java
f3680ded3834b4e8e1f52815e5e374ca974bb5d6 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphBackedSearchIndexerTest.java
3291e72b356a21391663d729c3601b939584fc9c 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphHelperMockTest.java
5ebc2f7cbf0ec3a3bf0eb08c47cc661212b7258d 
>   repository/src/test/java/org/apache/atlas/repository/graph/GraphHelperTest.java ad34aae6076e980689ea2235cd9fd39dac2b125b

>   repository/src/test/java/org/apache/atlas/repository/graph/GraphRepoMapperScaleTest.java
0a870d8ce7b678c513f1e065e46ab77167f8eadd 
>   repository/src/test/java/org/apache/atlas/repository/typestore/GraphBackedTypeStoreTest.java
90e622a0a691c658f9f22a6ae82b9f6acc81079f 
>   repository/src/test/java/org/apache/atlas/repository/typestore/StoreBackedTypeCacheTest.java
b7cf7e9d42d457cb2a641e42a3876244813e25b0 
>   repository/src/test/java/org/apache/atlas/repository/typestore/StoreBackedTypeCacheTestModule.java
058ed4daab37d53f613adc393e774ec206540524 
>   repository/src/test/java/org/apache/atlas/service/DefaultMetadataServiceTest.java 6782970958c39ce568e04d34aab6707f74fa28c9

>   repository/src/test/java/org/apache/atlas/service/StoreBackedTypeCacheMetadataServiceTest.java
8fb59c5a196d7a054cd073105c806f5025857c64 
>   repository/src/test/java/org/apache/atlas/services/DependencyTreeNode.java PRE-CREATION

>   repository/src/test/java/org/apache/atlas/services/JSONImporter.java PRE-CREATION 
>   repository/src/test/resources/hive-instances.json PRE-CREATION 
>   repository/src/test/scala/org/apache/atlas/query/GremlinTest.scala fa48c0e44a445f97c25182abee0f7c69832e254b

>   repository/src/test/scala/org/apache/atlas/query/GremlinTest2.scala f65cedbf44f5fa9e674833a20aae8de6d2537834

>   repository/src/test/scala/org/apache/atlas/query/HiveTitanSample.scala 2dfb67a2b7c36028954304e284eab9da7326a244

>   repository/src/test/scala/org/apache/atlas/query/LineageQueryTest.scala c8b635a9f3710a0de7991f67dd919195a8dd4f7a

>   repository/src/test/scala/org/apache/atlas/query/QueryTestsUtils.scala b5faaf33cd5fd1a63bac4fd41d67588ef519dc11

>   typesystem/src/main/java/org/apache/atlas/typesystem/types/Multiplicity.java 06da32e807aa4b8e4d1f49b3929c9f3c9a8d4846

>   webapp/pom.xml 5ef1a7f3878a4a2c43272a3b8299644fb08a8ac3 
>   webapp/src/main/java/org/apache/atlas/web/listeners/GuiceServletConfig.java a1d3187cea0422988500195191de37732c7df56f

>   webapp/src/main/java/org/apache/atlas/web/resources/BaseService.java d43c8cc64a89ded2348b6760eb6e7f52593c7444

>   webapp/src/test/java/org/apache/atlas/notification/NotificationHookConsumerKafkaTest.java
683a028be9b117b950d3f7962134b86ad2a805b6 
>   webapp/src/test/java/org/apache/atlas/web/listeners/TestGuiceServletConfig.java 08bb125241012b6a1c1852efc6443cc7a4ebecc3

> 
> Diff: https://reviews.apache.org/r/47810/diff/
> 
> 
> Testing
> -------
> 
> Built entire Atlas project, ran all unit/integration tests.  No issues found.
> 
> 
> Thanks,
> 
> Jeff Hagelberg
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message