atlas-dev mailing list archives

From "Ashutosh Mestry (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ATLAS-1995) Performance of Entity Creation Can Be Improved By Using Index Query to Fetch Entity Using Unique Attributes
Date Thu, 27 Jul 2017 05:13:00 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102717#comment-16102717 ]

Ashutosh Mestry commented on ATLAS-1995:
----------------------------------------

Preliminary analysis of the implementation seems to show a 50% improvement. The improvement
depends on the amount of data present in the database.

For a database with little data, the two approaches yield comparable fetch times.

For a database with a large amount of data, the improvement is upwards of 50%.

> Performance of Entity Creation Can Be Improved By Using Index Query to Fetch Entity Using
Unique Attributes 
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: ATLAS-1995
>                 URL: https://issues.apache.org/jira/browse/ATLAS-1995
>             Project: Atlas
>          Issue Type: Improvement
>          Components:  atlas-core
>    Affects Versions: trunk, 0.8-incubating
>            Reporter: Ashutosh Mestry
>            Assignee: Ashutosh Mestry
>
> *Background*
> On profiling the entity creation flow, it was observed that several calls are made to _AtlasGraphUtilsV1.getVertexByUniqueAttributes_.
> These calls query the database using a graph query. There is potential to improve
this by using an index query instead.
> *Analysis*
> Upon experimentation, it was found that entity creation performance improves by 50%
when this method is replaced with an equivalent that uses _indexQuery_.
> Also, when a large number of entities are created (typically using _import_hive.sh_),
CPU usage on Atlas is reduced, because Solr does some of the work.
> *Implementation Guidance*
> * Add a new method, _AtlasGraphUtilsV1.getAtlasVertexFromIndexQuery_, that uses _AtlasGraphProvider.indexQuery_
to fetch vertices.
> * Ensure that the generated query is escaped appropriately.
> * Include logic to fall back to a graph query if the property being queried is not indexed.
> Since this is a high-impact change, it will be worthwhile to verify other dependent
modules.
>  
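The implementation guidance quoted above can be sketched roughly as follows. This is a minimal illustration, not the actual Atlas patch: the `IndexBackend` and `GraphBackend` interfaces, the class and method names, and the query string layout are all hypothetical stand-ins. The escaped character set is assumed to match the characters that the Lucene query parser treats as special.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: index query first, graph query as fallback.
class IndexQuerySketch {

    /** Hypothetical stand-in for an index backend such as Solr. */
    interface IndexBackend {
        Optional<String> queryForVertexId(String queryString);
    }

    /** Hypothetical stand-in for the slower graph query path. */
    interface GraphBackend {
        Optional<String> findVertexId(String property, String value);
    }

    // Characters assumed to be special in Lucene-style query syntax.
    private static final String SPECIAL_CHARS = "\\+-!():^[]\"{}~*?|&/";

    /** Prefixes every special character with a backslash. */
    static String escape(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 8);
        for (char c : value.toCharArray()) {
            if (SPECIAL_CHARS.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** Builds an illustrative query string; the real Atlas format may differ. */
    static String buildQuery(String property, String value) {
        return "\"" + property + "\":(" + escape(value) + ")";
    }

    /** Uses the index when the property is indexed, else the graph query. */
    static Optional<String> findVertexId(Set<String> indexedProperties,
                                         IndexBackend index,
                                         GraphBackend graph,
                                         String property,
                                         String value) {
        if (indexedProperties.contains(property)) {
            return index.queryForVertexId(buildQuery(property, value));
        }
        return graph.findVertexId(property, value);
    }
}
```

A caller would pass the set of indexed property names together with both backends. A lookup on a non-indexed property then goes through the graph query path without the caller having to know which path was taken.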



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
