atlas-dev mailing list archives

From "venkata madugundu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ATLAS-683) Refactor local type-system cache with cache provider interface
Date Wed, 27 Apr 2016 17:17:12 GMT

    [ https://issues.apache.org/jira/browse/ATLAS-683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260515#comment-15260515
] 

venkata madugundu commented on ATLAS-683:
-----------------------------------------

[~yhemanth] Hi Hemanth, I have added you as the initial reviewer for this JIRA on the
reviewboard.

The patch does the following:

- Introduces a new interface org.apache.atlas.typesystem.types.cache.ITypeCacheProvider.
- The default implementation, 'DefaultTypeCacheProvider', mirrors the existing in-memory
cache implementation.
- A new property 'atlas.typesystem.cache.provider' has been added to atlas-application.properties,
to allow dynamic cache providers to be registered.
- Unit tests for DefaultTypeCacheProvider have been added. All other impacted tests have been updated.
- The TypeSystem class has been modified to use an instance of ITypeCacheProvider.
- An instance of ITypeCacheProvider is bound dynamically through Guice (Guice has no dynamic
binding as such, so the patch simulates it).
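
To make the shape of the change easier to picture, here is a minimal sketch of a cache provider
interface, an in-memory default, and a loader driven by the 'atlas.typesystem.cache.provider'
property. Everything except the property name is an assumption made for illustration: DataType,
TypeCacheProvider, InMemoryTypeCacheProvider, TypeCacheProviders and all method names are
hypothetical stand-ins, not the signatures in the patch, and plain reflection is used here
in place of the Guice binding.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for Atlas's IDataType; only the type name matters here.
interface DataType {
    String getName();
}

// Hypothetical provider contract; method names are illustrative only.
interface TypeCacheProvider {
    boolean has(String typeName);
    DataType get(String typeName);
    void put(DataType type);
    void remove(String typeName);
    Collection<String> getAllTypeNames();
    void clear();
}

// In-process default: a thread-safe map keyed by type name.
class InMemoryTypeCacheProvider implements TypeCacheProvider {
    private final Map<String, DataType> types = new ConcurrentHashMap<>();

    public boolean has(String typeName) { return types.containsKey(typeName); }
    public DataType get(String typeName) { return types.get(typeName); }
    public void put(DataType type) { types.put(type.getName(), type); }
    public void remove(String typeName) { types.remove(typeName); }
    public Collection<String> getAllTypeNames() { return new ArrayList<>(types.keySet()); }
    public void clear() { types.clear(); }
}

class TypeCacheProviders {
    static final String PROVIDER_PROPERTY = "atlas.typesystem.cache.provider";

    // Returns the configured provider, or the in-memory default when the
    // property is absent or blank. The class needs a no-argument constructor.
    static TypeCacheProvider fromProperties(Properties props) {
        String className = props.getProperty(PROVIDER_PROPERTY);
        if (className == null || className.trim().isEmpty()) {
            return new InMemoryTypeCacheProvider();
        }
        try {
            return Class.forName(className.trim())
                    .asSubclass(TypeCacheProvider.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot create cache provider: " + className, e);
        }
    }
}

class TypeCacheProviderSketch {
    public static void main(String[] args) {
        TypeCacheProvider cache = TypeCacheProviders.fromProperties(new Properties());
        cache.put(() -> "Person");
        System.out.println(cache.has("Person"));
    }
}
```

With a contract like this, TypeSystem would only talk to the interface, so a distributed
implementation could be swapped in through configuration alone.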

For testing, I ran all tests from Maven, ran quick_start.py, and tested the UI to some extent.
I found no issues in the log.

NOTE - To validate the 'completeness' of the 'ITypeCacheProvider' interface, I experimented
with another implementation of ITypeCacheProvider based on a Redis cache. The Redis integration
needs a few more utility APIs to serialize/deserialize IDataType subtypes such as ClassType,
StructType, TraitType, and EnumType. I have not included those changes in this patch.


> Refactor local type-system cache with cache provider interface
> --------------------------------------------------------------
>
>                 Key: ATLAS-683
>                 URL: https://issues.apache.org/jira/browse/ATLAS-683
>             Project: Atlas
>          Issue Type: Sub-task
>    Affects Versions: 0.7-incubating
>            Reporter: venkata madugundu
>            Assignee: venkata madugundu
>            Priority: Critical
>              Labels: high-availability, performance, scalability
>             Fix For: 0.7-incubating
>
>         Attachments: ATLAS-683.patch
>
>
> As noted in ATLAS-488, the local type-system cache makes the Atlas runtime stateful and prevents
multiple Atlas instances from being active in a cluster. Either the type-cache should be synchronized
across Atlas instances (on all type create/update requests) or the type-cache should be moved
out of Atlas to something like a distributed cache. 
> 1. As a first step, the local type-cache code in TypeSystem.java can be carved out
into an interface such as TypeCacheProvider (whose default implementation for a standalone
Atlas server would just use an in-process local cache). The cache provider implementation itself
could be specified as an optional configuration property. Expert users of Atlas can choose
to inject a custom cache provider, which would typically be backed by a distributed cache. We are
evaluating the use of a distributed cache. 
> 2. As a second step, some more refactoring can be done to minimize/optimize the calls
made to TypeSystem for type lookup queries. Essentially, in a given transaction/request, once
a type lookup is done, it should not be queried again. A request-scoped variable (Guice
would probably help with that scoping) can hold all the lookups made in a request. This might
sound like a cache of a cache, but I think it should help in reducing the hits to the cache provider
if the provider is hitting a remote cache.
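
The request-scoped lookup described in step 2 could look roughly like the sketch below. It is
only an illustration under assumptions: RequestScopedTypeLookup is a hypothetical name, the
provider is modelled as a plain function from type name to type, and one instance is assumed
to be created per request and used from a single thread (the Guice scoping mentioned above is
not shown).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-request memo placed in front of a (possibly remote) cache provider.
// Create one instance per request and discard it when the request ends.
class RequestScopedTypeLookup<T> {
    private final Function<String, T> providerLookup;
    private final Map<String, T> seen = new HashMap<>();

    RequestScopedTypeLookup(Function<String, T> providerLookup) {
        this.providerLookup = providerLookup;
    }

    // Asks the provider at most once per type name within this request.
    // Misses (null results) are remembered too, so they are not re-queried.
    T get(String typeName) {
        if (seen.containsKey(typeName)) {
            return seen.get(typeName);
        }
        T type = providerLookup.apply(typeName);
        seen.put(typeName, type);
        return type;
    }
}

class RequestScopedLookupDemo {
    public static void main(String[] args) {
        int[] providerCalls = {0};
        RequestScopedTypeLookup<String> lookup = new RequestScopedTypeLookup<>(name -> {
            providerCalls[0]++;              // stands in for a remote cache hit
            return "type:" + name;
        });
        lookup.get("Person");
        lookup.get("Person");                // served from the per-request map
        System.out.println("provider calls: " + providerCalls[0]);
    }
}
```

Because the map lives only as long as the request, a type updated by another Atlas instance
is picked up again on the next request rather than being held indefinitely.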



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
