atlas-dev mailing list archives

From "Venkata R Madugundu" <venkataraman...@in.ibm.com>
Subject Re: Atlas service scalability
Date Mon, 25 Jan 2016 12:21:00 GMT
Hi Hemanth,

Thanks very much for that quick reply.

> Are you thinking of horizontal scale for the Atlas server because
> of the load of Metadata queries from clients? Could you share your
> use case in some more detail?

You are right. We expect the volume of metadata persisted and accessed by
clients to be very large (and growing), hence the requirement for
horizontal distribution of the Atlas runtime.

The concrete use cases we are looking at are:

a) Cloud deployment for a multi-tenant metadata repository. A single
metadata repository would be shared by multiple tenants, but with metadata
isolation per tenant. In other words, the metadata repository is seen more
or less as a regular database (or rather an 'object' database) for
analytics applications on the cloud.

b) On-prem deployment for large enterprises that want to consolidate
enterprise-wide metadata into a single metadata repository. The metadata
aggregation is achieved through periodic (daily) synchronization of
metadata using long-running metadata imports.

> Given the shared state implementation we have, there are no
> workarounds immediately, I am afraid.

OK. I am assuming there is some sort of optimization (likely caching) done
in the Atlas code, and that the resulting statefulness of Atlas is what
prevents multiple active instances.

Thanks
Venkata Madugundu


Hemanth Yamijala <hyamijala@hortonworks.com> wrote on 25/01/2016 05:18:26 PM:

> From: Hemanth Yamijala <hyamijala@hortonworks.com>
> To: "dev@atlas.incubator.apache.org" <dev@atlas.incubator.apache.org>
> Date: 25/01/16 05:18 PM
> Subject: Re: Atlas service scalability
>
> Venkata,
>
> Answers inline.
>
> Thanks
> Hemanth
>
> On 1/25/16, 4:56 PM, "Venkata R Madugundu" <venkataramana.m@in.ibm.com> wrote:
>
> >
> >
> >Hi,
> >
> >We are evaluating the ways in which Atlas can scale. I am looking for a
> >way in which I can run multiple instances of the Atlas service/runtime for
> >horizontal scale (an instance running per node in a cluster). Is this
> >possible currently?
>
> As you note below, the Atlas service is not currently horizontally
> scalable. However, the backends that store the metadata (HBase,
> Solr, etc.) can be configured to achieve horizontal scale.
>
> >
> >I went through the Atlas page on 'Fault tolerance and HA'. It clearly says
> >"Currently, the Atlas Web service has a limitation that it can only have
> >one active instance at a time".  I also went through one post on the
> >mailing list related to this with the subject 'Atlas with hbase/solr/kafka'
> >which seems to say there can be only one active instance of Atlas service
> >at any given point of time with the current state of Atlas implementation.
> >
> >So, can you please confirm that my understanding is correct, that for now I
> >can only run one active Atlas instance at any given point of time (in other
> >words, multiple instances of the Atlas service cannot be running simultaneously
> >in a cluster to achieve horizontal scale)? If this is correct, are there
> >any interim workarounds that one can make use of?
>
> Given the shared state implementation we have, there are no
> workarounds immediately, I am afraid. Are you thinking of horizontal
> scale for the Atlas server because of the load of Metadata queries
> from clients? Could you share your use case in some more detail?
>
> >
> >Thanks
> >Venkata Madugundu
