heron-dev mailing list archives

From Karthik Ramasamy <kart...@streaml.io>
Subject Re: Proposal for Heron API Server
Date Tue, 25 Jul 2017 18:54:57 GMT
Bill - 

The main driving factors for the API server are the following:

- How can Heron jobs be managed purely through an API, instead of the CLI, without any client-side dependency?

- A single place for maintaining config, including keys (which you don’t want to expose to every client)

- Reduce the installation steps needed

- Provide authentication support

The rest of my responses are inline, marked with k>

> On Jul 25, 2017, at 10:15 AM, Bill Graham <billgraham@gmail.com> wrote:
> 
> It's not entirely accurate that Heron's deployment mode is only library
> mode. The Heron scheduler could be implemented to either manage resource
> scheduling from the client (i.e., Aurora Scheduler) or to run the
> scheduling logic on the scheduler framework (i.e., Yarn). ISchedulerClient
> has both LibrarySchedulerClient and HttpServiceSchedulerClient for these
> two use cases. These are the modes for scheduling single topologies
> components though, and not about managing a centralized scheduling service
> for multi-tenant usage which this proposal is about. Basically, we already
> have Library and Service modes as terminology in the existing codebase, so
> we shouldn't overload the concepts with a new definition of service mode.

k>I agree I might be overloading the terminology here. Whether the scheduler is run
in library mode or in the scheduler framework is independent of the API server. The API
server is not a scheduling service - it is just a REST endpoint server that translates
REST API calls into actions. Perhaps a change of terminology might make it easier to understand.
Any suggestions?
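
As a rough illustration of "a REST endpoint that translates REST API calls into actions", here is a minimal Python sketch. The route layout (/api/v1/topologies/...), the translate function, and the Action type are assumptions made up for this example, not the proposed API; only the five command names come from this thread.

```python
from typing import NamedTuple, Optional

# The five commands proposed for the first version of the API server.
COMMANDS = {"submit", "kill", "update", "activate", "deactivate"}


class Action(NamedTuple):
    command: str             # one of COMMANDS
    topology: Optional[str]  # None for submit (name would come from the payload)


def translate(method: str, path: str) -> Action:
    """Map a REST request onto a topology action.

    Hypothetical routes, assumed for illustration only:
      POST /api/v1/topologies                  -> submit
      POST /api/v1/topologies/<name>/<command> -> kill/update/activate/deactivate
    Nothing is remembered between calls, in line with the stateless design.
    """
    if method != "POST":
        raise ValueError(f"unsupported method: {method}")
    parts = [p for p in path.split("/") if p]
    if parts[:3] != ["api", "v1", "topologies"]:
        raise ValueError(f"unknown path: {path}")
    rest = parts[3:]
    if not rest:
        return Action("submit", None)
    if len(rest) == 2 and rest[1] in COMMANDS - {"submit"}:
        return Action(rest[1], rest[0])
    raise ValueError(f"unknown path: {path}")
```

Because the function is a pure mapping from request to action, any instance of such a server could handle any request, which is what would allow the scheduler to restart it or run several copies.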

> If config distribution is the main issue, have we explored adding support
> for fetching configs from a repository, just as we upload and fetch the
> binary?

k>We did explore this aspect of having the config in a central place. However, there are
issues with this approach:

- The Heron CLI has to download the config every time it submits/kills/activates/deactivates
a topology. Alternatively, the config can be cached, but that requires periodic invalidation
and refresh on the client side, which could lead to issues.

- All the keys and other sensitive settings could be exposed on the client (if you are working
with cloud environments)

- If we have to manage the jobs programmatically, including submission/killing/updating/activating/deactivating,
it introduces a dependency, such as downloading the config before submitting, which makes it
cumbersome for programmers.
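
To illustrate the caching concern in the first point, here is a minimal Python sketch of a client-side config cache with a time-to-live. The CachedConfig class, its parameters, and the config keys are hypothetical and exist only for this example; nothing like it is part of Heron.

```python
import time
from typing import Callable, Dict, Optional


class CachedConfig:
    """Client-side config cache with a time-to-live (illustrative only)."""

    def __init__(self, fetch: Callable[[], Dict[str, str]], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # downloads config from the central place
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the behavior can be tested
        self._value: Optional[Dict[str, str]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, str]:
        now = self._clock()
        # Refresh only when nothing is cached yet or the TTL has expired.
        if self._value is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value
```

Until the TTL expires, a client keeps using whatever it downloaded last, so a config change made centrally is not seen by every client at the same time.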

> 
> One concern about adding a scheduling service, is that it creates yet
> another service to be maintained, and it increases the matrix of modes of
> deployment available which adds complexity. For example today Aurora
> topologies can be submitted in local mode only, but they can be updated in
> local or service mode. YARN does both submit and update in service mode
> today. With this additional service, we would need to support those modes,
> plus those modes when run behind yet another service. The combination of
> modes gets complex because we now have anywhere from 0..2 potential layers of
> services to go through.

k>As pointed out above, this is not a scheduling service - it is just a REST endpoint.
The API service will be deployed as yet another job, similar to heron-ui and heron-tracker.
This service will be stateless, so the scheduler can simply restart it if it dies, which
makes it fault tolerant. We can also run multiple instances of the service for scalability.

Furthermore, the API server will preserve those deployment modes for Aurora and YARN - independent
of whether you deploy using API server or directly from the dev machine (like we have now).


> This approach also requires the design of a delegated auth mechanism. For
> example if the deploy service is running as a shared account, how will it
> delegate auth on behalf of the user who is deploying the topology? If we go
> down this path, we'd need to design for this.

k>As I mentioned earlier, one of the motivations for the API server is to implement some kind
of authentication - Kerberos/TLS/LDAP. However, the first phase will provide the core
functionality, followed by a second phase that adds an authentication mechanism.

> I also share Maosong's concern of merging the tracker into the api service.
> The design of the system will be more clear and easy to maintain/manage if
> each system could live independently. If the goal is to make it easier for
> administrators to manage all at once, I'd suggest we handle that with admin
> management scripts that could simplify common tasks without merging the
> service code.

k>In fact, I would argue the other way around, since the main focus of the API server
is to provide a REST API:

- Why not move all the APIs into one single service rather than having two?

- Furthermore, the current tracker uses the state manager for getting metadata etc. Since the
tracker is written in Python, the state manager functionality has to be duplicated in Python and Java.

With the API server, the plan is to write it in Java, so we can eliminate all the Python code
for the state manager, thereby reducing duplicated functionality across languages. Our initial
focus is to get this service rolled out with the first phase of the API
(submit/kill/update/activate/deactivate); in the second phase we can merge the tracker.

Note that introducing the API server does not change the current mode of deployment in any way.

cheers
/karthik

> On Mon, Jul 24, 2017 at 6:27 PM, Karthik Ramasamy <karthik@streaml.io>
> wrote:
> 
>> 1st version of the api server will support the following commands
>> 
>> - submit
>> - kill
>> - update
>> - activate
>> - deactivate
>> 
>> We are designing API server to be stateless and it will run as a job in the
>> scheduler (similar to tracker and UI). With this approach, there is no need
>> to worry about availability issues.
>> 
>> cheers
>> /karthik
>> 
>> On Mon, Jul 24, 2017 at 5:43 PM, Fu Maosong <maosongfu@gmail.com> wrote:
>> 
>>> I like the idea of *service mode* for heron.
>>> 
>>> But we need to be more cautious about merging tracker into API Server,
>>> since it can easily bring scalability and availability issues.
>>> BTW, storm's nimbus serves both topology management requests as well as
>>> metrics requests, which is kind of "merging tracker into API server". We
>>> can learn the pros&cons of such design from it.
>>> 
>>> 
>>> 2017-07-24 16:57 GMT-07:00 Karthik Ramasamy <karthik@streaml.io>:
>>> 
>>>> *Rationale*:
>>>> 
>>>> Currently, Heron supports a single mode of deployment called library mode.
>>>> Library mode requires several steps and client side configuration which
>>>> could be intensive. Hence, we want to support another mode called service
>>>> mode for simplified deployment.
>>>> 
>>>> *Library Mode:*
>>>> 
>>>> With Heron, the current mode of deployment is called library mode. This
>>>> mode does not require any services running for Heron to deploy, which is a
>>>> huge advantage. However, it requires a good deal of configuration on the
>>>> client side. Because of this, administration becomes harder - especially
>>>> maintaining the configuration and distributing it when the configuration
>>>> is changed. While this is possible for bigger teams with a dedicated
>>>> dev-ops team, it might be overhead for medium and smaller teams.
>>>> Furthermore, this mode of deployment does not have an API to
>>>> submit/kill/activate/deactivate programmatically.
>>>> 
>>>> *Service Mode:*
>>>> 
>>>> In this mode, an API server will be running as a service. This service will
>>>> be run as yet another job in the scheduler so that it will be restarted
>>>> during machine and process failures, thereby providing fault tolerance. This
>>>> API server will maintain the configuration, and the heron CLI will be augmented
>>>> to use the REST API to submit/kill/activate/deactivate the topologies in
>>>> this mode. The advantage of this mode is that it simplifies deployment, but
>>>> it requires running a service.
>>>> 
>>>> *Merging Tracker into API Server:*
>>>> 
>>>> Currently, the Heron tracker, written in Python, duplicates the state manager
>>>> code in Python as well. The API server will support the Heron tracker API in
>>>> addition to the topologies API. Depending on the mode of deployment, the
>>>> API server can be deployed in one of two modes: library mode (which
>>>> exposes only the tracker API) and service mode (which exposes both the
>>>> tracker API and the topologies API). Initially, the tracker and API server will be in
>>>> separate directories until a great amount of testing is done. Once that is
>>>> completed, we can think about cutting over to using the API server entirely.
>>>> 
>>>> This change will not affect any of the existing deployments and it will be
>>>> backward compatible.
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> With my best Regards
>>> ------------------
>>> Fu Maosong
>>> Twitter Inc.
>>> Mobile: +001-415-244-7520
>>> 
>> 

