incubator-s4-dev mailing list archives

From "kishore gopalakrishna (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (S4-10) Dynamic s4 cluster management:S4 as a service
Date Mon, 05 Dec 2011 08:31:39 GMT

    [ https://issues.apache.org/jira/browse/S4-10?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162671#comment-13162671 ]

kishore gopalakrishna commented on S4-10:
-----------------------------------------

Flavio, support for adding ad hoc queries is part of cluster management and is a valuable feature
for S4 as a service, but as you said it needs a separate discussion and its own issue for design
and implementation. Consider counting the number of tweets that contain the word "hadoop".
Instead of writing one App and deploying it for every word, one should be able to add something
like "select count(*) from twit_stream where message like '%hadoop%'". The cluster management
admin interface must support this feature.
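
To make the query above concrete, here is a minimal sketch of what it would compute over a
stream. It is plain Python rather than S4 code; like_to_regex, count_like and the sample events
are hypothetical names made up for illustration, and matching is assumed to be case-sensitive.

```python
import re


def like_to_regex(pattern):
    """Translate a SQL LIKE pattern (% and _ wildcards) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def count_like(events, field, pattern):
    """Count the events whose `field` matches the LIKE pattern as a whole."""
    matcher = like_to_regex(pattern)
    return sum(1 for e in events if matcher.fullmatch(e.get(field, "")))


# Hypothetical sample of the twit_stream from the query above.
twit_stream = [
    {"message": "learning hadoop today"},
    {"message": "s4 is a stream processor"},
    {"message": "hadoop and s4 together"},
]

# select count(*) from twit_stream where message like '%hadoop%'
hadoop_count = count_like(twit_stream, "message", "%hadoop%")
```

The point is that only the pattern changes from word to word, so a single generic query
operator could replace one App per word.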

The third point caters to the use case where one need not always deploy an App for trivial
things like count, max, avg, etc. We might already provide some basic PEs within S4, so one
can simply say "add CountPE to twit_stream" and we should somehow construct an App with these
parameters and start it in every S4 container.
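
A rough sketch of the "add CountPE to twit_stream" idea, assuming a registry of built-in PEs
keyed by name. Everything here (BUILTIN_PES, Container, attach, the dict used as an "app") is
hypothetical and only illustrates the shape of the feature; it is not the S4 API.

```python
class CountPE:
    """Counts every event it receives."""

    def __init__(self, field=None):
        self.result = 0

    def on_event(self, event):
        self.result += 1


class MaxPE:
    """Tracks the largest value seen for one numeric field."""

    def __init__(self, field):
        self.field = field
        self.result = None

    def on_event(self, event):
        value = event[self.field]
        if self.result is None or value > self.result:
            self.result = value


# Hypothetical registry of PEs that would ship with the platform.
BUILTIN_PES = {"CountPE": CountPE, "MaxPE": MaxPE}


class Container:
    """Stand-in for one node; holds the apps started on it."""

    def __init__(self, name):
        self.name = name
        self.apps = []

    def dispatch(self, stream, event):
        # Deliver the event to every app attached to this stream.
        for app in self.apps:
            if app["stream"] == stream:
                app["pe"].on_event(event)


def attach(pe_name, stream, containers, field=None):
    """Build an app from parameters only and start it in every container."""
    pe_class = BUILTIN_PES[pe_name]  # raises KeyError for an unknown PE name
    for container in containers:
        container.apps.append({"stream": stream, "pe": pe_class(field)})


containers = [Container("node-1"), Container("node-2")]
attach("CountPE", "twit_stream", containers)

# Spread three events across the two containers, round-robin.
for i, text in enumerate(["a", "b", "c"]):
    containers[i % 2].dispatch("twit_stream", {"message": text})

counts = [c.apps[0]["pe"].result for c in containers]
```

Each container keeps its own partial count here; how partial results get merged is left open,
like the rest of the design.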

Does it make sense?



                
> Dynamic s4 cluster management:S4 as a service
> ---------------------------------------------
>
>                 Key: S4-10
>                 URL: https://issues.apache.org/jira/browse/S4-10
>             Project: Apache S4
>          Issue Type: New Feature
>            Reporter: kishore gopalakrishna
>             Fix For: 0.5
>
>
> One of the features that we will definitely need in order to offer S4 as a service is the
> ability to perform the following operations without bringing down the cluster:
> * Deploy/undeploy applications. The high-level goal is to be able to provide a URI that
>   points to the application jar and deploy it to all nodes in the S4 cluster.
> * Configure/add new streams in the system. This will basically specify how the stream is
>   partitioned. Currently each stream is partitioned based on the number of nodes, but this
>   may not always be desirable.
> * Configure/add Processing Elements. Basically, attach a PE to a stream without writing
>   any custom code.
> * Add queries. One should be able to submit a query for common tasks like count/avg/min/max.
>   This can eventually be expanded to support UDFs.
> Will use this as a high-level JIRA to offer S4 as a service. Only features will be added
> to this JIRA. For each feature we can create a child JIRA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
