incubator-s4-user mailing list archives

From "Kaiser Md. Nahiduzzaman" <kaiserna...@gmail.com>
Subject Re: S4-Piper: Scalability in input adapter
Date Thu, 11 Oct 2012 17:34:34 GMT
Hi Kishore,
Thank you so much for your prompt reply.

Actually, I am able to pull events fast enough for Twitter. But I was
thinking of other applications, for example video streams, where there
could be more than one stream. In that case, a single adapter node
processing all the video streams might become a bottleneck. I asked
about the input adapter using the Twitter example only to better
understand how to scale input adapters.

"Start multiple adapters, but in each adapter after getting the top
level status, hash it on userid and filter it accordingly.  For
example, if you have 2 adapters, each adapter filters 50% of the
messages based on user id."
-- Even in this case, each input adapter still receives every top-level
status. With Twitter the incoming data is not very large, but where the
input data itself can be very large, is there any way to distribute the
input data itself?
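
For reference, the hash-and-filter scheme quoted above could be sketched
roughly as follows. This is only an illustration in plain Java, not S4 or
TwitterInputAdapter code: the class name AdapterPartitionFilter, the
adapterIndex and adapterCount parameters, and the accepts method are all
hypothetical, and it assumes each adapter is told its own index and the
total number of adapters. Every adapter still receives every status; the
filter only decides which ones it forwards.

```java
public class AdapterPartitionFilter {
    // Hypothetical configuration: which adapter this is, and how many exist.
    private final int adapterIndex;
    private final int adapterCount;

    public AdapterPartitionFilter(int adapterIndex, int adapterCount) {
        if (adapterCount <= 0 || adapterIndex < 0 || adapterIndex >= adapterCount) {
            throw new IllegalArgumentException("adapterIndex must be in [0, adapterCount)");
        }
        this.adapterIndex = adapterIndex;
        this.adapterCount = adapterCount;
    }

    // Returns true if this adapter should forward a status from the given user.
    // floorMod keeps the bucket non-negative even when the hash is negative.
    public boolean accepts(long userId) {
        int bucket = Math.floorMod(Long.hashCode(userId), adapterCount);
        return bucket == adapterIndex;
    }

    public static void main(String[] args) {
        int adapterCount = 2;
        long[] sampleUserIds = {42L, 1001L, 987654321L, -5L};
        for (long userId : sampleUserIds) {
            for (int i = 0; i < adapterCount; i++) {
                AdapterPartitionFilter filter = new AdapterPartitionFilter(i, adapterCount);
                if (filter.accepts(userId)) {
                    System.out.println("user " + userId + " is kept by adapter " + i);
                }
            }
        }
    }
}
```

With this kind of filter, each user id is kept by exactly one adapter, so
the downstream load is split, but the cost of receiving the raw input is
not reduced.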

Thanks,
Kaiser

On Thu, Oct 11, 2012 at 10:14 AM, kishore g <g.kishore@gmail.com> wrote:
> Hi Kaiser,
>
> Can you give more information as to why you need to scale
> TwitterInputAdapter? Are you not able to pull events fast enough?
>
> Can you explain how you plan to scale this? The reason I ask is that
> Twitter provides only one stream, and it is not partitioned. The way to
> scale is https://dev.twitter.com/docs/streaming-apis/processing#Scaling.
> This is pretty much what the TwitterInputAdapter is doing: it simply
> delegates the work to the AppNodes. So in theory, you should be good
> with one TwitterInputAdapter. If this does not work, then you can try
> the following.
>
> Start multiple adapters, but in each adapter after getting the top
> level status, hash it on userid and filter it accordingly.  For
> example, if you have 2 adapters, each adapter filters 50% of the
> messages based on user id.
>
> If you can give us additional information on what you plan to do and
> some numbers, we will be able to provide better instructions on how to
> solve it with S4.
>
> thanks,
> Kishore G
>
>
>
> On Thu, Oct 11, 2012 at 9:41 AM, Kaiser Md. Nahiduzzaman
> <kaisernahid@gmail.com> wrote:
>> Hi,
>> The S4-piper overview says "Since adapters are also S4 applications,
>> they can be scaled easily."
>> I was wondering how to do that. For example, if I create more than one
>> instance of the HelloInputAdapter, will the input stream
>> automatically be divided among the adapters, as it is for streams
>> arriving at multiple HelloApp nodes?
>> Even if that is possible for HelloInputAdapter, how would you do that
>> for TwitterInputAdapter, i.e., how do you provide scalability to
>> TwitterInputAdapter?
>>
>> Thanks in advance,
>> Kaiser
