incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: S4-0.6.0 and Hadoop Yarn
Date Thu, 04 Apr 2013 09:26:57 GMT
Hi,

Note that S4 0.5 was a complete refactoring, so its main objective was to provide a
functional implementation. That left room for improvement, and the focus of the 0.6 release
was on performance and usability.

Most performance improvements in S4 0.6 come from:
- adding metrics to identify bottlenecks
- improving serialization and deserialization
- minimizing buffer copies (and pressure on the garbage collector)
- leveraging multithreading and async processing, notably by updating Netty pipelines
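
To illustrate the buffer-copy point above in generic terms, here is a minimal Java sketch. It is not taken from the S4 code base: the class and method names (PayloadViews, copyPayload, viewPayload) and the "HDR:" frame layout are hypothetical, and only java.nio from the JDK is used. It contrasts copying a message payload into a new array with handing out a read-only view over the same bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: none of these names come from S4.
final class PayloadViews {

    // Copying approach: allocates a fresh array for every message,
    // which adds work for the garbage collector under high throughput.
    static byte[] copyPayload(byte[] frame, int offset, int length) {
        byte[] out = new byte[length];
        System.arraycopy(frame, offset, out, 0, length);
        return out;
    }

    // View approach: a read-only buffer that shares the frame's backing
    // array, so no payload bytes are copied.
    static ByteBuffer viewPayload(byte[] frame, int offset, int length) {
        return ByteBuffer.wrap(frame, offset, length).slice().asReadOnlyBuffer();
    }

    public static void main(String[] args) {
        // Hypothetical frame: a 4-byte header followed by the payload.
        byte[] frame = "HDR:hello".getBytes(StandardCharsets.US_ASCII);
        ByteBuffer view = viewPayload(frame, 4, 5);
        System.out.println(StandardCharsets.US_ASCII.decode(view.duplicate()));
    }
}
```

The view saves one allocation and one copy per message, at the cost that the underlying frame must not be reused or modified while the view is still in use.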

Regards,

Matthieu 




On Apr 4, 2013, at 07:01 , Siddharth wrote:

> Hi - Can the development team highlight the exact solution/fix that made it possible
> for the 0.6 release to be so fast compared to the earlier release?
>  
> Thanks in advance,
> Siddharth
>  
> From: Matthieu Morel [mailto:mmorel@apache.org] 
> Sent: Wednesday, April 03, 2013 3:02 PM
> To: s4-user@incubator.apache.org
> Subject: Re: S4-0.6.0 and Hadoop Yarn
>  
> On Apr 2, 2013, at 19:46 , Jeryl Cook wrote:
> 
> 
> "handle 200K+ messages per sec", in one instance? Or do you mean clustered?
>  
> This is for processing small events injected into 1 stream on 1 node. By using more streams
> and more nodes, the overall throughput can get considerably higher.
>  
> Note that this is a baseline with a basic PE graph (1 injector and 1 PE prototype), and
> performance in practice will be impacted by the complexity of the application and the nature
> of the processing, the hardware and allocated resources, the size and complexity of messages,
> etc.
>  
> A benchmarking framework is included in the distribution, so you can reproduce the experiments.
>  
> Regards,
>  
> Matthieu 
>  
>  
>>  
>> 
>> On Mon, Apr 1, 2013 at 10:42 PM, JiHyoun Park <april3@gmail.com> wrote:
>> Hi
>> 
>> I am testing the newest release of S4.
>> It's fantastic that the stream throughput of S4 0.6.0 has been improved to handle
>> 200K+ messages per sec.!
>> However, it seems that S4-25 branch - deploying S4 applications with Yarn - is not
>> included in the 0.6.0 package yet.
>> I already built a system to run S4 applications on Yarn and want to migrate its S4
>> framework from 0.5.0 to 0.6.0.
>> How can I use the 'deploying S4 applications with Yarn' feature on S4 0.6.0?
>> 
>> Best Regards
>> Jihyoun
>> 
>> 
>> 
>> -- 
>> Jeryl Cook
>> Founder & Chief Executive Officer
>> VanitySoft, Inc.
>> A Geo Business Intelligence Technology Consulting Firm
>> www.vanity-soft.com
>> www.linkedin.com/in/jerylcook
>> Get answers to "who knew what, when, and where"... and everything in between.
>> 
> 
>  

