incubator-s4-user mailing list archives

From Matthieu Morel <mmo...@apache.org>
Subject Re: About 200,000 events/s
Date Thu, 27 Jun 2013 16:58:26 GMT
Hi,

The settings we used for the benchmarks are set in the scripts of the s4-benchmarks subproject.

You probably don't need 4GB of memory per node though.

High CPU usage is usually a good sign when saturating the system: it means the system is using all available resources. What is important to check is that most of the CPU is consumed by user threads rather than by the kernel (rule of thumb: kernel usage < 10%).
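One way to illustrate the user-vs-kernel check: on Linux the cumulative CPU times live in /proc/stat, and the shares over an interval come from the deltas between two samples. The sketch below uses made-up sample lines for illustration; on a real node you would read /proc/stat a few seconds apart.

```java
// Sketch: estimating user vs. kernel CPU share from two /proc/stat "cpu" samples.
// The sample lines in main() are invented values, purely for illustration.
public class CpuShare {

    // Fields of a "cpu" line: user nice system idle iowait irq softirq steal ...
    static long[] parse(String cpuLine) {
        String[] parts = cpuLine.trim().split("\\s+");
        long[] v = new long[parts.length - 1];
        for (int i = 1; i < parts.length; i++) {
            v[i - 1] = Long.parseLong(parts[i]);
        }
        return v;
    }

    // Returns {userShare, kernelShare} over the interval between two samples.
    static double[] shares(String before, String after) {
        long[] a = parse(before), b = parse(after);
        long total = 0;
        for (int i = 0; i < a.length; i++) {
            total += b[i] - a[i];
        }
        long user = (b[0] - a[0]) + (b[1] - a[1]); // user + nice time
        long kernel = b[2] - a[2];                 // system (kernel) time
        return new double[] { (double) user / total, (double) kernel / total };
    }

    public static void main(String[] args) {
        String before = "cpu 1000 0 100 5000 50 0 10 0";
        String after  = "cpu 9000 0 400 5400 80 0 20 0";
        double[] s = shares(before, after);
        // user=92% kernel=3%: a healthy saturated node per the rule of thumb above
        System.out.printf("user=%.0f%% kernel=%.0f%%%n", s[0] * 100, s[1] * 100);
    }
}
```

A kernel share above ~10% would suggest the node is spending its cycles on I/O or context switching rather than on the application.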

Note that throughput figures and improvements are only really meaningful in the context of a given app and workload anyway.

Cheers,

Matthieu




On Jun 27, 2013, at 06:55 , Sky Zhao <sky.zhao@ericsson.com> wrote:

> Hi Matthieu, can you share the test report for 200,000 events/s
>  
> e.g.
>  
> Server: 2? Mem: 4G?
> Node: 4
>  
>  
> Running cost:
> Cpu ?%, mem ?%
>  
> I will refer to this and compare against it; thanks in advance.
> From: Sky Zhao [mailto:sky.zhao@ericsson.com] 
> Sent: Wednesday, June 26, 2013 3:47 PM
> To: 's4-user@incubator.apache.org'
> Subject: RE: About 200,000 events/s
>  
> Also I noticed the CPU is very high, almost 100%, but mem < 10%. Does it still indicate too many IO operations, or are there other causes?
>  
>  
> /Sky
>  
> From: Sky Zhao [mailto:sky.zhao@ericsson.com] 
> Sent: Wednesday, June 26, 2013 10:09 AM
> To: 's4-user@incubator.apache.org'
> Subject: RE: About 200,000 events/s
>  
> Thanks Matthieu for the very careful suggestions; your direction is right.
>  
> The main reason was serialization/deserialization problems:
> 1) I changed DataEvent (which extends Event and is a JavaBean) into Event, using the default Event to send S4 events in the adapter
> 2) In the S4 app, I create a new DataEvent in memory and put the stream into the next PE
>  
> Then performance improved a lot, up to 200,000 events/20s maximum. I think there is still room to improve; I will keep checking. Thanks Matthieu.
>  
>  
>  
> From: Matthieu Morel [mailto:mmorel@apache.org] 
> Sent: Tuesday, June 25, 2013 11:44 PM
> To: s4-user@incubator.apache.org
> Subject: Re: About 200,000 events/s
>  
> From what I see in your code, the problem might be in the definition of keys. In your GenKeyFinder, you use the event timestamp in the key, and therefore you might be creating a new PE instance for every single event! (unless the timestamp you set is somehow repeated, which sounds peculiar).
>  
> I would suggest modifying the keyfinder in a more suitable way, probably by removing the timestamp from the key. In the twitter example for instance, the key is the topic of the tweet, and we aggregate counts by topic.
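As a sketch of that suggestion, here is a keyfinder keyed on (kpi_name, mo_name) only. DataEvent below is a minimal stand-in for the real class from later in the thread (which extends Event); only the getters used by the key are included.

```java
// Sketch: a keyfinder without the timestamp, so events with the same
// (kpi_name, mo_name) pair are routed to the same PE instance and PEs
// actually get reused instead of being created per event.
import java.util.Collections;
import java.util.List;

public class StableKeyFinderSketch {

    // Minimal stand-in for the real DataEvent (an assumption, not the S4 class).
    static class DataEvent {
        final String kpiName, moName;
        DataEvent(String kpiName, String moName) {
            this.kpiName = kpiName;
            this.moName = moName;
        }
        String getKpi_name() { return kpiName; }
        String getMo_name() { return moName; }
    }

    static List<String> key(DataEvent event) {
        return Collections.singletonList(event.getKpi_name() + ":" + event.getMo_name());
    }

    public static void main(String[] args) {
        DataEvent e1 = new DataEvent("latency", "cell-1");
        DataEvent e2 = new DataEvent("latency", "cell-1");
        // Same key regardless of when each event was produced.
        System.out.println(key(e1).equals(key(e2))); // prints "true"
    }
}
```

With the timestamp removed, the number of PE instances is bounded by the number of distinct (kpi, mo) pairs rather than by the number of events.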
>  
> Hope this helps,
>  
> Matthieu
>  
> On Jun 25, 2013, at 14:39 , Sky Zhao <sky.zhao@ericsson.com> wrote:
>  
> 
> So I can only guess:
> Serialization/deserialization costs much time and occupies some of the limited memory.
> Once serialization/deserialization reaches its upper bound, it occupies much memory and events start to be blocked. So with more memory in the JVM and less (de)serialization, throughput could generally be higher.
>  
>  
> From: Sky Zhao [mailto:sky.zhao@ericsson.com] 
> Sent: Tuesday, June 25, 2013 8:15 PM
> To: 's4-user@incubator.apache.org'
> Subject: RE: About 200,000 events/s
>  
>  
> I use 4 GB of memory to handle the events, so I feel S4 eats a lot of memory, CPUs, and servers.
> How many servers, and how much CPU and memory, are needed to handle 200,000 events/s?
>  
>  
> /Sky
>  
>  
> From: Sky Zhao [mailto:sky.zhao@ericsson.com] 
> Sent: Tuesday, June 25, 2013 7:08 PM
> To: 's4-user@incubator.apache.org'
> Subject: RE: About 200,000 events/s
>  
> I list my code here. In my app I created 3 PEs (the logic is very simple, just emitting streams):
>  
>  
>     @Override
>     protected void onInit() {
>
>         CsvReporter.enable(new File(mpath), 20, TimeUnit.SECONDS);
>
>         // create a prototype
>         EntryPE entryPE = createPE(EntryPE.class);
>
>         createInputStream("seaRawStream", new KeyFinder<Event>() {
>
>             @Override
>             public List<String> get(Event event) {
>                 return Arrays.asList(new String[] { event.get("seadata") });
>             }
>         }, entryPE);
>
>         ResultPE resultPE = createPE(ResultPE.class);
>         Stream<DataEvent> processStream = createStream("Process Stream", new GenKeyFinder(), resultPE);
>         processStream.setParallelism(Integer.parseInt(thread));
>
>         ProcessPE processPE = createPE(ProcessPE.class);
>         processPE.setDataStream(processStream);
>
>         Stream<DataEvent> entryStream = createStream("Entry Stream", new GenKeyFinder(), processPE);
>         entryStream.setParallelism(Integer.parseInt(thread));
>
>         entryPE.setStreams(entryStream);
>     }
>  
>  
>  
> The event data is:
>
>     String kpi_name;
>     String mo_name;
>     double kpi_value;
>     long timestamp;
>
>     public DataEvent() {
>     }
>
>     public DataEvent(String kpi_name, String mo_name, double kpi_value, long timestamp) {
>         this.kpi_name = kpi_name;
>         this.mo_name = mo_name;
>         this.kpi_value = kpi_value;
>         this.timestamp = timestamp;
>     }
>
> ….
> Get/set methods
>  
>  
> Keyfinder:
>  
> public class GenKeyFinder implements KeyFinder<DataEvent> {
>
>     public List<String> get(DataEvent event) {
>
>         List<String> results = new ArrayList<String>();
>
>         /* Retrieve the kpi_name, mo_name and add them to the list. */
>         results.add(event.getKpi_name() + ":" + event.getMo_name() + ":" + String.valueOf(event.getTimestamp()));
>
>         return results;
>     }
> }
>  
>  
>  
> ===== adapter part code
>
>     // sending to s4
>     DataEvent event = new DataEvent(kpi_name, mo_name, kpi_value, timestamp);
>     event.put("seadata", String.class, kpi_name + ":" + "mo_name" + ":" + timestamp);
>     rstream.put(event);
>  
>  
> Does the keyfinder or the event key impact event sending/receiving?
> From: Matthieu Morel [mailto:mmorel@apache.org] 
> Sent: Tuesday, June 25, 2013 5:03 PM
> To: s4-user@incubator.apache.org
> Subject: Re: About 200,000 events/s
>  
> Not sure what the issue is in your setting. Performance issues in stream processing can be related to I/O or GC. But the app design can have a dramatic impact as well. Have a look at your CPU usage too.
>  
> You might want to profile the app and adapter to identify the culprit in your application.

>  
> Another path to explore is related to the content of events. Serialization/deserialization may be costly for complex objects. What kind of data structure are you keeping in the events?
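To make the cost concrete: a flat event with a few primitive fields serializes to far fewer bytes than one carrying a nested structure. S4 itself uses Kryo, but plain JDK serialization (an assumption for this sketch, chosen so it runs without extra dependencies) is enough to show the difference:

```java
// Sketch: comparing the serialized size of a flat event vs. one carrying a
// nested map. Both classes here are invented examples, not S4 types.
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class EventSizeSketch {

    static class FlatEvent implements Serializable {
        String kpiName = "latency";
        String moName = "cell-1";
        double kpiValue = 0.42;
        long timestamp = 1372172306L;
    }

    static class NestedEvent implements Serializable {
        String kpiName = "latency";
        HashMap<String, String> attributes = new HashMap<>();
        NestedEvent() {
            for (int i = 0; i < 20; i++) {
                attributes.put("attr" + i, "value" + i);
            }
        }
    }

    static int serializedSize(Serializable obj) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("flat:   " + serializedSize(new FlatEvent()) + " bytes");
        System.out.println("nested: " + serializedSize(new NestedEvent()) + " bytes");
    }
}
```

At 200,000 events/s even a modest per-event overhead in bytes and object graph traversal multiplies into a substantial CPU and bandwidth cost.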

>  
>  
> On Jun 25, 2013, at 09:56 , Sky Zhao <sky.zhao@ericsson.com> wrote:
>  
> 
> Also, can you teach me how to find or trace the blocking place? I am still confused about why and where it blocks.
>  
> What are you referring to?
>  
> 
>  
> How many nodes in twitter example for up to 200,000 events/s?
>  
> You'll never get to that number with that application, since the twitter sprinkler feed is ~1% of the total feed, and the reported peak rate of the total feed is a few tens of thousands of tweets/s.
>  
> But if you were to read from a dump, I'd say a few nodes for the adapter and a few nodes for the app.
>  
>  
> Matthieu
>  
>  
> 
>  
> From: Sky Zhao [mailto:sky.zhao@ericsson.com] 
> Sent: Tuesday, June 25, 2013 10:25 AM
> To: 's4-user@incubator.apache.org'
> Subject: RE: About 200,000 events/s
>  
> Hi Matthieu, I tried to test again after modifying some configuration code, see below. There is no PE logic, just sending events (the adapter only spends 38s sending events). I don't know where it is blocked?
>  
>  
>  
> From: Matthieu Morel [mailto:mmorel@apache.org] 
> Sent: Tuesday, June 25, 2013 12:25 AM
> To: s4-user@incubator.apache.org
> Subject: Re: About 200,000 events/s
>  
> Hi,
>  
> I would suggest to:
>  
> 1/ check how much you can generate when creating events read from the file - without sending events to a remote stream. This gives you the upper bound for a single adapter (producer).
>  
> It costs 38s for only the file read from the adapters.
>  
>  
> 2/ check how much you can consume in the app cluster. By default the remote senders are blocking, i.e. the adapter won't inject more than what the app cluster can consume. This gives you an upper bound for the consumer.
> I removed all PE logic, keeping just the emit functions. Very strange: it still costs 600s; it seems something is blocking somewhere.
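The blocking behavior described in 2/ can be sketched with a bounded queue: when the consumer is slower, the producer's put() waits instead of dropping events, so the adapter's observed send time reflects the app cluster's consumption rate, not the adapter's own speed. (This is a generic analogy with java.util.concurrent, not S4's actual sender implementation.)

```java
// Sketch: blocking backpressure with a bounded queue. A slow consumer makes
// the producer's put() wait once the buffer fills, capping the injection rate.
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BackpressureSketch {

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> stream = new ArrayBlockingQueue<>(100);

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 1000; i++) {
                    stream.take();   // drain one event
                    Thread.sleep(1); // simulate slow processing
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        long start = System.nanoTime();
        for (int i = 0; i < 1000; i++) {
            stream.put(i); // blocks whenever the queue is full
        }
        consumer.join();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // The producer finishes in roughly the consumer's total time,
        // even though putting 1000 ints is itself nearly instantaneous.
        System.out.println("producer throttled to ~" + elapsedMs + " ms");
    }
}
```

This is why an adapter that reads its file in 8s can still take 600s to send: the 600s is the consumer side's pace surfacing at the producer.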
>  
>  
> 3/ use more adapter processes. In the benchmarks subprojects, one can configure the number of injection processes, and you might need more than one.
> I tried; it seems to improve a bit, but not obviously: 2,500 events/s maximum.
>  
> 4/ make sure the tuning parameters you are setting are appropriate. For instance, I am not sure using 100 threads for serializing events is a good setting (see my notes about context switching in a previous mail).
> Already changed to 10 threads.
>  
> Also note that 200k msg/s/stream/node corresponds to the average rate in one minute _once the cluster has reached stable mode_. Indeed JVMs typically perform better after a while, due to various kinds of dynamic optimizations. Do make sure your experiments are long enough.
>  
> Here is the metrics report, after already running 620s (NO PE logic):
> List event-emitted@seacluster1@partition-0.csv file
>  
> # time,count,1 min rate,mean rate,5 min rate,15 min rate
> 20,30482,728.4264567247864,1532.1851387286329,577.7027401449616,550.7086872323928
> 40,68256,1003.4693068790475,1710.158498360041,651.1838653093574,576.4097533046603
> 60,126665,1526.423951717675,2114.1420782852447,792.5372602881871,626.2019978354255
> 80,159222,1631.4770294401224,1992.409242530607,865.8704071266087,654.9705320161032
> 100,206876,1821.551542703833,2070.5009200329328,958.5809893144597,691.1999843184435
> 120,245115,1852.9758861695257,2044.0217493195803,1021.5173199955651,718.5310141264627
> 140,286057,1892.8858688327914,2044.444972093294,1084.2498192357743,746.5688788373484
> 160,324662,1952.4301928970413,2030.146074235012,1151.037578221836,776.8090098793148
> 180,371829,1959.4350067303067,2066.6134765239876,1202.907690109737,802.6284106918696
> 200,433557,2283.734039729985,2168.043356631868,1325.4143582100112,853.1681426416412
> 220,464815,2165.047817425384,2113.0019773033414,1362.4434184467839,876.2949286924105
> 240,504620,2065.9291371459913,2102.751240490619,1390.829435765405,896.605433400642
> 260,558158,2192.29985605397,2146.9020962821564,1462.6363950595012,931.9095985316526
> 280,595910,2199.1021924478428,2128.3677435991726,1513.937239488627,961.2098181907241
> 300,627181,1994.0377201896988,2090.694355349902,1511.0411290249551,972.3477079289204
> 320,664342,1959.8933962067783,2076.1406486970463,1534.4815810411815,992.1780521538831
> 340,698717,1912.6664972668054,2055.1073335596802,1551.413320562825,1009.8797759279687
> 360,738468,1940.3861933767728,2051.3430577192726,1579.897070517331,1031.4220907479778
> 380,769673,1808.904535049888,2025.4290025114635,1577.8601886321867,1045.7495364549434
> 400,807406,1834.34177336605,2018.480153382513,1597.9097843354657,1064.241627735365
> 420,851009,1918.7250831019285,2026.1694030316874,1634.835168023956,1088.691609058512
> 440,884427,1854.4469403637247,2010.0049048826818,1637.4121396194184,1101.510206100955
> 460,927835,1951.4786440268247,2016.9731382075943,1672.0801089577474,1125.0229080682857
> 480,975961,2076.6338294064494,2033.1853468961958,1719.2462212252974,1153.158261927912
> 500,1018777,2094.823218886582,2037.4852753404898,1746.4288709310954,1174.8624939692247
> 520,1063733,2137.998709779158,2045.5028969721004,1778.8271760046641,1198.4702839108422
> 540,1105407,2121.91619308999,2046.9043296729049,1798.2960826750962,1217.8576817035882
> 560,1148338,2126.4127152635606,2050.4509561002797,1820.616007231527,1238.2444234074262
> 580,1189999,2113.61602051879,2051.570290715584,1837.4446097375073,1256.7786337375683
> 600,1219584,1937.6445710562814,2031.6753069957329,1814.671731059335,1261.7477397513687
> 620,1237632,1632.4506998235333,1995.2499874637524,1755.4487350018405,1253.8445924435982
>  
>  
>  
> These seem like very strange values for my example, far from 200,000 events/s/node/stream.
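The effective rate can be checked directly from the cumulative counts in the report above: two (time, count) samples give the average throughput over the interval between them.

```java
// Sketch: deriving the effective event rate from two (time, count) samples
// of the event-emitted CSV above.
public class RateFromCounts {

    // Average events/s between two cumulative samples taken at t1 and t2 seconds.
    static double rate(long t1, long count1, long t2, long count2) {
        return (double) (count2 - count1) / (t2 - t1);
    }

    public static void main(String[] args) {
        // First and last rows of the CSV: (20s, 30482) and (620s, 1237632).
        double r = rate(20, 30482, 620, 1237632);
        System.out.printf("average rate: %.0f events/s%n", r); // ~2012 events/s
    }
}
```

So the report corresponds to roughly 2,000 events/s sustained, about two orders of magnitude below the 200,000 events/s target.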
>  
>  
> Regards,
>  
> Matthieu
>  
>  
> On Jun 24, 2013, at 11:19 , Sky Zhao <sky.zhao@ericsson.com> wrote:
>  
> 
> I try to use the Adapter to send S4 events. With the metrics report:
> 20,10462,88.63259092217602,539.6449108859357,18.577650313690874,6.241814566462701
> 40,36006,417.83633322358764,914.1057643161282,97.55624823196746,33.40088245418529
> 60,63859,674.1012974987167,1075.2326549158463,176.33878995148274,61.646803531230724
> 80,97835,953.6282787690939,1232.2934375999696,271.48890371088254,96.56144395108957
> 100,131535,1162.2060916405578,1323.3704459079934,363.98505627735324,131.98430793014757
> 120,165282,1327.52314133145,1384.2675551261093,453.5195236495672,167.61679021575551
> 140,190776,1305.7285112621298,1368.4361242524062,504.7782182758366,191.36049732440895
>  
> 20,000 events per 20s => 1,000 events/s
>  
> Very slow. I modified S4_HOME/subprojects/s4-comm/bin/default.s4.comm.properties:
>  
> s4.comm.emitter.class=org.apache.s4.comm.tcp.TCPEmitter
> s4.comm.emitter.remote.class=org.apache.s4.comm.tcp.TCPRemoteEmitter
> s4.comm.listener.class=org.apache.s4.comm.tcp.TCPListener
>  
> # I/O channel connection timeout, when applicable (e.g. used by netty)
> s4.comm.timeout=1000
>  
> # NOTE: the following numbers should be tuned according to the application, use case, and infrastructure
>  
> # how many threads to use for the sender stage (i.e. serialization)
> #s4.sender.parallelism=1
> s4.sender.parallelism=100
> # maximum number of events in the buffer of the sender stage
> #s4.sender.workQueueSize=10000
> s4.sender.workQueueSize=100000
> # maximum sending rate from a given node, in events / s (used with throttling sender executors)
> s4.sender.maxRate=200000
>  
> # how many threads to use for the *remote* sender stage (i.e. serialization)
> #s4.remoteSender.parallelism=1
> s4.remoteSender.parallelism=100
> # maximum number of events in the buffer of the *remote* sender stage
> #s4.remoteSender.workQueueSize=10000
> s4.remoteSender.workQueueSize=100000
> # maximum *remote* sending rate from a given node, in events / s (used with throttling *remote* sender executors)
> s4.remoteSender.maxRate=200000
>  
> # maximum number of pending writes to a given comm channel
> #s4.emitter.maxPendingWrites=1000
> s4.emitter.maxPendingWrites=10000
>  
> # maximum number of events in the buffer of the processing stage
> #s4.stream.workQueueSize=10000
> s4.stream.workQueueSize=100000
>  
> This only improved from 500 events/s to 1,000 events/s.
>  
> Reading the 88 MB file only needs 8s, but sending events now costs 620s for 1,237,632 events. Why so slow? S4 can reach 200,000 events/s; how can I get up to that value? Please give me detailed instructions.
>  
>  
>  

