-------- Original Message --------
Subject: Re: [Fwd: Re: Parsing Olio runtimeStats]
Date: Wed, 12 May 2010 15:18:36 -0700
From: Akara Sucharitakul <akara.sucharitakul@oracle.com>
Reply-To: akara.sucharitakul@oracle.com
Organization: Oracle
To: Vasileios Kontorinis <bkontorinis@gmail.com>
CC: olio-user@incubator.apache.org, "akara.sucharitakul" <Akara.Sucharitakul@sun.com>, Shanti Subramanyam <shanti.subramanyam@gmail.com>
References: <4BE1E133.2070708@Sun.COM> <AANLkTik2eo15cyLPcvghtKebNHqf6Rh9Xn49wiTgvQ20@mail.gmail.com>


I don't think so. CErr is very straightforward logic. If the requests all wait for the response but have not yet timed out, we cannot conclude there are any errors. So the CErr count is 0 for that period. This is very common behavior when the system runs out of resources.

-Akara

On 05/12/10 02:53 PM, Vasileios Kontorinis wrote:
Akara hi again,
   Something weird is going on. Check the following driver.log part from the steady state...

May 4, 2010 10:22:00 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 115.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=13.997/3.999/17.397/12.398/1.800/0.400/0.800 OThru=12.763/5.141/17.398/12.399/1.388/0.518/0.953 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=0.008/0.008/0.020/0.012/0.075/0.024/0.140 OResp=0.056/0.058/0.076/0.057/0.150/0.035/0.186 CSD=0.000/0.000/0.059/0.000/0.000/0.005/0.000 OSD=0.329/0.302/0.349/0.345/0.307/0.033/0.200 C90%Resp=0.020/0.020/0.040/0.020/0.140/0.020/0.160 O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.160
May 4, 2010 10:22:05 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 120.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=12.803/5.401/14.603/12.202/2.200/0.000/1.400 OThru=12.765/5.155/17.243/12.388/1.433/0.489/0.978 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=0.008/0.008/0.018/0.010/0.070/-/0.149 OResp=0.053/0.055/0.073/0.055/0.143/0.035/0.184 CSD=0.007/0.000/0.000/0.000/0.000/-/0.000 OSD=0.319/0.293/0.341/0.336/0.292/0.033/0.193 C90%Resp=0.020/0.020/0.040/0.020/0.120/-/0.180 O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:10 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 125.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=11.400/5.800/18.400/11.800/0.800/0.600/1.000 OThru=12.694/5.189/17.304/12.357/1.400/0.495/0.979 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=0.008/0.007/0.018/0.011/0.129/0.031/0.138 OResp=0.051/0.052/0.070/0.052/0.142/0.035/0.181 CSD=0.000/0.019/0.017/0.006/0.000/0.000/0.010 OSD=0.312/0.285/0.332/0.328/0.288/0.032/0.188 C90%Resp=0.020/0.020/0.040/0.020/0.180/0.040/0.160 O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:19 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 133.57s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=9.116/4.324/12.973/9.233/0.468/0.117/0.468 OThru=12.398/5.118/16.946/12.099/1.323/0.463/0.937 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=0.021/0.007/0.033/0.075/0.094/0.033/0.177 OResp=0.049/0.049/0.068/0.054/0.141/0.035/0.181 CSD=0.115/2.938/0.108/0.372/0.000/0.000/0.380 OSD=0.304/0.276/0.322/0.331/0.284/0.031/0.199 C90%Resp=0.020/0.020/0.060/0.040/0.120/0.020/0.220 O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:20 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 135.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=3.463/1.385/4.155/4.155/0.000/0.000/0.000 OThru=12.275/5.066/16.770/11.989/1.305/0.457/0.924 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=0.775/0.170/0.131/1.253/-/-/- OResp=0.052/0.050/0.068/0.059/0.141/0.035/0.181 CSD=1.237/0.078/13.954/0.935/-/-/- OSD=0.316/0.276/0.321/0.346/0.284/0.031/0.199 C90%Resp=0.320/0.160/0.180/1.700/-/-/- O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:25 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 140.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=0.000/0.000/0.000/0.000/0.200/0.000/0.000 OThru=11.717/4.836/16.008/11.444/1.254/0.436/0.882 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=-/-/-/-/1.751/-/- OResp=0.052/0.050/0.068/0.059/0.153/0.035/0.181 CSD=-/-/-/-/0.000/-/- OSD=0.316/0.276/0.321/0.346/0.315/0.031/0.199 C90%Resp=-/-/-/-/0.020/-/- O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:30 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 145.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=0.000/0.000/0.000/0.000/0.000/0.000/0.000 OThru=11.208/4.626/15.312/10.947/1.200/0.417/0.843 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=-/-/-/-/-/-/- OResp=0.052/0.050/0.068/0.059/0.153/0.035/0.181 CSD=-/-/-/-/-/-/- OSD=0.316/0.276/0.321/0.346/0.315/0.031/0.199 C90%Resp=-/-/-/-/-/-/- O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:35 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 150.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=0.000/0.000/0.000/0.000/0.000/0.000/0.000 OThru=10.741/4.433/14.674/10.491/1.150/0.400/0.808 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=-/-/-/-/-/-/- OResp=0.052/0.050/0.068/0.059/0.153/0.035/0.181 CSD=-/-/-/-/-/-/- OSD=0.316/0.276/0.321/0.346/0.315/0.031/0.199 C90%Resp=-/-/-/-/-/-/- O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:40 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats
INFO: 155.01s - UIDriver: HomePage/Login/TagSearch/EventDetail/PersonDetail/AddPerson/AddEvent CThru=0.000/0.000/0.000/0.000/0.000/0.000/0.000 OThru=10.311/4.256/14.087/10.071/1.104/0.384/0.776 CErr=0.000/0.000/0.000/0.000/0.000/0.000/0.000 CResp=-/-/-/-/-/-/- OResp=0.052/0.050/0.068/0.059/0.153/0.035/0.181 CSD=-/-/-/-/-/-/- OSD=0.316/0.276/0.321/0.346/0.315/0.031/0.199 C90%Resp=-/-/-/-/-/-/- O90%Resp=0.020/0.020/0.060/0.040/0.180/0.080/0.180
May 4, 2010 10:22:45 PM com.sun.faban.driver.engine.MasterImpl$StatsWriter dumpStats


I have highlighted the interesting parts.
The execution proceeds nornally till a certain point, when (I have verified with vmstat) it starts swapping (there is not enough memory in the virtual machine). At this point cpu goes to 100% and  CThru is stack at 0 after that and no CErr are reported.
Any intuition on CErr is stuck at 0 as well?
Is there any way to detect this situation, besides looking at the throughput numbers? The expected behavior would be to log a huge amount of CErr, all the operations basically during the interval. Right?

Thanks
-------------------------------------------------------------------
Kontorinis Vasileios
Phd student, University of California San Diego
San Diego, CA 92122
Cell. phone: (858) 717 6899
bkontorinis@gmail.com, vkontori@ucsd.edu
-------------------------------------------------------------------


2010/5/5 Akara Sucharitakul <Akara.Sucharitakul@sun.com>
Hmmm, the apache mailing list won't let my current email post a reply. Trying again.

-Akara

-------- Original Message --------
Subject: Re: Parsing Olio runtimeStats
Date: Wed, 05 May 2010 14:17:35 -0700
From: Akara Sucharitakul <akara.sucharitakul@oracle.com>
Reply-To: akara.sucharitakul@oracle.com
Organization: Oracle
To: olio-user@incubator.apache.org
CC: akara.sucharitakul <Akara.Sucharitakul@sun.com>, Shanti Subramanyam <shanti.subramanyam@gmail.com>
References: <s2s89c38a6f1005051212z7cb84f71o589fa41c7ca8704d@mail.gmail.com>


Let me try to address this below:


On 05/05/10 12:12, Vasileios Kontorinis wrote:
Akara and Shanti hi,
    I am parsing the runtimeStats from the driver.log file and I run into this interesting issue.
In the output for steady state I get "-" for the response time when there are no successful operations since the last time runtimeStats were printed.
This can happen for two reasons:
1) The interval for printing the runtimeStats is small  (in my case 5secs) and some operations, especially the ones that take long and have small frequency in matrix (add event, add user) just never happened.
Yes, if the number of successful operations in that period is 0, most calculations will be a divide by 0. That's why it is printing a '-'.

2) There are a bunch of requests going on and none of them is successful. This can happen when there is no sufficient memory in the machine. In that case there is lots of swapping, the cpu goes to 100% utilization and all the operations time-out.

Is there any small change I can do to distinguish between the two cases? Maybe printing a small "t" when there are many requests timing out, so that I can distinguish between the two cases? Any ideas are welcome.
There is the error count in the runtime stats that tell you about error cases. This would also include timeouts. But the problem is not that simple. The real problem is not the operations timing out but rather the operations waiting that would time out. And  we cannot distinguish between these waiting operations and other operations waiting just with slow response time.


One more thing. I use the variableLoad setting and in order  to know how many users are simulated while the benchmark run, I  parse log.xml looking for "Active threads: ". The problem is that log.xml is big especially when many requests timeout. Is it easy to also log the changes of active threads in the drive.log file ??

Please look into log configuration in $FABAN/config/logging.properties. You should be able to make certain loggers log to a particular file. I don't have the detail off  the  top of my head.

-Akara