From olio-user-return-321-apmail-incubator-olio-user-archive=incubator.apache.org@incubator.apache.org Fri Feb 12 23:31:05 2010
Return-Path: <olio-user-return-321-apmail-incubator-olio-user-archive=incubator.apache.org@incubator.apache.org>
Delivered-To: apmail-incubator-olio-user-archive@minotaur.apache.org
Received: (qmail 88677 invoked from network); 12 Feb 2010 23:31:05 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3)
  by minotaur.apache.org with SMTP; 12 Feb 2010 23:31:05 -0000
Received: (qmail 34367 invoked by uid 500); 12 Feb 2010 23:31:05 -0000
Delivered-To: apmail-incubator-olio-user-archive@incubator.apache.org
Received: (qmail 34328 invoked by uid 500); 12 Feb 2010 23:31:04 -0000
Mailing-List: contact olio-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:olio-user-help@incubator.apache.org>
List-Unsubscribe: <mailto:olio-user-unsubscribe@incubator.apache.org>
List-Post: <mailto:olio-user@incubator.apache.org>
List-Id: <olio-user.incubator.apache.org>
Reply-To: olio-user@incubator.apache.org
Delivered-To: mailing list olio-user@incubator.apache.org
Received: (qmail 34319 invoked by uid 99); 12 Feb 2010 23:31:04 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Feb 2010 23:31:04 +0000
X-ASF-Spam-Status: No, hits=3.7 required=10.0
	tests=HTML_MESSAGE,SPF_PASS,WEIRD_PORT
X-Spam-Check-By: apache.org
Received-SPF: pass (athena.apache.org: domain of shanti.subramanyam@gmail.com designates 209.85.160.47 as permitted sender)
Received: from [209.85.160.47] (HELO mail-pw0-f47.google.com) (209.85.160.47)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Feb 2010 23:30:53 +0000
Received: by pwi5 with SMTP id 5so219526pwi.6
        for <olio-user@incubator.apache.org>; Fri, 12 Feb 2010 15:30:33 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=gamma;
        h=domainkey-signature:mime-version:received:in-reply-to:references
         :date:message-id:subject:from:to:cc:content-type;
        bh=UtVD1FuiCZLLveJVGo5NzABie04nDjyrDJJI5283Fbk=;
        b=gxqNFefpR9Sh6R67WqI5pQZFvmlx6sbT5u5Of4jAY6cTQqs+3h/ySsNXhe7640JIaH
         briHhpqBurtnZSBaBfPHFKG1J7ow4yOR6uM7Q2GwWOkTRerkL4Y+n8c6iAQBehExkB5/
         hCV6jK3YNsMo0XKVnNj58qE8barTwLV4LvL8Q=
DomainKey-Signature: a=rsa-sha1; c=nofws;
        d=gmail.com; s=gamma;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :cc:content-type;
        b=ddtLrZkIgNy8p0UVndJHXqkD7neY8lUVUO+sMTpbI8AD1ZIeBOCXrzFGxoy1/w9iCl
         d9GBKTcDLk96jbh+FwMfBm32qatxHdrdBo4fKNM30XsYhfGsu9QuR+O8PWTX98JGuOcG
         ZEe7g17MYByNgIoXFzNJvhx9W6ps4mwF7H/VM=
MIME-Version: 1.0
Received: by 10.114.11.2 with SMTP id 2mr1407829wak.73.1266017432186; Fri, 12 
	Feb 2010 15:30:32 -0800 (PST)
In-Reply-To: <89c38a6f1002112142l5878f9bek606c6ee2d0cbbfa1@mail.gmail.com>
References: <89c38a6f1002080118x6c8bb8a9k9f3b6e8934734915@mail.gmail.com>
	 <59d35f41002081003j74691013ice25fc5320e3d13c@mail.gmail.com>
	 <89c38a6f1002081553r853607tfa5b960556cb43b9@mail.gmail.com>
	 <59d35f41002081828l7aab11d4md323f15e60a7aea@mail.gmail.com>
	 <89c38a6f1002112142l5878f9bek606c6ee2d0cbbfa1@mail.gmail.com>
Date: Fri, 12 Feb 2010 15:30:32 -0800
Message-ID: <59d35f41002121530ha145205ka3fbf856b0195956@mail.gmail.com>
Subject: Re: Olio Scaling
From: Shanti Subramanyam <shanti.subramanyam@gmail.com>
To: Vasileios Kontorinis <bkontorinis@gmail.com>
Cc: olio-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=00504502e31a533104047f6fa728

--00504502e31a533104047f6fa728
Content-Type: text/plain; charset=ISO-8859-1

If you want to run multiple webservers on different systems, you must have
access to the filestore from all of them. The easiest way to do this is to
nfs-mount the filestore from the server it resides on so it is accessible to
the other machines as well.
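
If the filestore is mounted at the same path on every web server, the existing
LocalFS / localfsRoot settings in config.php should be able to stay as they are.
A quick way to confirm the mount is usable on each machine before a run is a
check along these lines. It is only a sketch: the helper name check_filestore
is made up for illustration, the default path is the localfsRoot quoted below,
and it should be run as the user the apache process runs as.

```python
import os
import sys
import tempfile


def check_filestore(root):
    """Return a list of problems found with the filestore directory `root`."""
    problems = []
    if not os.path.isdir(root):
        problems.append(f"{root} does not exist or is not a directory")
        return problems
    if not os.access(root, os.R_OK | os.X_OK):
        problems.append(f"{root} is not readable by this user")
    try:
        # Creating and deleting a scratch file exercises write access for real.
        with tempfile.NamedTemporaryFile(dir=root, prefix=".olio-check-"):
            pass
    except OSError as exc:
        problems.append(f"{root} is not writable: {exc}")
    return problems


if __name__ == "__main__":
    # Default path taken from the config.php lines quoted in this thread.
    root = sys.argv[1] if len(sys.argv) > 1 else "/home/gdhiman/filestore"
    issues = check_filestore(root)
    for issue in issues:
        print("PROBLEM:", issue)
    print("filestore OK" if not issues else "filestore NOT usable")
```

An empty result on every web server suggests the missing-file warnings are not
caused by the mount or by permissions.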

Shanti

On Thu, Feb 11, 2010 at 9:42 PM, Vasileios Kontorinis <bkontorinis@gmail.com
> wrote:

> Shanti hi again,
>    Sorry for not submitting the JIRA on time, I am extremely busy lately.
>
> I have a quick question regarding the way the webserver interacts with the
> filestore. I ran some scaling studies with one, two and three different
> servers while having only one filestore (I specify that in the run.xml
> configuration file, under webServer and dataStorage).
> The filestore is a local folder on one of the server machines. However, in
> the oliophp/etc/config.php I also specify on each server
>
> $olioconfig['fileSystem'] = 'LocalFS';
> $olioconfig['localfsRoot'] = '/home/gdhiman/filestore';
>
> As a result, I get WARNINGS for missing files on the webservers that do
> not host the filestore. What is the right configuration for
> oliophp/etc/config.php? Can I somehow detach the filestore from the
> webserver so that it requests files remotely?
>
>
> Thanks again.
> -------------------------------------------------------------------
> Kontorinis Vasileios
> Phd student, University of California San Diego
> San Diego, CA 92122
> Cell. phone: (858) 717 6899
> bkontorinis@gmail.com, vkontori@ucsd.edu
> -------------------------------------------------------------------
>
>
> 2010/2/8 Shanti Subramanyam <shanti.subramanyam@gmail.com>
>
>
>>
>> On Mon, Feb 8, 2010 at 3:53 PM, Vasileios Kontorinis <
>> bkontorinis@gmail.com> wrote:
>>
>>>
>>>
>>>> We need to look into this issue - I suspect that something subtle has
>>>> changed in 0.2 which hasn't been accounted for in the expected #images
>>>> loaded. Can I please request that you file a JIRA on this?
>>>>
>>>
>>> How do I do this? Pointers?
>>>
>>
>> http://issues.apache.org
>>
>>
>>> I tried runs of 20mins to verify that longer runs will not make it better
>>> and it's still failing for just 50 users.
>>>
>>
>> What worries me is that you're saying it  fails for 1800 users too - I can
>> understand it may fail for 50 users, but if it fails for larger #users, then
>> it is a bug.
>>
>>>
>>>
>>
>>> and I do get the repetitive patterns you mentioned. However, the cache_MB
>>> never exceeds 0.05...
>>> I would expect that memcache size is really important for the application
>>> scaling. What is the point of having a separate memcache server if we are
>>> only using less than 50KB(?) of memory for caching?
>>>
>>>
>> Try running without memcached - it can be easily configured in the app's
>> etc/config.php. Then you will see what difference the cache makes. The
>> reduction in db traffic is dramatic, resulting in the response times you see.
>> The reason the size is small is because we are currently only caching the
>> home page which is shared. We have not bothered to implement any additional
>> caching as this level of caching is sufficient to reduce the db load.
>>
>> Regards
>>> -VK
>>>
>>>  Shanti
>>
>>>
>>>
>>>> Shanti
>>>>
>>>>
>>>>> Thanks again
>>>>>
>>>>>
>>>>> 2010/1/27 Shanti Subramanyam <shanti.subramanyam@gmail.com>
>>>>>
>>>>>> Yes - these are problems that I'm already aware of.
>>>>>> The best solution to the filestore issue is to change ownership of the
>>>>>> directory to the same user/group as the apache process. We could have the
>>>>>> fileloader.sh change write access I guess, but since that's a big security
>>>>>> hole, we may not want to do that automatically without letting the user know
>>>>>> about it.
>>>>>>
>>>>>> The fact that your response times are so high indicates that you're
>>>>>> running a far larger load than the system can handle and/or you still need
>>>>>> some tuning.
>>>>>> I suggest you start over from say 100 users and see at what point your
>>>>>> response times start getting really large. The apache error log should be
>>>>>> pulled in as part of the 'Statistics' tab, so do keep monitoring that.
>>>>>>
>>>>>> Shanti
>>>>>>
>>>>>>
>>>>>> On Wed, Jan 27, 2010 at 1:34 AM, Vasileios Kontorinis <
>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>
>>>>>>> Shanti hi again,
>>>>>>>    I checked my apache logs and there were a bunch of errors.
>>>>>>> It looks like there are some issues with
>>>>>>> webapp/php/trunk/classes/ImageUtil.php in the latest release of Olio. (I
>>>>>>> downloaded
>>>>>>> http://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz)
>>>>>>> 1) There is a line that needs to be commented out. PHP complains ("1.5.
>>>>>>> Must be greater than zero.").
>>>>>>> 2) Then it complained that it cannot find the function
>>>>>>> fastimagecopyresampled. To work around that, I moved the function
>>>>>>> fastimagecopyresampled above createThumb (this might not be required) and
>>>>>>> declared it static.
>>>>>>>     Finally, I call the function from createThumb with
>>>>>>> self::fastimagecopyresampled.
>>>>>>> 3) Then it started complaining because it could not write to the
>>>>>>> filestore. The problem is that it wants to write the new images as www-data
>>>>>>> from apache, while the filestore does not have write permission for
>>>>>>> others. Manually
>>>>>>>     giving access solves the problem (chmod -R o+w <path>/filestore),
>>>>>>> but since the directories in filestore are generated automatically, maybe
>>>>>>> the chmod command should be added to fileloader.sh.
>>>>>>>
>>>>>>> Funnily enough, after fixing those issues, I still cannot pass the:
>>>>>>> Average images loaded per Home Page 2.65   >=3       FAILED
>>>>>>>
>>>>>>> and on top of that I also have:
>>>>>>> Response Times (secs)
>>>>>>> AddPerson     5.190  13.194  3.387 8.800     3.000 FAILED
>>>>>>> AddEvent       5.904  16.784  3.159 10.400   4.000 FAILED
>>>>>>>
>>>>>>> Think times for AddPerson and AddEvent fail as well.
>>>>>>>
>>>>>>> Any insights are welcome .... :-(
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2010/1/26 Shanti Subramanyam <shanti.subramanyam@gmail.com>
>>>>>>>
>>>>>>>> Yes - 0.2 requires a lot more disk space as we changed the ratio of
>>>>>>>> concurrent users to registered users to 1:100. If you haven't already,
>>>>>>>> please check out our published Blueprints for detailed performance
>>>>>>>> characteristics of the workload:
>>>>>>>> Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris
>>>>>>>> Operating System<http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applications+on+Sun+Servers+and+the+OpenSolaris+Operating+System>
>>>>>>>> If you run for long enough, you should get passing runs. Have you
>>>>>>>> verified that there are no errors in the run logs when you see the 'Avg.
>>>>>>>> images loaded per home page' fail ?
>>>>>>>>
>>>>>>>> On to your open files error  - you may have to tune your networking
>>>>>>>> tier and/or #open file descriptors. I don't believe we have ever seen as
>>>>>>>> many files open as you are seeing. Can you determine whether these are from
>>>>>>>> the file store or network ? We also typically run the filestore on a
>>>>>>>> different system and nfs-mount it on the webserver box.
>>>>>>>> You will have to tune your system to ensure good performance since
>>>>>>>> you will need memory for both apache and files.
>>>>>>>>
>>>>>>>> Shanti
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Kontorinis <
>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Akara and Shanti hi,
>>>>>>>>>    I did migrate to Olio 0.2. With the last version of Olio I came
>>>>>>>>> across some new interesting things.
>>>>>>>>>
>>>>>>>>> Scaling issues:
>>>>>>>>>   - I am still getting the:
>>>>>>>>> Average images loaded per Home Page   2.55   >= 3   FAILED
>>>>>>>>>  - additionally, when I scale the concurrent users to 800 I run out
>>>>>>>>> of diskspace since my filestore occupies more than 62GB.
>>>>>>>>> Actually for 600 users it occupies 50GB. I was curious if that
>>>>>>>>> makes sense. How much space will I need to reach 1000 users?
>>>>>>>>> In the php_setup.html it suggests that we will need 50GB but
>>>>>>>>> apparently we need way more for large number of users.
>>>>>>>>>
>>>>>>>>>  - Finally and most importantly, for 600 users many of the
>>>>>>>>> operations fail with the exception:
>>>>>>>>> Message: java.net.SocketException: Too many open files
>>>>>>>>> Stack Trace:
>>>>>>>>>  Class  Method  Line
>>>>>>>>>  java.net.PlainSocketImpl  socketAccept
>>>>>>>>>  java.net.PlainSocketImpl  accept  390
>>>>>>>>>  java.net.ServerSocket  implAccept  453
>>>>>>>>>  java.net.ServerSocket  accept  421
>>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  executeAcceptLoop  369
>>>>>>>>>  sun.rmi.transport.tcp.TCPTransport$AcceptLoop  run  341
>>>>>>>>>  java.lang.Thread  run  619
>>>>>>>>> or
>>>>>>>>>
>>>>>>>>> java.net.SocketException: Too many open files
>>>>>>>>> Stack Trace:
>>>>>>>>>  Class  Method  Line
>>>>>>>>>  java.net.Socket  createImpl  394
>>>>>>>>>  java.net.Socket  getImpl  457
>>>>>>>>>  java.net.Socket  bind  571
>>>>>>>>>  com.sun.faban.driver.transport.hc3.ProtocolTimedSocketFactory  createSocket  60
>>>>>>>>>  org.apache.commons.httpclient.HttpConnection  open  707
>>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector  executeWithRetry  387
>>>>>>>>>  org.apache.commons.httpclient.HttpMethodDirector  executeMethod  171
>>>>>>>>>  org.apache.commons.httpclient.HttpClient  executeMethod  397
>>>>>>>>>  org.apache.commons.httpclient.HttpClient  executeMethod  323
>>>>>>>>>  com.sun.faban.driver.transport.hc3.ApacheHC3Transport  readURL  274
>>>>>>>>>  org.apache.olio.workload.driver.UIDriver  doLogin  398
>>>>>>>>>  org.apache.olio.workload.driver.UIDriver  doLogin  424
>>>>>>>>>  sun.reflect.GeneratedMethodAccessor8  invoke
>>>>>>>>>  sun.reflect.DelegatingMethodAccessorImpl  invoke  25
>>>>>>>>>  java.lang.reflect.Method  invoke  597
>>>>>>>>>  com.sun.faban.driver.engine.TimeThread  doRun  169
>>>>>>>>>  com.sun.faban.driver.engine.AgentThrea
>>>>>>>>>
>>>>>>>>> I am monitoring the number of open files on the web server with
>>>>>>>>> `watch "lsof | wc"` and Olio starts failing when around 65,000-70,000
>>>>>>>>> files are open. lsof shows that each apache2 thread has around 100
>>>>>>>>> files open. Therefore there are around 650-700 different apache2 threads
>>>>>>>>> that create the bulk of those open file descriptors.
>>>>>>>>> The soft and hard limits are set to 403238, which means that there
>>>>>>>>> should be many more open files before it starts failing.
>>>>>>>>> (Actually, I verified the limit by opening a bunch of files with a
>>>>>>>>> python script and it does reach the limit of 403238.)
>>>>>>>>> Any insights? Is there any chance that the file descriptors take
>>>>>>>>> more time than usual to be reclaimed after being closed in the xen vm I use
>>>>>>>>> for my web-server? Does it make sense for Olio in the first place to have so
>>>>>>>>> many files open at the same time?
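
The open-files question above can be narrowed down with a small check. This is
only a sketch under a few assumptions: it is Linux-only (it reads /proc), the
helper names nofile_limits and count_open_fds are made up for illustration, and
inspecting another process requires running as that process's user or as root.
The limit behind "Too many open files" is enforced per process, while
`lsof | wc` counts every lsof row on the system, including memory-mapped
libraries and working directories that do not consume descriptors. Both stack
traces also come from Java classes (sun.rmi.transport.tcp and
com.sun.faban.driver), so it may be the Faban driver JVM rather than apache2
that is hitting its limit.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid):
    """Count descriptors actually held by `pid`, or None if /proc has no entry."""
    fd_dir = f"/proc/{pid}/fd"
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"this process: soft limit {soft}, hard limit {hard}")
    print(f"this process: {count_open_fds(os.getpid())} descriptors open")
```

Calling count_open_fds with the pid of the driver JVM or of one apache2 worker
gives a per-process number to compare against the limit that process actually
inherited, which on Linux is listed in /proc/<pid>/limits.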
>>>>>>>>>
>>>>>>>>> Thanks again.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2010/1/16 Shanti Subramanyam <shanti.subramanyam@gmail.com>
>>>>>>>>>
>>>>>>>>>  I would really recommend that you migrate to Olio 0.2. In addition
>>>>>>>>>> to bug fixes, there are some major feature changes in it. See Olio
>>>>>>>>>> 0.2 released<http://perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Shanti
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sat, Jan 16, 2010 at 4:49 PM, Vasileios Kontorinis <
>>>>>>>>>> bkontorinis@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Akara hi again,
>>>>>>>>>>>    Below I have comments on your suggestions and at the end some
>>>>>>>>>>> bonus questions... Thanks again.
>>>>>>>>>>>
>>>>>>>>>>> 2010/1/13 Akara Sucharitakul <Akara.Sucharitakul@sun.com>
>>>>>>>>>>>
>>>>>>>>>>>> With your permission, I'd like to copy the Olio and Faban user
>>>>>>>>>>>> aliases going forward. I feel it will help a much wider audience. Please see
>>>>>>>>>>>> below for answers/comments:
>>>>>>>>>>>>
>>>>>>>>>>>> Sure. I cced olio user alias. I am not sure which is the right
>>>>>>>>>>> faban list.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Vasileios Kontorinis wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Akara hi,
>>>>>>>>>>>>>   I am a grad student at UCSD and I use Olio for a research
>>>>>>>>>>>>> project where we want to measure olio performance under live virtual machine
>>>>>>>>>>>>> migration. We use ubuntu 8.04 on nehalem servers.
>>>>>>>>>>>>> I have checked out the latest version of Olio from the online svn
>>>>>>>>>>>>> repository and downloaded the latest version of Faban (faban-kit-101509.tar.gz
>>>>>>>>>>>>> <http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz>)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 101509 is fairly recent. But the latest on the web site is
>>>>>>>>>>>> 111109 (Faban 1.0). There were just bug fixes between those releases.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I have upgraded to Faban 1.0, still using olio1.0 though ( the
>>>>>>>>>>> release of 2.0 was announced, will switch to it if I run into bugs that have
>>>>>>>>>>> been fixed)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> So far, I employed a bunch of hacks to get most of it to work
>>>>>>>>>>>>> and I am almost there. In the process I got a bunch of questions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Questions (some of them might be just faban related, not olio
>>>>>>>>>>>>> so bear with me):
>>>>>>>>>>>>> 1) Is there any way to deploy OlioDriver.jar through the
>>>>>>>>>>>>> command line? Firefox through ssh forwarding is dead slow and I'd rather
>>>>>>>>>>>>> avoid it if I can.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Just drop the jar into faban/benchmarks/ and it will deploy
>>>>>>>>>>>> itself. This is documented at
>>>>>>>>>>>> http://faban.sunsource.net/1.0/docs/guide/harnessdev/deploybenchmark.html under "Alternate Deployment Methods."
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  2) The services ApacheHttpdService, MemcachedService,
>>>>>>>>>>>>> MySQLService that come with Faban should be deployed before running Olio?
>>>>>>>>>>>>>    I was getting some very weird errors. e.g.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, you should. Olio will search for those.
>>>>>>>>>>>>
>>>>>>>>>>>> Done
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating
>>>>>>>>>>>>> benchmark run
>>>>>>>>>>>>> 03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully
>>>>>>>>>>>>> terminated.
>>>>>>>>>>>>> java.lang.Throwable: Stack of non-terminating thread.
>>>>>>>>>>>>>    at java.net.SocketInputStream.socketRead0 (null)
>>>>>>>>>>>>>    at java.net.SocketInputStream.read (129)
>>>>>>>>>>>>>    at java.io.FilterInputStream.read (116)
>>>>>>>>>>>>>    at com.sun.faban.driver.transport.util.TimedInputStream.read
>>>>>>>>>>>>> (139)
>>>>>>>>>>>>>    at java.io.BufferedInputStream.fill (218)
>>>>>>>>>>>>>    at java.io.BufferedInputStream.read (237)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readRawLine (78)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpParser.readLine (106)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpConnection.readLine
>>>>>>>>>>>>> (1116)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodBase.readStatusLine (1973)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.readResponse
>>>>>>>>>>>>> (1735)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpMethodBase.execute
>>>>>>>>>>>>> (1098)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry (398)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> org.apache.commons.httpclient.HttpMethodDirector.executeMethod (171)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>>> (397)
>>>>>>>>>>>>>    at org.apache.commons.httpclient.HttpClient.executeMethod
>>>>>>>>>>>>> (323)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (529)
>>>>>>>>>>>>>    at
>>>>>>>>>>>>> com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (552)
>>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doHomePage (355)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>>
>>>>>>>>>>>>> and afterwards the master was waiting for threads to join for
>>>>>>>>>>>>> ever... (I attached gdb to verify that something was wrong) and hence I had
>>>>>>>>>>>>> to kill the benchmark.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These threads are hanging reading the server responses, that
>>>>>>>>>>>> never came.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Building the services from Faban probably fixes it.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> In the Olio log there are WARNINGS  complaining about not
>>>>>>>>>>>>> deploying those. After building those and manually copying them to
>>>>>>>>>>>>> /faban/services (ant deploy did not place them there... :-(  )
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes. But ant deploy should get them there. If not, can you
>>>>>>>>>>>> please let me know the ant messages?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Ant was deploying them indeed. I had a mistake in
>>>>>>>>>>> building.properties.
>>>>>>>>>>> I had:  faban.url=http://<hostname>:9980/   instead of
>>>>>>>>>>> faban.url=http://localhost:9980/
>>>>>>>>>>> After I changed that it started working...
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  it worked. (mostly worked)
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) I still have warnings like:
>>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-web is 269 ms.
>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>> 01:38:08:INFO:Time difference to host olio-db is 263 ms.
>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These two are OK. Just trying to do a clock sync between the
>>>>>>>>>>>> systems.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  01:38:08:WARNING:olio-web wakeup-before time reached 700ms
>>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> This is one of Faban's clock-setting calibrations. If the system
>>>>>>>>>>>> is too busy or you run on some virtualization architectures, the lag time
>>>>>>>>>>>> between an intended end of sleep and the actual time when the thread really
>>>>>>>>>>>> wakes up (gets scheduled/executed) is too high, calibrations will fail.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  01:38:08:INFO:Time difference to host olio-mem is 262 ms.
>>>>>>>>>>>>> Attempting to set clock.
>>>>>>>>>>>>> 01:38:10:WARNING:olio-db wakeup-before time reached 700ms
>>>>>>>>>>>>> limit. System is too busy. Giving up.
>>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>> 09:38:10:WARNING:Error on "[date, -u, 011309382010.11]" command
>>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>>> 09:38:09:WARNING:[date, -u, 011309382010.10]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>> 09:38:09:WARNING:Error on "[date, -u, 011309382010.10]" command
>>>>>>>>>>>>> trying to set the date. Exit value: 1
>>>>>>>>>>>>> 09:38:10:WARNING:[date, -u, 011309382010.11]
>>>>>>>>>>>>> stderr:
>>>>>>>>>>>>> date: cannot set date: Operation not permitted
>>>>>>>>>>>>>
>>>>>>>>>>>>> Letting Faban change the VM clock sounds like a bad idea from the
>>>>>>>>>>>>> beginning.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> OK. So it is xen. Yes, this is what Faban is trying to solve.
>>>>>>>>>>>> You can certainly turn it off. Please see:
>>>>>>>>>>>>   http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.html
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> I added <fh:timeSync>false</fh:timeSync> to my run.xml file
>>>>>>>>>>> (btw, there is a mistake on the page linked above: <fh:timeSync>false
>>>>>>>>>>> </fh:timeSync> is correct; on that page the second <fh:timeSync> is
>>>>>>>>>>> missing the "/" that makes it a closing tag),
>>>>>>>>>>> and that made the warnings go away.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Unfortunately, xen is really bad at maintaining an accurate
>>>>>>>>>>>>> clock. As a result there is usually time difference between the different
>>>>>>>>>>>>> virtual machines
>>>>>>>>>>>>> of more than 10ms. I went over the setTime function in Faban
>>>>>>>>>>>>> source (/faban/com/sun/faban/harness/agent/CmdAgentImpl.java), it's big and
>>>>>>>>>>>>> ugly (very ugly)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the compliments! I think you mean
>>>>>>>>>>>> CmdService.setClockTask. Time-sensitive code ain't pretty. The
>>>>>>>>>>>> complexity comes from dealing with the clock and trying to achieve good
>>>>>>>>>>>> accuracy. If you think you can simplify this, I'm listening (without losing
>>>>>>>>>>>> the accuracy, of course). In comparison, CmdAgentImpl has nothing.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Yes, you're right, it is CmdService.setClockTask. The previous
>>>>>>>>>>> email was composed at 3am ... :-)
>>>>>>>>>>> I am still a little confused.  the setClockTask is used to set
>>>>>>>>>>> the clock so that all the machines are synchronized with master. From what
>>>>>>>>>>> you mentioned the physical clock sync is only used for the logs.
>>>>>>>>>>> Why do we need to do that since 1) it requires root privileges
>>>>>>>>>>> (which might not be always available) 2) I could imagine an alternative that
>>>>>>>>>>> uses deltas from the actual physical clock without having to set it.
>>>>>>>>>>> ( I am probably missing something... :-)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>  Why is there this strict requirement for a 10ms difference? Any
>>>>>>>>>>>>> ideas?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> It is easily achievable in most cases. May not be true for VMs.
>>>>>>>>>>>>
>>>>>>>>>>>> On some VM architectures, the OS however does not get scheduled
>>>>>>>>>>>> till way after that, thus causing problems. You may be able to measure
>>>>>>>>>>>> performance on those VMs. But you don't want to use such VMs to be a driver.
>>>>>>>>>>>> Your response time measurements will be way off.
>>>>>>>>>>>>
>>>>>>>>>>>> The physical clock sync is not really rigorous. And you can turn
>>>>>>>>>>>> it off. It is more to keep the systems in good time sync. If your VM stands
>>>>>>>>>>>> in the way, just turn it off. The driver's virtual clock sync is much more
>>>>>>>>>>>> picky in comparison. This is because the start time for the steady state
>>>>>>>>>>>> should be the same (with a very small tolerance) no matter how many drivers
>>>>>>>>>>>> are driving. Otherwise the measurement period won't be the same when viewed
>>>>>>>>>>>> from different drivers and the results won't be reliable.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Even with ntp it's hard to provide the 10ms guarantee.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> That's why we don't use ntp ;-)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Just out of curiosity, the physical clocks are set only once at
>>>>>>>>>>> the beginning (right?), therefore for long runs the 10ms difference will not
>>>>>>>>>>> be guaranteed. Nope? Especially under VMs I've seen significant clock
>>>>>>>>>>> differences within a few minutes.
>>>>>>>>>>> At least ntp can periodically resync (of course doing so, might
>>>>>>>>>>> screw up the logs with time going backwards etc)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  I am thinking of modifying this function to always return that
>>>>>>>>>>>>> the time difference is less than 10ms (so that I do not have to wait all the
>>>>>>>>>>>>> time for the timeouts.)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Why bother. Don't like it, just turn it off. It has good use in
>>>>>>>>>>>> most configurations we're dealing with. And, it avoids ntp inaccuracies.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>  Will this break anything in Olio?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Nope. Except the times in your logs will appear out of sequence.
>>>>>>>>>>>> They rely on the local time on the originating systems.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 4) Warnings like:
>>>>>>>>>>>>> 09:39:48:WARNING:Image at
>>>>>>>>>>>>> http://olio-web:80/fileService.php?cache=false&file=e168t.jpg
>>>>>>>>>>>>> size of 249 bytes is too small. Image may not exist
>>>>>>>>>>>>> can be ignored, right?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Well, something is wrong. We don't have images that small. Check
>>>>>>>>>>>> whether e168t.jpg is really that small. That's why we have that warning.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> It's kinda funny: my problem was that I had the olio webkit version
>>>>>>>>>>> installed and then I downloaded the version from the online svn repository.
>>>>>>>>>>> I built the driver but forgot to update the web pages on my apache
>>>>>>>>>>> server, which, as expected, was the source of many of my issues.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> 5) Last and most important.
>>>>>>>>>>>>> I can run the benchmark and all the operations succeed except for
>>>>>>>>>>>>> login.
>>>>>>>>>>>>> I get a bunch of:
>>>>>>>>>>>>>
>>>>>>>>>>>>> 09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login prompt
>>>>>>>>>>>>> at index 2926, Login as at786o08x, 2178 failed.
>>>>>>>>>>>>> Note: Error not counted in result.
>>>>>>>>>>>>> Either transaction start or end time is not within steady
>>>>>>>>>>>>> state.
>>>>>>>>>>>>> java.lang.RuntimeException: Found login prompt at index 2926,
>>>>>>>>>>>>> Login as at786o08x, 2178 failed.
>>>>>>>>>>>>>    at org.apache.olio.workload.driver.UIDriver.doLogin (404)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>>>    at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>>>    at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>>>    at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.TimeThread.doRun (169)
>>>>>>>>>>>>>    at com.sun.faban.driver.engine.AgentThread.run (202)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any ideas? I do get
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> You likely have cookie issues. It can't seem to hold on to a
>>>>>>>>>>>> session.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> Well, there was a permission issue with the http_session dir: I
>>>>>>>>>>> could not write to it. Running chmod 777 on it fixed this.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> (I've found online:
>>>>>>>>>>>>> http://www.mail-archive.com/olio-dev@incubator.apache.org/msg00647.html
>>>>>>>>>>>>> which is similar, but when I added
>>>>>>>>>>>>>
>>>>>>>>>>>>> com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=FINER
>>>>>>>>>>>>>  in build.properties
>>>>>>>>>>>>> I did not see any cookie-related warnings. Those should appear
>>>>>>>>>>>>> in the Olio run log or the Apache log, right? Am I just looking in the
>>>>>>>>>>>>> wrong place?)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, that's applicable only to the Sun Http Transport. The
>>>>>>>>>>>> version of Olio you're using is based on the Apache Http Transport (Apache
>>>>>>>>>>>> HttpClient 3.1). The ThreadCookieHandler is not used for the Apache
>>>>>>>>>>>> transport and that's why you don't see any logs. Try upgrading to Faban 1.0
>>>>>>>>>>>> before looking at other things.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's a long email I know. Your feedback would be most
>>>>>>>>>>>>> appreciated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -Regards
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>>>>>> Kontorinis Vasileios
>>>>>>>>>>>>> Phd student, University of California San Diego
>>>>>>>>>>>>> San Diego, CA 92122
>>>>>>>>>>>>> Cell. phone: (858) 717 6899
>>>>>>>>>>>>> bkontorinis@gmail.com <mailto:bkontorinis@gmail.com>,
>>>>>>>>>>>>> vkontori@ucsd.edu <mailto:vkontori@ucsd.edu>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------\
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for all the questions/comments.
>>>>>>>>>>>>
>>>>>>>>>>>> -Akara
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> And now some more questions/ comments:
>>>>>>>>>>> 1) I get the following error:
>>>>>>>>>>>
>>>>>>>>>>> 15:13:05:SEVERE:CmdService: Getting - exception reading
>>>>>>>>>>> /usr/data/olio-db.err
>>>>>>>>>>> java.io.FileNotFoundException: File /usr/data/olio-db.err does
>>>>>>>>>>> not exist.
>>>>>>>>>>>     at com.sun.faban.common.FileTransfer.<init> (70)
>>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl.get (315)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch (305)
>>>>>>>>>>>     at sun.rmi.transport.Transport$1.run (159)
>>>>>>>>>>>     at java.security.AccessController.doPrivileged (null)
>>>>>>>>>>>     at sun.rmi.transport.Transport.serviceCall (155)
>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages (535)
>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0 (790)
>>>>>>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run (649)
>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask (885)
>>>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)
>>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer (255)
>>>>>>>>>>>     at sun.rmi.transport.StreamRemoteCall.executeCall (233)
>>>>>>>>>>>     at sun.rmi.server.UnicastRef.invoke (142)
>>>>>>>>>>>     at com.sun.faban.harness.agent.FileAgentImpl_Stub.get (null)
>>>>>>>>>>>     at com.sun.faban.harness.engine.CmdService.get (1334)
>>>>>>>>>>>     at com.sun.faban.harness.RunContext.getFile (346)
>>>>>>>>>>>     at com.sun.services.MySQLService.getLogs (197)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)
>>>>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke (39)
>>>>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)
>>>>>>>>>>>     at java.lang.reflect.Method.invoke (597)
>>>>>>>>>>>     at com.sun.faban.harness.util.Invoker.invoke (98)
>>>>>>>>>>>     at com.sun.faban.harness.services.ServiceWrapper.getLogs (200)
>>>>>>>>>>>     at com.sun.faban.harness.services.ServiceManager.getLogs (642)
>>>>>>>>>>>     at com.sun.faban.harness.engine.GenericBenchmark.start (323)
>>>>>>>>>>>     at com.sun.faban.harness.engine.RunDaemon.run (338)
>>>>>>>>>>>     at java.lang.Thread.run (619)
>>>>>>>>>>> 15:13:05:WARNING:Could not copy /usr/data/olio-db.err to
>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db
>>>>>>>>>>>
>>>>>>>>>>> Apparently something is misconfigured in my db-server. Any ideas?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2) I get the following error:
>>>>>>>>>>> 15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi,
>>>>>>>>>>> process, /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D/,
>>>>>>>>>>> /home/gdhiman/faban.1.0/faban/output/OlioDriver.2D//post/, OlioDriver.2D]
>>>>>>>>>>> stderr:
>>>>>>>>>>> Error in executing perl /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>> Error in executing perl /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/mpstat.pl
>>>>>>>>>>>
>>>>>>>>>>> Actually, I traced this one back. The problem is the difference in
>>>>>>>>>>> output format between Sun's mpstat and the default Linux mpstat.
>>>>>>>>>>> This is the output of my mpstat:
>>>>>>>>>>>
>>>>>>>>>>> gdhiman@olio-client00:~/faban.1.0/faban/output/OlioDriver.2D$ mpstat 1
>>>>>>>>>>> Linux 2.6.18.8-xen (olio-client00)     01/16/10
>>>>>>>>>>>
>>>>>>>>>>> 16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
>>>>>>>>>>> 16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48
>>>>>>>>>>> 16:25:08     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     50.50
>>>>>>>>>>> 16:25:09     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     79.21
>>>>>>>>>>> 16:25:10     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     45.54
>>>>>>>>>>> 16:25:11     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     55.45
>>>>>>>>>>>
>>>>>>>>>>> The first line, as well as the timestamp at the beginning of each entry,
>>>>>>>>>>> is messing up the parsing in mpstat.pl (the fields are also different).
>>>>>>>>>>> Any plans to support this?
>>>>>>>>>>>
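One possible workaround for the mpstat format mismatch described above is to normalize the Linux output before it reaches mpstat.pl. The sketch below is Python and is not part of Faban or fenxi; the helper name and the decision to drop the banner line and the leading timestamp are assumptions based only on the sample output quoted above. It does not rename or reorder the differing fields.

```python
import re

# Matches a leading HH:MM:SS timestamp, optionally followed by AM/PM.
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}(\s+[AP]M)?\s+")


def strip_linux_mpstat_prefix(lines):
    """Drop the 'Linux ...' banner and blank lines, and remove leading timestamps.

    Hypothetical helper: it only normalizes the layout shown in the quoted
    sample and returns each remaining line as a list of fields.
    """
    cleaned = []
    for line in lines:
        text = line.rstrip("\n")
        if not text.strip() or text.startswith("Linux "):
            continue  # skip blank lines and the kernel/host banner
        cleaned.append(TIMESTAMP.sub("", text).split())
    return cleaned


sample = [
    "Linux 2.6.18.8-xen (olio-client00)     01/16/10",
    "",
    "16:25:06     CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s",
    "16:25:07     all    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00     52.48",
]
rows = strip_linux_mpstat_prefix(sample)
header, first = rows[0], rows[1]
print(dict(zip(header, first)))
```

Pairing the header with each data row by position keeps the sketch independent of which columns a given mpstat version prints.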
>>>>>>>>>>> 3) Scaling questions.
>>>>>>>>>>> - So far I have not had a single experiment pass. Some are
>>>>>>>>>>> pretty close, with only one metric check failing:
>>>>>>>>>>>
>>>>>>>>>>> Average images loaded per Home Page    2.79    >= 3    FAILED
>>>>>>>>>>>
>>>>>>>>>>> Any ideas? Is it the case that the disk is not fast enough? I am
>>>>>>>>>>> just using the local filesystem for the filestore.
>>>>>>>>>>>
>>>>>>>>>>> - As I double the number of concurrent users I observe linear
>>>>>>>>>>> scaling in the throughput.
>>>>>>>>>>> Con Users    Throughput
>>>>>>>>>>>     25          4.967
>>>>>>>>>>>     50         10.06
>>>>>>>>>>>    100         19.375
>>>>>>>>>>>    200         40.21
>>>>>>>>>>>    400         75.818
>>>>>>>>>>>    800          0.383
>>>>>>>>>>>   1000          0.483
>>>>>>>>>>>
>>>>>>>>>>> The linear scaling stops at 400 concurrent users (only one
>>>>>>>>>>> agent). Actually, it would be exactly linear (a value of ~80), but almost
>>>>>>>>>>> half of the login operations failed. I am looking into it.
>>>>>>>>>>> Any insights on what might be the first thing failing?
>>>>>>>>>>>
>>>>>>>>>>> For the 800 and 1000 experiments there are no failed operations
>>>>>>>>>>> logged. It looks like those are being discarded... (?)
>>>>>>>>>>>
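To make the scaling numbers above easier to compare, here is a small Python sketch that computes throughput per concurrent user from the table as quoted. Comparing each run against the 25-user run and the 10% tolerance are assumptions made for illustration; neither is an Olio pass criterion.

```python
# (concurrent users, throughput) pairs copied from the table quoted above.
runs = [(25, 4.967), (50, 10.06), (100, 19.375), (200, 40.21),
        (400, 75.818), (800, 0.383), (1000, 0.483)]

# Baseline: throughput per user at the smallest load.
base_users, base_tput = runs[0]
baseline = base_tput / base_users

TOLERANCE = 0.10  # assumed threshold: flag runs more than 10% below linear


def scaling_report(runs, baseline, tolerance):
    """Return (users, per-user throughput, within_tolerance) for each run."""
    report = []
    for users, tput in runs:
        per_user = tput / users
        report.append((users, per_user, per_user >= baseline * (1 - tolerance)))
    return report


report = scaling_report(runs, baseline, TOLERANCE)
for users, per_user, ok in report:
    label = "linear" if ok else "BELOW linear"
    print(f"{users:5d} users  {per_user:.4f} per user  {label}")
```

With these inputs the 400-user run is still within the assumed tolerance, while the 800- and 1000-user runs fall far below it, which matches the collapse visible in the table.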
>>>>>>>>>>> Bonus question:
>>>>>>>>>>> In the runtime statistics
>>>>>>>>>>> <runtimeStats enabled="true">
>>>>>>>>>>>          <interval>30</interval>
>>>>>>>>>>>  </runtimeStats>
>>>>>>>>>>>
>>>>>>>>>>> only the 90th-percentile response time is reported. Is there an easy way
>>>>>>>>>>> to also report the 99th percentile (or do I need to add code for that)?
>>>>>>>>>>>
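On the 99th-percentile question: if the raw per-operation response times can be obtained (that is an assumption; this does not describe a Faban option), both percentiles can be computed offline. Below is a minimal Python sketch using the nearest-rank definition; the sample values are made up for illustration and are not measured data.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Illustrative response times in seconds (not measured data).
response_times = [0.12, 0.15, 0.11, 0.30, 0.14, 0.13, 2.50, 0.16, 0.18, 0.17]
p90 = percentile(response_times, 90)
p99 = percentile(response_times, 99)
print(p90, p99)
```

Nearest-rank always returns an observed sample, so a single slow outlier shows up in the 99th percentile even when the 90th looks healthy.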
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot again in advance.
>>>>>>>>>>> -VK
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

--00504502e31a533104047f6fa728
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

If you want to run multiple webservers on different systems, you must have =
access to the filestore from all of them. The easiest way to do this is to =
nfs-mount the filestore from the server it resides on so it is accessible t=
o the other machines as well.<div>
<br></div><div>Shanti<br><br><div class=3D"gmail_quote">On Thu, Feb 11, 201=
0 at 9:42 PM, Vasileios Kontorinis <span dir=3D"ltr">&lt;<a href=3D"mailto:=
bkontorinis@gmail.com">bkontorinis@gmail.com</a>&gt;</span> wrote:<br><bloc=
kquote class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #cc=
c solid;padding-left:1ex;">
Shanti hi again, <br>=A0=A0 Sorry for not submitting the JIRA on time, I ha=
ve been extremely busy lately. <br><br>I have a quick question regarding th=
e way the webserver interacts with the filestore. I ran some scaling studie=
s with one, two, and three different servers while having only one filestor=
e (I do specify that in the run.xml configuration file, under webServer and=
 dataStorage). <br>


The filestore is a local folder on one of the server machines. However, in =
the oliophp/etc/config.php I also specify on each server<br><br>$olioconfig=
[&#39;fileSystem&#39;] =3D &#39;LocalFS&#39;;<br>$olioconfig[&#39;localfsRo=
ot&#39;] =3D &#39;/home/gdhiman/filestore&#39;;<br>


<br><div>As a result, I do get WARNINGS for missing files on the webserver =
that do not host a filestore. What is the right configuration for oliophp/e=
tc/config.php? Can I somehow detach the filestore from the webserver so tha=
t it requests files remotely?<div class=3D"im">
<br>

<br>Thanks again.<br clear=3D"all">

-------------------------------------------------------------------<br>Kont=
orinis Vasileios<br>Phd student, University of California San Diego<br>San =
Diego, CA 92122<br>Cell. phone: (858) 717 6899<br><a href=3D"mailto:bkontor=
inis@gmail.com" target=3D"_blank">bkontorinis@gmail.com</a>, <a href=3D"mai=
lto:vkontori@ucsd.edu" target=3D"_blank">vkontori@ucsd.edu</a><br>


-------------------------------------------------------------------<br>
<br><br></div><div class=3D"gmail_quote">2010/2/8 Shanti Subramanyam <span =
dir=3D"ltr">&lt;<a href=3D"mailto:shanti.subramanyam@gmail.com" target=3D"_=
blank">shanti.subramanyam@gmail.com</a>&gt;</span><div><div></div><div clas=
s=3D"h5">
<br><blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(20=
4, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">


<br><br><div class=3D"gmail_quote"><div>On Mon, Feb 8, 2010 at 3:53 PM, Vas=
ileios Kontorinis <span dir=3D"ltr">&lt;<a href=3D"mailto:bkontorinis@gmail=
.com" target=3D"_blank">bkontorinis@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br><div class=3D"gmail_quote"><div><blockquote class=3D"gmail_quote" style=
=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;paddi=
ng-left:1ex"><div class=3D"gmail_quote"><div><br>

We need to look into this issue=A0 - I suspect that something subtle has ch=
anged in 0.2 which hasn&#39;t got accounted for in the expected #images loa=
ded. Can I please request that you file a JIRA on this ? <br></div></div>


</blockquote></div><div><br>How do I do this? Pointers?<br></div></div></bl=
ockquote><div><br></div></div><div><a href=3D"http://issues.apache.org" tar=
get=3D"_blank">http://issues.apache.org</a></div><div><div>=A0</div>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<div class=3D"gmail_quote"><div>I tried runs of 20mins to verify that longe=
r runs will not make it better and it&#39;s still failing for just 50 users=
. </div></div></blockquote><div><br></div></div><div>What worries me is tha=
t you&#39;re saying it =A0fails for 1800 users too - I can understand it ma=
y fail for 50 users, but if it fails for larger #users, then it is a bug.=
=A0</div>


<div>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div class=3D"gmail_quo=
te"><div>=A0</div></div></blockquote><blockquote class=3D"gmail_quote" styl=
e=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padd=
ing-left:1ex">


<div class=3D"gmail_quote"><div>=A0</div><div>and I do get the repetitive p=
atterns you mentioned. However, the cache_MB though never exceeds 0.05...<b=
r>I would expect that memcache size is really important for the application=
 scaling. What is the point of having a separate memcache server if we are =
only using less than 50KB(?) of memory for caching?<br>


<br></div></div></blockquote><div><br></div></div><div>Try running without =
memcached - it can be easily configured in the app&#39;s etc/config.php. Th=
en you will see what difference the cache makes. The reduction in db traffi=
c=
 is dramatic resulting in the response times you see. The reason the size i=
s small is because we are currently only caching the home page which is sha=
red. We have not bothered to implement any additional caching as this level=
 of caching is sufficient to reduce the db load.</div>


<div><br></div><blockquote class=3D"gmail_quote" style=3D"border-left:1px s=
olid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div cla=
ss=3D"gmail_quote"><div>Regards<br><font color=3D"#888888">-VK<br><br></fon=
t></div>


</div></blockquote>
<font color=3D"#888888">
<div>Shanti=A0</div></font><div><div></div><div><blockquote class=3D"gmail_=
quote" style=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt=
 0.8ex;padding-left:1ex"><div class=3D"gmail_quote"><div><font color=3D"#88=
8888">=A0</font></div>


<div><div></div><div><blockquote class=3D"gmail_quote" style=3D"border-left=
:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<div class=3D"gmail_quote"><div><font color=3D"#888888">Shanti<br>

<br></font></div><div><div></div><div><blockquote class=3D"gmail_quote" sty=
le=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;pad=
ding-left:1ex"><div><br></div><div>

Thanks again</div><div><div><div><div>-------------------------------------=
------------------------------<br>Kontorinis Vasileios<br>Phd student, Univ=
ersity of California San Diego<br>
San Diego, CA 92122<br>Cell. phone: (858) 717 6899<br><a href=3D"mailto:bko=
ntorinis@gmail.com" target=3D"_blank">bkontorinis@gmail.com</a>, <a href=3D=
"mailto:vkontori@ucsd.edu" target=3D"_blank">vkontori@ucsd.edu</a><br>-----=
--------------------------------------------------------------<br>


<br><br><div class=3D"gmail_quote">2010/1/27 Shanti Subramanyam <span dir=
=3D"ltr">&lt;<a href=3D"mailto:shanti.subramanyam@gmail.com" target=3D"_bla=
nk">shanti.subramanyam@gmail.com</a>&gt;</span><br><blockquote class=3D"gma=
il_quote" style=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt =
0pt 0.8ex;padding-left:1ex">


Yes - these are problems that I&#39;m already aware of.<div>The best soluti=
on to the filestore issue is to change ownership of the directory to the sa=
me user/group as the apache process. We could have the fileloader.sh change=
 write access I guess, but since that&#39;s a big security hole, we may not=
 want to do that automatically without letting the user know about it.</div=
>


<div><br></div><div>The fact that your response times are so high indicat=
es =
that you&#39;re running a far larger load than the system can handle and/or=
 you still need some tuning.=A0</div><div>I suggest you start over from say=
 100 users and see at what point your response times start getting really l=
arge. The apache error log should be pulled in as part of the &#39;Statisti=
cs&#39; tab, so do keep monitoring that.</div>


<div><br></div><div><font color=3D"#888888">Shanti</font><div><div></div><d=
iv><br><br><div class=3D"gmail_quote">On Wed, Jan 27, 2010 at 1:34 AM, Vasi=
leios Kontorinis <span dir=3D"ltr">&lt;<a href=3D"mailto:bkontorinis@gmail.=
com" target=3D"_blank">bkontorinis@gmail.com</a>&gt;</span> wrote:<br>


<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Shanti hi again,<br>=A0=A0 I checked my apache logs and there were a bunch =
of errors.<br>It looks like there are some issues with the webapp/php/trunk=
/cla=
sses/ImageUtil.php in the last release of olio. (I downloaded <a href=3D"ht=
tp://www.alliedquotes.com/mirrors/apache/incubator/olio/0.2/apache-olio-php=
-src-0.2.tar.gz" target=3D"_blank">http://www.alliedquotes.com/mirrors/apac=
he/incubator/olio/0.2/apache-olio-php-src-0.2.tar.gz</a> )<br>


1) There is a line that needs to be commented out. PHP complains (&quot;1.5=
. Must be greater than zero.&quot;). <br>2) Then, it was complaining that i=
t cannot find the function fastimagecopyresampled. To work around that, I m=
oved the function fastimagecopyresampled above createThumb (this might not=
=A0 be required) and declared it static. <br>


=A0=A0=A0 Finally,=A0 I call the function from createThumb with self::fasti=
magecopyresampled. <br>3) Then, it started complaining because it could not=
 write to the filestore. The problem is that it wants to write the new imag=
es as www-data from Apache, while the filestore does not have write permiss=
ion for others. Manually, <br>


=A0=A0=A0 giving access solves the problem (chmod -R o+w &lt;path&gt;/files=
tore) but since the directories in filestore are generated automatically, m=
aybe the chmod command should be added in fileloader.sh<br><br>Funnily enou=
gh, after fixing those issues, I still cannot pass the:<br>


Average images loaded per Home Page 2.65=A0=A0 &gt;=3D3=A0=A0=A0=A0=A0=A0 F=
AILED<br><br>and on top of that I also have:<br>Response Times (secs)<br>Ad=
dPerson=A0=A0=A0=A0 5.190=A0 13.194=A0 3.387 8.800=A0=A0=A0=A0 3.000 FAILED=
<br>AddEvent=A0=A0=A0=A0=A0=A0 5.904=A0 16.784=A0 3.159 10.400=A0=A0 4.000 =
FAILED<br>


<br>Think times for AddPerson and AddEvent fail as well.<br><br>Any insight=
s=
 are welcome .... :-(<br><br clear=3D"all">--------------------------------=
-----------------------------------<br>Kontorinis Vasileios<br>Phd student,=
 University of California San Diego<br>


San Diego, CA 92122<br>Cell. phone: (858) 717 6899<br><a href=3D"mailto:bko=
ntorinis@gmail.com" target=3D"_blank">bkontorinis@gmail.com</a>, <a href=3D=
"mailto:vkontori@ucsd.edu" target=3D"_blank">vkontori@ucsd.edu</a><br>-----=
--------------------------------------------------------------<br>


<br><br><div class=3D"gmail_quote">2010/1/26 Shanti Subramanyam <span dir=
=3D"ltr">&lt;<a href=3D"mailto:shanti.subramanyam@gmail.com" target=3D"_bla=
nk">shanti.subramanyam@gmail.com</a>&gt;</span><br><blockquote class=3D"gma=
il_quote" style=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt =
0pt 0.8ex;padding-left:1ex">


Yes - 0.2 requires a lot more disk space as we changed the ratio of concurr=
ent users to registered users to 1:100. If you haven&#39;t already, please =
check out our published Blueprints for detailed performance characteristics=
 of the workload:<div>


<a href=3D"http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+Applic=
ations+on+Sun+Servers+and+the+OpenSolaris+Operating+System" target=3D"_blan=
k">Deploying Web 2.0 Applications on Sun Servers and the OpenSolaris Operat=
ing System</a></div>


<div><a href=3D"http://wikis.sun.com/display/BluePrints/Deploying+Web+2.0+A=
pplications+on+Sun+Servers+and+the+OpenSolaris+Operating+System" target=3D"=
_blank"></a><br><div>If you run for long enough, you should get passing run=
s. Have you verified that there are no errors in the run logs when you see =
the &#39;Avg. images loaded per home page&#39; fail ?=A0</div>


<div><br></div><div>On to your open files error =A0- you may have to tune y=
our networking tier and/or #open file descriptors. I don&#39;t believe we h=
ave ever seen as many files open as you are seeing. Can you determine wheth=
er these are from the file store or network ? We also typically run the fil=
estore on a different system and nfs-mount it on the webserver box.</div>


<div>You will have to tune your system to ensure good performance since you=
 will need memory for both apache and files.=A0</div><div><br></div><div><f=
ont color=3D"#888888">Shanti</font><div><div></div><div><br><br>

<div class=3D"gmail_quote">On Mon, Jan 25, 2010 at 5:06 PM, Vasileios Konto=
rinis <span dir=3D"ltr">&lt;<a href=3D"mailto:bkontorinis@gmail.com" target=
=3D"_blank">bkontorinis@gmail.com</a>&gt;</span> wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">Akara and Shanti hi,<br=
>=A0=A0 I did migrate to Olio 0.2. With the last version of Olio I came acr=
oss some new interesting things. <br>


<br>Scaling issues:<br>=A0 - I am still getting the:<br><div></div><table s=
tyle=3D"border:2px solid rgb(204, 204, 204);padding:2px;text-align:center;w=
idth:100%" border=3D"0" cellpadding=3D"4" cellspacing=3D"3">

<tbody><tr><td style=3D"text-align:left">Average images loaded per Home Pag=
e</td><td>2.55</td><td>&gt;=3D 3</td><td><br></td><td style=3D"color:rgb(25=
5, 0, 0)">

FAILED</td></tr></tbody></table><br>=A0- additionally, when I scale the con=
current users to 800 I run out of diskspace since my filestore occupies mor=
e than 62GB.<br>Actually for 600 users it occupies 50GB. I was curious if t=
hat makes sense. How much space will I need to reach 1000 users?<br>


In the php_setup.html it suggests that we will need 50GB but apparently we =
need way more for large number of users. <br><br>=A0- Finally and most impo=
rtantly, for 600 users many of the operations fail with the exception:<br>


<span style=3D"font-weight:bold">Message:</span>
<table style=3D"border:2px solid rgb(204, 204, 204);padding:2px;text-align:=
left;width:100%" border=3D"0" cellpadding=3D"4" cellspacing=3D"3">
<tbody><tr><td>java.net.SocketException: Too many open files</td></tr></tbo=
dy></table>
<br><span style=3D"font-weight:bold">Stack Trace:</span><div>
</div><table style=3D"border:2px solid rgb(204, 204, 204);padding:2px;text-=
align:center;width:100%" border=3D"0" cellpadding=3D"4" cellspacing=3D"3"><=
tbody><tr>
<th style=3D"text-align:left">Class</th>
<th>Method</th>
<th>Line</th></tr>
<tr><td style=3D"text-align:left">java.net.PlainSocketImpl</td>
<td>socketAccept</td>
<td>=A0</td></tr>
<tr><td style=3D"text-align:left">java.net.PlainSocketImpl</td>
<td>accept</td>
<td>390</td></tr>
<tr><td style=3D"text-align:left">java.net.ServerSocket</td>
<td>implAccept</td>
<td>453</td></tr>
<tr><td style=3D"text-align:left">java.net.ServerSocket</td>
<td>accept</td>
<td>421</td></tr>
<tr><td style=3D"text-align:left">sun.rmi.transport.tcp.TCPTransport$Accept=
Loop</td>
<td>executeAcceptLoop</td>
<td>369</td></tr>
<tr><td style=3D"text-align:left">sun.rmi.transport.tcp.TCPTransport$Accept=
Loop</td>
<td>run</td>
<td>341</td></tr><tr><td style=3D"text-align:left">java.lang.Thread</td>
<td>run</td>
<td>619</td></tr></tbody></table><br>or<br><br>java.net.SocketException: To=
o many open files
<br><span style=3D"font-weight:bold">Stack Trace:</span><div></div><div></d=
iv><div>
</div><div></div><div></div><div></div><div></div><div></div><div>
</div><div></div><div></div><table style=3D"border:2px solid rgb(204, 204, =
204);padding:2px;text-align:center;width:100%" border=3D"0" cellpadding=3D"=
4" cellspacing=3D"3"><tbody><tr>
<th style=3D"text-align:left">Class</th>
<th>Method</th>
<th>Line</th></tr>
<tr><td style=3D"text-align:left">java.net.Socket</td>
<td>createImpl</td>
<td>394</td></tr>
<tr><td style=3D"text-align:left">java.net.Socket</td>
<td>getImpl</td>
<td>457</td></tr>
<tr><td style=3D"text-align:left">java.net.Socket</td>
<td>bind</td>
<td>571</td></tr>
<tr><td style=3D"text-align:left">com.sun.faban.driver.transport.hc3.Protoc=
olTimedSocketFactory</td>
<td>createSocket</td>
<td>60</td></tr>
<tr><td style=3D"text-align:left">org.apache.commons.httpclient.HttpConnect=
ion</td>
<td>open</td>
<td>707</td></tr>
<tr><td style=3D"text-align:left">org.apache.commons.httpclient.HttpMethodD=
irector</td>
<td>executeWithRetry</td>
<td>387</td></tr><tr><td style=3D"text-align:left">org.apache.commons.httpc=
lient.HttpMethodDirector</td>
<td>executeMethod</td>
<td>171</td></tr>
<tr><td style=3D"text-align:left">org.apache.commons.httpclient.HttpClient<=
/td>
<td>executeMethod</td>
<td>397</td></tr>
<tr><td style=3D"text-align:left">org.apache.commons.httpclient.HttpClient<=
/td>
<td>executeMethod</td>
<td>323</td></tr>
<tr><td style=3D"text-align:left">com.sun.faban.driver.transport.hc3.Apache=
HC3Transport</td>
<td>readURL</td>
<td>274</td></tr>
<tr><td style=3D"text-align:left">org.apache.olio.workload.driver.UIDriver<=
/td>
<td>doLogin</td>
<td>398</td></tr>
<tr><td style=3D"text-align:left">org.apache.olio.workload.driver.UIDriver<=
/td>
<td>doLogin</td>
<td>424</td></tr>
<tr><td style=3D"text-align:left">sun.reflect.GeneratedMethodAccessor8</td>
<td>invoke</td>
<td>=A0</td></tr><tr><td style=3D"text-align:left">sun.reflect.DelegatingMe=
thodAccessorImpl</td>
<td>invoke</td>
<td>25</td></tr>
<tr><td style=3D"text-align:left">java.lang.reflect.Method</td>
<td>invoke</td>
<td>597</td></tr>
<tr><td style=3D"text-align:left">com.sun.faban.driver.engine.TimeThread</t=
d>
<td>doRun</td>
<td>169</td></tr>
<tr><td style=3D"text-align:left">com.sun.faban.driver.engine.AgentThrea</t=
d></tr></tbody></table><br><br>I am monitoring the number of open files in =
the web-server with=A0=A0 `watch &quot;lsof | wc&quot;` and the olio starts=
 failing when around 65000-70,000 files are open. lsof shows that for each =
apache2 thread there are around 100 files open. Therefore there are around =
650-700 different apache2 threads that create the bulk of those open file d=
escriptors.<br>


The soft and hard limit is set to 403238, which means that there should be =
many more open files before it will start failing. <br>(Actually, I verifie=
d the limit by opening a bunch of files with a python script and it does re=
ach the limitation of 403238.)<br>


Any insights?=A0 Is there any chance that the file descriptors take more ti=
me than usual to be reclaimed after being closed in the Xen VM I use for my=
 web-server? Does it make sense for Olio in the first place to have so many=
 files open at the same time?<br>


<br>Thanks again.<div><br><br clear=3D"all">-------------------------------=
------------------------------------<br>Kontorinis Vasileios<br>Phd student=
, University of California San Diego<br>San Diego, CA 92122<br>
Cell. phone: (858) 717 6899<br>

</div><a href=3D"mailto:bkontorinis@gmail.com" target=3D"_blank">bkontorini=
s@gmail.com</a>, <a href=3D"mailto:vkontori@ucsd.edu" target=3D"_blank">vko=
ntori@ucsd.edu</a><br>-----------------------------------------------------=
--------------<br>


<br><br><div class=3D"gmail_quote">2010/1/16 Shanti Subramanyam <span dir=
=3D"ltr">&lt;<a href=3D"mailto:shanti.subramanyam@gmail.com" target=3D"_bla=
nk">shanti.subramanyam@gmail.com</a>&gt;</span><div><div></div><div><br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">

I would really recommend that you migrate to Olio 0.2. In addition to bug f=
ixes, there are some major features changes in it. See=A0<a href=3D"http://=
perfwork.wordpress.com/2010/01/13/olio-0-2-relesed/" target=3D"_blank">Olio=
 0.2=A0released</a>=A0<div>


<br></div><div><font color=3D"#888888">Shanti</font><div><div></div><div><b=
r><br><div class=3D"gmail_quote">On Sat, Jan 16, 2010 at 4:49 PM, Vasileios=
 Kontorinis <span dir=3D"ltr">&lt;<a href=3D"mailto:bkontorinis@gmail.com" =
target=3D"_blank">bkontorinis@gmail.com</a>&gt;</span> wrote:<br>


<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Akara hi again,<br>=A0=A0 Below I have comments on your suggestions and at =
the end some bonus questions... Thanks again.<br><br><div class=3D"gmail_qu=
ote">2010/1/13 Akara Sucharitakul <span dir=3D"ltr">&lt;<a href=3D"mailto:A=
kara.Sucharitakul@sun.com" target=3D"_blank">Akara.Sucharitakul@sun.com</a>=
&gt;</span><br>


<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">


With your permission, I&#39;d like to copy the Olio and Faban user aliases =
going forward. I feel it will help a much wider audience. Please see below =
for answers/comments:<br>
<br></blockquote><div>Sure. I cced olio user alias. I am not sure which is =
the right faban list.<br>=A0<br></div><blockquote class=3D"gmail_quote" sty=
le=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;pad=
ding-left:1ex">


Vasileios Kontorinis wrote:<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div>
Akara hi,<br>
 =A0 I am a grad student at UCSD and I use Olio for a research project wher=
e we want to measure olio performance under live virtual machine migration.=
 We use ubuntu 8.04 on nehalem servers.<br></div>
I have checked out the latest version of olio from the online svn repositor=
y and do=
wnloaded the last version of faban (faban-kit-101509.tar.gz &lt;<a href=3D"=
http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz" target=3D"_blan=
k">http://faban.sunsource.net/nightly/faban-kit-101509.tar.gz</a>&gt;)<br>


</blockquote>
<br>
101509 is fairly recent. But the latest on the web site is 111109 (Faban 1.=
0). There were just bug fixes between those releases.</blockquote><div><br>=
I have upgraded to Faban 1.0, still using olio1.0 though ( the release of 2=
.0 was announced, will switch to it if I run into bugs that have been fixed=
) <br>


=A0</div><blockquote class=3D"gmail_quote" style=3D"border-left:1px solid r=
gb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div><br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br>
So far, I employed a bunch of hacks to get most of it to work and I am almo=
st there. In the process I got a bunch of questions.<br>
<br>
Questions (some of them might be just faban related, not olio so bear with =
me):<br>
1) Is there any way to deploy OlioDriver.jar through the command line? Fire=
fox through ssh forwarding is dead slow and I&#39;d rather avoid it if I c=
an.<br>
</blockquote>
<br></div>
Just drop the jar into faban/benchmarks/ and it will deploy itself. This is=
 documented at <a href=3D"http://faban.sunsource.net/1.0/docs/guide/harness=
dev/deploybenchmark.html" target=3D"_blank">http://faban.sunsource.net/1.0/=
docs/guide/harnessdev/deploybenchmark.html</a> under &quot;Alternate Deploy=
ment Methods.&quot;<div>


<br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
2) The services ApacheHttpdService, MemcachedService, MySQLService that com=
e with Faban should be deployed before running Olio?<br>
 =A0 =A0I was getting some very weird errors. e.g.<br>
</blockquote>
<br></div>
Yes, you should. Olio will search for those.<div><div></div><div><br></div>=
</div></blockquote><div>Done<br>=A0</div><blockquote class=3D"gmail_quote" =
style=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;=
padding-left:1ex">


<div><div>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br>
03:50:27:WARNING:UIDriverAgent[0]: Forcefully terminating benchmark run<br>
03:50:27:WARNING:UIDriverAgent[0]: 25 threads forcefully terminated.<br>
java.lang.Throwable: Stack of non-terminating thread.<br>
 =A0 =A0at java.net.SocketInputStream.socketRead0 (null)<br>
 =A0 =A0at java.net.SocketInputStream.read (129)<br>
 =A0 =A0at java.io.FilterInputStream.read (116)<br>
 =A0 =A0at com.sun.faban.driver.transport.util.TimedInputStream.read (139)<=
br>
 =A0 =A0at java.io.BufferedInputStream.fill (218)<br>
 =A0 =A0at java.io.BufferedInputStream.read (237)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpParser.readRawLine (78)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpParser.readLine (106)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpConnection.readLine (1116)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpMethodBase.readStatusLine (197=
3)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpMethodBase.readResponse (1735)=
<br>
 =A0 =A0at org.apache.commons.httpclient.HttpMethodBase.execute (1098)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetr=
y (398)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpMethodDirector.executeMethod (=
171)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpClient.executeMethod (397)<br>
 =A0 =A0at org.apache.commons.httpclient.HttpClient.executeMethod (323)<br>
 =A0 =A0at com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (=
529)<br>
 =A0 =A0at com.sun.faban.driver.transport.hc3.ApacheHC3Transport.fetchURL (=
552)<br>
 =A0 =A0at org.apache.olio.workload.driver.UIDriver.doHomePage (355)<br>
 =A0 =A0at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)<br>
 =A0 =A0at sun.reflect.NativeMethodAccessorImpl.invoke (39)<br>
 =A0 =A0at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)<br>
 =A0 =A0at java.lang.reflect.Method.invoke (597)<br>
 =A0 =A0at com.sun.faban.driver.engine.TimeThread.doRun (169)<br>
 =A0 =A0at com.sun.faban.driver.engine.AgentThread.run (202)<br>
<br>
and afterwards the master was waiting for threads to join for ever... (I at=
tached gdb to verify that something was wrong) and hence I had to kill the =
benchmark.<br>
</blockquote>
<br></div></div>
These threads hang while reading server responses that never came.<di=
v><br></div></blockquote><div><br>Building the services from Faban probably=
 fixes it. <br><br>=A0</div><blockquote class=3D"gmail_quote" style=3D"bord=
er-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:=
1ex">


<div>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br>
In the Olio log there are WARNINGS =A0complaining about not deploying those=
. After building those and manually copying them to /faban/services (ant de=
ploy did not place them there... :-( =A0)<br>
</blockquote>
<br></div>
Yes. But ant deploy should get them there. If not, can you please let me kn=
ow the ant messages?</blockquote><div>=A0<br>Ant was deploying them indeed.=
 I had a mistake in build.properties.<br>I had:=A0 faban.url=3Dhttp://&l=
t;hostname&gt;:9980/=A0=A0 instead of=A0 faban.url=3D<a href=3D"http://loca=
lhost:9980/" target=3D"_blank">http://localhost:9980/</a><br>


After I changed that it started working... <br>=A0</div><blockquote class=
=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 204, 204);margin:0=
pt 0pt 0pt 0.8ex;padding-left:1ex"><div>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
it worked. (mostly worked)<br>
<br>
3) I still have warnings like:<br>
01:38:08:INFO:Time difference to host olio-web is 269 ms. Attempting to set=
 clock.<br>
01:38:08:INFO:Time difference to host olio-db is 263 ms. Attempting to set =
clock.<br>
</blockquote>
<br></div>
These two are OK. Just trying to do a clock sync between the systems.<div><=
br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
01:38:08:WARNING:olio-web wakeup-before time reached 700ms limit. System is=
 too busy. Giving up.<br>
</blockquote>
<br></div>
This is one of Faban&#39;s clock-setting calibrations. If the system is too=
 busy, or you run on some virtualization architectures, the lag between the=
 intended end of a sleep and the moment the thread actually wakes up (gets =
scheduled and executed) becomes too high, and the calibration fails.<div>


<br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
01:38:08:INFO:Time difference to host olio-mem is 262 ms. Attempting to set=
 clock.<br>
01:38:10:WARNING:olio-db wakeup-before time reached 700ms limit. System is =
too busy. Giving up.<br>
09:38:09:WARNING:[date, -u, 011309382010.10]<br>
stderr:<br>
date: cannot set date: Operation not permitted<br>
09:38:09:WARNING:Error on &quot;[date, -u, 011309382010.10]&quot; command t=
rying to set the date. Exit value: 1<br>
09:38:10:WARNING:[date, -u, 011309382010.11]<br>
stderr:<br>
date: cannot set date: Operation not permitted<br>
09:38:10:WARNING:Error on &quot;[date, -u, 011309382010.11]&quot; command t=
rying to set the date. Exit value: 1<br>
09:38:09:WARNING:[date, -u, 011309382010.10]<br>
stderr:<br>
date: cannot set date: Operation not permitted<br>
09:38:09:WARNING:Error on &quot;[date, -u, 011309382010.10]&quot; command t=
rying to set the date. Exit value: 1<br>
09:38:10:WARNING:[date, -u, 011309382010.11]<br>
stderr:<br>
date: cannot set date: Operation not permitted<br>
<br>
Letting Faban change the VM clock sounds like a bad idea from the start.<br>
</blockquote>
<br></div>
OK. So it is xen. Yes, this is what Faban is trying to solve. You can certa=
inly turn it off. Please see:<br>
 =A0 <a href=3D"http://faban.sunsource.net/1.0/docs/howdoi/physclocksync.ht=
ml" target=3D"_blank">http://faban.sunsource.net/1.0/docs/howdoi/physclocks=
ync.html</a><div><br></div></blockquote><=
div><br>I added the=A0 <span style=3D"font-family:monospace">&lt;</span><sp=
an style=3D"font-family:monospace">fh:timeSync</span><span style=3D"font-fa=
mily:monospace">&gt;<span style=3D"font-weight:bold">false</span>&lt;fh:tim=
eSync&gt; in my run.xml file ( btw in the link above there is a mistake :=
=A0 </span><span style=3D"font-family:monospace">&lt;</span><span style=3D"=
font-family:monospace">fh:timeSync</span><span style=3D"font-family:monospa=
ce">&gt;<span style=3D"font-weight:bold">false</span>&lt;/fh:timeSync&gt;</=
span> is correct, the second &lt;fh:timeSync&gt; needs a closing tag, the &=
quot;/&quot; is missing)<br>


that made the warnings go away.<br><br>=A0</div><blockquote class=3D"gmail_=
quote" style=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0pt=
 0.8ex;padding-left:1ex"><div>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Unfortunately, xen is really bad at maintaining an accurate clock. As a res=
ult there is usually time difference between the different virtual machines=
<br>
of more than 10ms. I went over the setTime function in Faban source (/faban=
/com/sun/faban/harness/agent/CmdAgentImpl.java), it&#39;s big and ugly (ver=
y ugly)<br>
</blockquote>
<br></div>
Thanks for the compliments! I think you mean CmdService.setClockTask. Time-=
sensitive code ain&#39;t pretty. The complexity comes from dealing with the=
 clock while trying to achieve good accuracy. If you think you can simplify=
 this without losing accuracy, I&#39;m listening. In comparison, CmdAgentIm=
pl has nothing.<div>


<br></div></blockquote><div><br>Yes, you&#39;re right, it is CmdService.set=
ClockTask. The previous email was composed at 3am ... :-)<br>I am still a =
little confused.=A0 setClockTask is used to set the clock so that all the =
machines are synchronized with the master. From what you mentioned, the =
physical clock sync is only used for the logs.<br>


Why do we need to do that, given that 1) it requires root privileges (which=
 might not always be available) and 2) I could imagine an alternative that=
 uses deltas from the actual physical clock without having to set it? <br>=
(I am probably missing something... :-)<br>


<br><br></div><blockquote class=3D"gmail_quote" style=3D"border-left:1px so=
lid rgb(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Why there is this strict requirement for 10ms difference? Any ideas?<br>
</blockquote>
<br></div>
It is easily achievable in most cases. May not be true for VMs.<br>
<br>
On some VM architectures, however, the OS does not get scheduled until well=
 after that, which causes problems. You may be able to measure performance=
 on those VMs, but you don&#39;t want to use such a VM as a driver: your re=
sponse time measurements will be way off.<br>


<br>
The physical clock sync is not really rigorous. And you can turn it off. It=
 is more to keep the systems in good time sync. If your VM stands in the wa=
y, just turn it off. The driver&#39;s virtual clock sync is much more picky=
 in comparison. This is because the start time for the steady state should =
be the same (with a very small tolerance) no matter how many drivers are dr=
iving. Otherwise the measurement period won&#39;t be the same when viewed f=
rom different drivers and the results won&#39;t be reliable.<div>


<br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Even with ntp it&#39;s hard to provide the 10ms guarantee.<br>
</blockquote>
<br></div>
That&#39;s why we don&#39;t use ntp ;-)</blockquote><div><br>Just out of cu=
riosity, the physical clocks are set only once at the beginning (right?), s=
o for long runs the 10ms difference will not be guaranteed, will it? Especi=
ally under VMs I&#39;ve seen significant clock differences within a few min=
utes.=A0 <br>


At least ntp can periodically resync (of course doing so, might screw up th=
e logs with time going backwards etc)<br>=A0</div><blockquote class=3D"gmai=
l_quote" style=3D"border-left:1px solid rgb(204, 204, 204);margin:0pt 0pt 0=
pt 0.8ex;padding-left:1ex">


<div><br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
I am thinking of modifying this function to always return that the time dif=
ference is less than 10ms (so that I do not have to wait all the time for t=
he timeouts.)<br>
</blockquote>
<br></div>
Why bother. Don&#39;t like it, just turn it off. It has good use in most co=
nfigurations we&#39;re dealing with. And, it avoids ntp inaccuracies.<div><=
br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
Will this break anything in Olio?<br>
</blockquote>
<br></div>
Nope. Except the times in your logs will appear out of sequence. They rely =
on the local time on the originating systems.<br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br>
4) Warning like:<br>
09:39:48:WARNING:Image at <a href=3D"http://olio-web:80/fileService.php?cac=
he=3Dfalse&amp;file=3De168t.jpg" target=3D"_blank">http://olio-web:80/fileS=
ervice.php?cache=3Dfalse&amp;file=3De168t.jpg</a> &lt;<a href=3D"http://oli=
o-web:80/fileService.php?cache=3Dfalse&amp;file=3De168t.jpg" target=3D"_bla=
nk">http://olio-web:80/fileService.php?cache=3Dfalse&amp;file=3De168t.jpg</=
a>&gt; size of 249 bytes is too small. Image may not exist<br>


can be ignored, right?<br>
</blockquote>
<br>
Well, something is wrong. We don&#39;t have images that small. Check whethe=
r e168t.jpg is really that small. That&#39;s why we have that warning.<div>=
<br></div></blockquote><div><br><br>It&#39;s kind of funny: my problem was =
that I had the olio webkit version installed and then downloaded the versi=
on from the online svn repository. I built the driver but forgot to update=
 the web pages for my Apache server, which, as expected, was the source of=
 many of my issues. <br><br><br>=A0</div><blo=
ckquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 204, =
204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br>
5) Last and most important.<br>
I can run the benchmark and all the operations succeed except login.<br>
I get a bunch of:<br>
<br>
09:39:25:WARNING:UIDriverAgent[0].2.doLogin: Found login prompt at index 29=
26, Login as at786o08x, 2178 failed.<br>
Note: Error not counted in result.<br>
Either transaction start or end time is not within steady state.<br>
java.lang.RuntimeException: Found login prompt at index 2926, Login as at78=
6o08x, 2178 failed.<br>
 =A0 =A0at org.apache.olio.workload.driver.UIDriver.doLogin (404)<br>
 =A0 =A0at sun.reflect.NativeMethodAccessorImpl.invoke0 (null)<br>
 =A0 =A0at sun.reflect.NativeMethodAccessorImpl.invoke (39)<br>
 =A0 =A0at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)<br>
 =A0 =A0at java.lang.reflect.Method.invoke (597)<br>
 =A0 =A0at com.sun.faban.driver.engine.TimeThread.doRun (169)<br>
 =A0 =A0at com.sun.faban.driver.engine.AgentThread.run (202)<br>
<br>
Any ideas? I do get<br>
</blockquote>
<br></div>
You likely have cookie issues. It can&#39;t seem to hold on to a session.<d=
iv><br>
<br></div></blockquote><div><br>Well, there was a permission issue with the=
 http_session dir: I could not write to it. Running chmod 777 on it fixed t=
his. <br><br=
></div><blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb=
(204, 204, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">


<br><div><br>
<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex">
<br>
(I&#39;ve found online: <a href=3D"http://www.mail-archive.com/olio-dev@inc=
ubator.apache.org/msg00647.html" target=3D"_blank">http://www.mail-archive.=
com/olio-dev@incubator.apache.org/msg00647.html</a>, which is similar, but =
when I added<br>


<br>
com.sun.faban.driver.transport.sunhttp.ThreadCookieHandler.level=3DFINER =
=A0in build.properties <br>
I did not see any cookie-related warnings. Those should appear in the Olio =
run log or the Apache log, right? Am I just looking in the wrong place?)<br>
</blockquote>
<br></div>
Yes, that&#39;s applicable only to the Sun Http Transport. The version of O=
lio you&#39;re using is based on the Apache Http Transport (Apache HttpClie=
nt 3.1). The ThreadCookieHandler is not used for the Apache transport, and =
that&#39;s why you don&#39;t see any logs. Try upgrading to Faban 1.0 befor=
e looking at other things.<br>


<br>
<blockquote class=3D"gmail_quote" style=3D"border-left:1px solid rgb(204, 2=
04, 204);margin:0pt 0pt 0pt 0.8ex;padding-left:1ex"><div>
<br>
<br>
It&#39;s a long email I know. Your feedback would be most appreciated.<br>
<br>
-Regards<br>
-------------------------------------------------------------------<br>
Kontorinis Vasileios<br>
Phd student, University of California San Diego<br>
San Diego, CA 92122<br>
Cell. phone: (858) 717 6899<br>
</div><a href=3D"mailto:bkontorinis@gmail.com" target=3D"_blank">bkontorini=
s@gmail.com</a> &lt;mailto:<a href=3D"mailto:bkontorinis@gmail.com" target=
=3D"_blank">bkontorinis@gmail.com</a>&gt;, <a href=3D"mailto:vkontori@ucsd.=
edu" target=3D"_blank">vkontori@ucsd.edu</a> &lt;mailto:<a href=3D"mailto:v=
kontori@ucsd.edu" target=3D"_blank">vkontori@ucsd.edu</a>&gt;<br>


-------------------------------------------------------------------\<br>
</blockquote>
<br>
Thanks for all the questions/comments.<br><font color=3D"#888888">
<br>
-Akara<br>
<br>
</font></blockquote></div><br><br>And now some more questions/ comments:<br=
>1) I get the following error:<br><br>15:13:05:SEVERE:CmdService: Getting -=
 exception reading /usr/data/olio-db.err<br>java.io.FileNotFoundException: =
File /usr/data/olio-db.err does not exist.<br>


=A0=A0=A0 at com.sun.faban.common.FileTransfer.&lt;init&gt; (70)<br>=A0=A0=
=A0 at com.sun.faban.harness.agent.FileAgentImpl.get (315)<br>=A0=A0=A0 at =
sun.reflect.NativeMethodAccessorImpl.invoke0 (null)<br>=A0=A0=A0 at sun.ref=
lect.NativeMethodAccessorImpl.invoke (39)<br>


=A0=A0=A0 at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)<br>=A0=A0=
=A0 at java.lang.reflect.Method.invoke (597)<br>=A0=A0=A0 at sun.rmi.server=
.UnicastServerRef.dispatch (305)<br>=A0=A0=A0 at sun.rmi.transport.Transpor=
t$1.run (159)<br>=A0=A0=A0 at java.security.AccessController.doPrivileged (=
null)<br>


=A0=A0=A0 at sun.rmi.transport.Transport.serviceCall (155)<br>=A0=A0=A0 at =
sun.rmi.transport.tcp.TCPTransport.handleMessages (535)<br>=A0=A0=A0 at sun=
.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0 (790)<br>=A0=A0=A0 a=
t sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run (649)<br>


=A0=A0=A0 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask (885)<b=
r>=A0=A0=A0 at java.util.concurrent.ThreadPoolExecutor$Worker.run (907)<br>=
=A0=A0=A0 at java.lang.Thread.run (619)<br>=A0=A0=A0 at sun.rmi.transport.S=
treamRemoteCall.exceptionReceivedFromServer (255)<br>


=A0=A0=A0 at sun.rmi.transport.StreamRemoteCall.executeCall (233)<br>=A0=A0=
=A0 at sun.rmi.server.UnicastRef.invoke (142)<br>=A0=A0=A0 at com.sun.faban=
.harness.agent.FileAgentImpl_Stub.get (null)<br>=A0=A0=A0 at com.sun.faban.=
harness.engine.CmdService.get (1334)<br>


=A0=A0=A0 at com.sun.faban.harness.RunContext.getFile (346)<br>=A0=A0=A0 at=
 com.sun.services.MySQLService.getLogs (197)<br>=A0=A0=A0 at sun.reflect.Na=
tiveMethodAccessorImpl.invoke0 (null)<br>=A0=A0=A0 at sun.reflect.NativeMet=
hodAccessorImpl.invoke (39)<br>


=A0=A0=A0 at sun.reflect.DelegatingMethodAccessorImpl.invoke (25)<br>=A0=A0=
=A0 at java.lang.reflect.Method.invoke (597)<br>=A0=A0=A0 at com.sun.faban.=
harness.util.Invoker.invoke (98)<br>=A0=A0=A0 at com.sun.faban.harness.serv=
ices.ServiceWrapper.getLogs (200)<br>


=A0=A0=A0 at com.sun.faban.harness.services.ServiceManager.getLogs (642)<br=
>=A0=A0=A0 at com.sun.faban.harness.engine.GenericBenchmark.start (323)<br>=
=A0=A0=A0 at com.sun.faban.harness.engine.RunDaemon.run (338)<br>=A0=A0=A0 =
at java.lang.Thread.run (619)<br>


15:13:05:WARNING:Could not copy /usr/data/olio-db.err to /home/gdhiman/faba=
n.1.0/faban/output/OlioDriver.2D/mysql_err.log.olio-db<br><br>Apparently so=
mething is misconfigured in my db-server. Any ideas? <br><br>2) I get the f=
ollowing error:<br>


15:13:16:WARNING:[/home/gdhiman/faban.1.0/faban/bin/fenxi, process, /home/g=
dhiman/faban.1.0/faban/output/OlioDriver.2D/, /home/gdhiman/faban.1.0/faban=
/output/OlioDriver.2D//post/, OlioDriver.2D]<br>stderr:<br>Error in executi=
ng perl /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/txt2db/<a href=
=3D"http://mpstat.pl" target=3D"_blank">mpstat.pl</a><br>


Error in executing perl /home/gdhiman/faban.1.0/faban/master/webapps/fenxi/=
txt2db/<a href=3D"http://mpstat.pl" target=3D"_blank">mpstat.pl</a><br><br>=
Actually, I traced this one back. The problem is the difference in output f=
ormat between Sun&#39;s mpstat and the default Linux (sysstat) mpstat. <br>


This is my output of my mpstat:<br><br>gdhiman@olio-client00:~/faban.1.0/fa=
ban/output/OlioDriver.2D$ mpstat 1=A0=A0=A0 <br>Linux 2.6.18.8-xen (olio-cl=
ient00) =A0=A0=A0 01/16/10<br><br>16:25:06=A0=A0=A0=A0 CPU=A0=A0 %user=A0=
=A0 %nice=A0=A0=A0 %sys %iowait=A0=A0=A0 %irq=A0=A0 %soft=A0 %steal=A0=A0 %=
idle=A0=A0=A0 intr/s<br>


16:25:07=A0=A0=A0=A0 all=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0=
 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0 100.00=A0=A0=A0=A0 52.48=
<br>16:25:08=A0=A0=A0=A0 all=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=
=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0 100.00=A0=A0=A0=A0=
 50.50<br>16:25:09=A0=A0=A0=A0 all=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.0=
0=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0 100.00=A0=A0=
=A0=A0 79.21<br>


16:25:10=A0=A0=A0=A0 all=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0=
 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0 100.00=A0=A0=A0=A0 45.54=
<br>16:25:11=A0=A0=A0=A0 all=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=
=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0=A0=A0 0.00=A0 100.00=A0=A0=A0=A0=
 55.45<br><br>The first line, as well as the time at the beginning of each =
entry, messes up the parsing in <a href=3D"http://mpstat.pl" target=3D"_bla=
nk">mpstat.pl</a> (the fields are also different). =A0 Any plans to support=
 this?<br>
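<br>To make the format difference concrete, here is a small Python sketch of how the sysstat-style output shown above could be parsed: skip the kernel banner, take column names from the row containing CPU, and drop the leading timestamp from each data row. It is an illustration only, not the logic of Fenxi's mpstat.pl; the function name is hypothetical and the sample is abbreviated from the output above with single spaces between fields.<br>
<pre>
```python
SAMPLE = """\
Linux 2.6.18.8-xen (olio-client00) 01/16/10

16:25:06 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
16:25:07 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 52.48
16:25:08 all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00 50.50
"""


def parse_sysstat_mpstat(text):
    """Return a list of dicts, one per sample row."""
    columns = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Linux":
            continue  # blank line or the kernel banner
        if "CPU" in fields:
            # Header row: drop the timestamp, keep column names.
            columns = fields[fields.index("CPU"):]
            continue
        if columns is None:
            continue  # data before any header; ignore it
        # Data row: timestamp first, then one value per column.
        values = fields[len(fields) - len(columns):]
        row = {"time": fields[0], "cpu": values[0]}
        for name, value in zip(columns[1:], values[1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


rows = parse_sysstat_mpstat(SAMPLE)
print(rows[0]["%idle"], rows[0]["intr/s"])
```
</pre>
Taking the values from the end of each row also copes with timestamps that carry an AM/PM field, which some sysstat locales print.<br>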


<br>3) Scaling questions.<br>- So far I did not have a single experiment pa=
ssing. Some are pretty close with only one metric check failing. <br><br><t=
able style=3D"border:2px solid rgb(204, 204, 204);padding:2px;text-align:ce=
nter;width:100%" border=3D"0" cellpadding=3D"4" cellspacing=3D"3">


<tbody><tr><td style=3D"text-align:left">Average images loaded per Home Pag=
e</td><td>2.79</td><td>&gt;=3D 3</td><td><br></td><td style=3D"color:rgb(25=
5, 0, 0)">

FAILED</td></tr></tbody></table><br>Any ideas? Is it the case that the disk=
 is not fast enough? I am just using the local filesystem for the filestore=
.<br><br>- As I double the number of concurrent users I observe linear scal=
ing in the throughput. <br>


<table border=3D"0" cellpadding=3D"4" cellspacing=3D"3"><tbody>
<tr><td>Concurrent users</td><td>Throughput</td></tr>
<tr><td>25</td><td>4.967</td></tr>
<tr><td>50</td><td>10.06</td></tr>
<tr><td>100</td><td>19.375</td></tr>
<tr><td>200</td><td>40.21</td></tr>
<tr><td>400</td><td>75.818</td></tr>
<tr><td>800</td><td>0.383</td></tr>
<tr><td>1000</td><td>0.483</td></tr>
</tbody></table><=
br>The linear scaling stops for 400 concurrent users ( only one agent). Act=
ually it would be exactly linear (value of ~80) but almost half of the logi=
n operations failed. I am looking into it.<br>
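<br>The arithmetic behind the ~80 figure can be spelled out with a short Python sketch: take the per-user throughput of the smallest run as the baseline and compare every other run against a straight-line projection. The numbers are the ones listed above; the function name and the 10 percent tolerance are arbitrary choices for illustration.<br>
<pre>
```python
RUNS = [
    (25, 4.967), (50, 10.06), (100, 19.375), (200, 40.21),
    (400, 75.818), (800, 0.383), (1000, 0.483),
]


def scaling_report(runs, tolerance):
    """Compare each run with linear scaling from the first run.

    Returns (users, expected throughput, off_linear) tuples, where
    off_linear is True when the measured throughput falls short of
    the projection by more than the given fraction.
    """
    base_users, base_tput = runs[0]
    per_user = base_tput / base_users
    report = []
    for users, tput in runs:
        expected = per_user * users
        shortfall = 1.0 - tput / expected
        off_linear = shortfall > tolerance
        report.append((users, round(expected, 2), off_linear))
    return report


for users, expected, off in scaling_report(RUNS, 0.10):
    print(users, expected, "OFF LINEAR" if off else "ok")
```
</pre>
With these inputs the 400-user run stays within the tolerance of its projection (a little under 80), while the 800 and 1000 runs are flagged.<br>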


Any insights on what might be the first thing failing?<br><br>For the 800 a=
nd 1000 experiments there are no failed operations logged. It looks like th=
ose are being discarded... (?)<br><br>Bonus question:<br>In the runtime sta=
tistics <br>


<span style=3D"font-family:monospace">&lt;</span><span style=3D"font-family=
:monospace">runtimeStats</span><span style=3D"font-family:monospace">
enabled</span><span style=3D"font-family:monospace">=3D</span><span style=
=3D"font-family:monospace">&quot;<span style=3D"font-weight:bold">true</spa=
n>&quot;</span><span style=3D"font-family:monospace">&gt;</span><br style=
=3D"font-family:monospace">


<span style=3D"font-family:monospace"> =A0
=A0 =A0 =A0 &lt;</span><span style=3D"font-family:monospace">interval</span=
><span style=3D"font-family:monospace">&gt;<span><span style=3D"font-weight=
:bold">30</span></span><span style=3D"font-weight:bold"></span>&lt;/</span>=
<span style=3D"font-family:monospace">interval</span><span style=3D"font-fa=
mily:monospace">&gt;</span><br style=3D"font-family:monospace">


<span style=3D"font-family:monospace"> &lt;/</span><span style=3D"font-fami=
ly:monospace">runtimeStats</span><span style=3D"font-family:monospace">&gt;=
</span><br><br>only the 90th percentile response time is reported. Is there=
 an easy way to also report the 99th percentile (or do I need to add code f=
or that)?<br>
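<br>For reference, if the raw response times are available, both percentiles can be computed after the fact. The Python sketch below uses the nearest-rank definition; it is a generic illustration, not Faban's runtime statistics code, and the sample values are made up.<br>
<pre>
```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at
    least pct percent of all samples are at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Multiply before dividing to keep the rank arithmetic exact
    # for whole-number percentages.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in seconds (not measured data).
times = [0.2, 0.3, 0.25, 0.4, 0.35, 0.3, 0.28, 0.5, 1.9, 0.33]
print(percentile(times, 90), percentile(times, 99))
```
</pre>
With only ten samples the 99th percentile is simply the slowest one, which is why tail percentiles need many more samples per interval than the 90th to be meaningful.<br>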


<br><br>Thanks a lot again in advance.<br><font color=3D"#888888">-VK<br>
</font></blockquote></div><br></div></div></div>
</blockquote></div></div></div><br>
</blockquote></div><br></div></div></div></div>
</blockquote></div><br>
</blockquote></div><br></div></div></div>
</blockquote></div><br>
</div></div></div></div>
</blockquote></div></div></div><br>
</blockquote></div></div></div><br>
</blockquote></div></div></div><br>
</blockquote></div></div></div><br></div>
</blockquote></div><br></div>

--00504502e31a533104047f6fa728--