I am using NFS v3.

What made me suspicious of NFS is a bunch of messages in syslog:
May 22 00:22:37 olio-web -- MARK --
May 22 00:30:29 olio-web kernel: [2167206.713881] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:03 olio-web kernel: [2167240.993893] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:06 olio-web kernel: [2167243.193896] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:07 olio-web kernel: [2167244.349889] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:07 olio-web kernel: [2167244.357893] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:07 olio-web kernel: [2167244.669885] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:10 olio-web kernel: [2167247.781891] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:10 olio-web kernel: [2167247.785889] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:11 olio-web kernel: [2167248.725885] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:13 olio-web kernel: [2167250.153892] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:14 olio-web kernel: [2167251.173886] nfs: server 67.58.51.149 not responding, still trying
May 22 00:31:14 olio-web kernel: [2167251.410700] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411158] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411236] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411249] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411354] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411678] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411691] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411712] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.411723] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.412047] nfs: server 67.58.51.149 OK
May 22 00:31:14 olio-web kernel: [2167251.414051] nfs: server 67.58.51.149 OK
May 22 00:42:37 olio-web -- MARK --
May 22 01:02:37 olio-web -- MARK --
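
To put a number on how long each stall lasts, the messages above could be reduced to stall windows. This is only a rough sketch: it assumes the kernel line format shown above, and the names NFS_LINE and nfs_stalls are made up for illustration.

```python
import re

# Matches the kernel uptime stamp, server address, and state in lines like
# "... kernel: [2167206.713881] nfs: server 67.58.51.149 not responding, still trying"
NFS_LINE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\] nfs: server (?P<server>\S+) "
    r"(?P<state>not responding|OK)"
)


def nfs_stalls(lines):
    """Return (server, start, end, seconds) for each stall window.

    A window opens at the first "not responding" for a server and closes
    at the first "OK" that follows; repeats of either message are ignored.
    """
    open_since = {}  # server -> uptime of the first "not responding"
    stalls = []
    for line in lines:
        m = NFS_LINE.search(line)
        if not m:
            continue  # MARK lines and unrelated messages
        server, t = m["server"], float(m["uptime"])
        if m["state"] == "not responding":
            open_since.setdefault(server, t)
        elif server in open_since:
            start = open_since.pop(server)
            stalls.append((server, start, t, t - start))
    return stalls
```

Called as nfs_stalls(open("/var/log/syslog")) (the path is an assumption about where syslog lives on the client), it should report a single window of roughly 45 seconds for the excerpt above.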

Regarding the Olio run log, it's weird.
Usually it says nothing. Sometimes it gives a bunch of

UIDriverAgent[0].752.do <op> : chunked stream ended unexpectedly
Note: Error not counted in result.
Either transaction start or end time is not within steady state.

and also

UIDriverAgent[0].635.do <op> : connect timed out
Note: Error not counted in result.
Either transaction start or end time is not within steady state.

where <op> is one of several different operations.

But it does not log this every time...
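
Since the errors only show up in some runs, a tally per operation and message might make runs easier to compare. A minimal sketch, assuming the run log lines look like the two examples above; ERROR_LINE and tally_errors are hypothetical names, not part of Faban.

```python
import re
from collections import Counter

# Assumed line shape: "UIDriverAgent[<agent>].<thread>.<operation> : <message>"
ERROR_LINE = re.compile(
    r"^UIDriverAgent\[(?P<agent>\d+)\]\.(?P<thread>\d+)\."
    r"(?P<op>.*?)\s*:\s*(?P<msg>.*)$"
)


def tally_errors(lines):
    """Count driver errors grouped by (operation, message)."""
    counts = Counter()
    for line in lines:
        m = ERROR_LINE.match(line.strip())
        if m:
            counts[(m["op"], m["msg"])] += 1
    return counts
```

The "Note:" and "Either transaction..." lines do not match the pattern, so they are skipped. Printing tally_errors(...).most_common() for each run would show whether the timeouts cluster on particular operations.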

Is there a way to monitor NFS internals (buffering, throughput, etc.) on the fly?
I am using sar, but it does not give much helpful info.
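
One thing that might be watchable on the fly is the client-side RPC retransmission count. The sketch below assumes a Linux client that exposes its counters in /proc/net/rpc/nfs, with an "rpc" line holding calls, retransmissions, and auth refreshes in that order; the function names are made up for illustration.

```python
import time

RPC_STATS = "/proc/net/rpc/nfs"  # assumed location of the Linux NFS client counters


def parse_rpc(text):
    """Return (calls, retransmissions) from the 'rpc' line of the stats text."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "rpc":
            return int(fields[1]), int(fields[2])
    raise ValueError("no 'rpc' line found")


def delta(before, after):
    """Calls and retransmissions between two samples, plus the retrans ratio."""
    calls = after[0] - before[0]
    retrans = after[1] - before[1]
    ratio = retrans / calls if calls else 0.0
    return calls, retrans, ratio


def watch(interval=5.0):
    """Print per-interval RPC call and retransmission counts until interrupted."""
    with open(RPC_STATS) as f:
        prev = parse_rpc(f.read())
    while True:
        time.sleep(interval)
        with open(RPC_STATS) as f:
            cur = parse_rpc(f.read())
        calls, retrans, ratio = delta(prev, cur)
        print(f"calls={calls} retrans={retrans} ratio={ratio:.4f}")
        prev = cur
```

Running watch() on the web server during a run would show whether retransmissions jump at the same moments the "not responding" messages appear. It only covers RPC-level counts, not buffering.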

-------------------------------------------------------------------
Kontorinis Vasileios
Phd student, University of California San Diego
http://cseweb.ucsd.edu/~vkontori/
bkontorinis@gmail.com
-------------------------------------------------------------------


2010/5/24 Shanti Subramanyam <shanti.subramanyam@gmail.com>
Which version of NFS are you using? I suggest you try v3; it is more efficient. We've run into issues with v4 causing unacceptable response times.
For the kind of drop you are seeing, I am surprised that you don't find any errors anywhere. In the first 400-user run, for example, it looks like all the processes either exited or stalled. In either case, I would expect to see errors in the Faban run log (either that the driver got an error or that it timed out). Are you sure you checked the Faban log?

Have you tried running nfsstat to see if you can spot anything?

Shanti


On Sun, May 23, 2010 at 7:07 PM, Vasileios Kontorinis <bkontorinis@gmail.com> wrote:
Shanti hi again,
    I sort of managed to fix that. I tried upgrading my PHP version to 5.2.6 and the alert went away. My problems, though, are not fixed.
I even tried completely removing the Suhosin patch (it was a huge pain on Ubuntu, since you need to recompile the PHP module yourself).
Still, my problems are there.

Now I get no warnings and the logs are clean, but I get weird behavior. I needed to send you guys some pictures, so I created a related page at:
I have comments describing the problem at the end.

Any help would be most appreciated. I've spent so much time on it without figuring it out.
My configuration is:
    1 web server on a VM with 6 GB of memory, 4 CPUs
    1 db server on a VM with 5 GB of memory, 4 CPUs
    1 fs server on a VM with 4 GB of memory, 4 CPUs (this one just exposes the filestore over NFS)
All are on the same physical machine, a Nehalem-based server sitting in a Sun Black Box.
I got similar behavior when I exposed the filestore on the Sun Thumper.

Any help would be most appreciated. 

Thanks



2010/5/19 Shanti Subramanyam <shanti.subramanyam@gmail.com>

It's strange that multiple files seem to be complaining about it. Did you try disabling Suhosin? Are you seeing a perceptible drop in memory after reaching steady state?

shanti


On Wed, May 19, 2010 at 4:28 PM, Vasileios Kontorinis <bkontorinis@gmail.com> wrote:
Lately I get a bunch of these errors in my logs:

[Wed May 19 22:26:37 2010] [error] [client 10.17.255.250] ALERT - canary mismatch on efree() - heap overflow detected (attacker '10.17.255.250', file '/var/www/oliophp/public_html/taggedEvents.php')
[Wed May 19 22:26:37 2010] [error] [client 10.17.255.250] ALERT - canary mismatch on efree() - heap overflow detected (attacker '10.17.255.250', file '/var/www/oliophp/public_html/taggedEvents.php')
[Wed May 19 22:26:37 2010] [error] [client 10.17.255.250] ALERT - canary mismatch on efree() - heap overflow detected (attacker '10.17.255.250', file '/var/www/oliophp/public_html/users.php')
[Wed May 19 22:26:37 2010] [error] [client 10.17.255.250] ALERT - canary mismatch on efree() - heap overflow detected (attacker '10.17.255.250', file '/var/www/oliophp/public_html/events.php')
[Wed May 19 22:26:37 2010] [error] [client 10.17.255.250] ALERT - canary mismatch on efree() - heap overflow detected (attacker '10.17.255.250', file '/var/www/oliophp/public_html/taggedEvents.php')

According to blogs it is a PHP-related issue: the Suhosin patch detects a heap overflow and complains.
I was just wondering whether the Olio PHP code has any known memory leaks.

My PHP version on Ubuntu:
PHP 5.2.4-2ubuntu5 with Suhosin-Patch 0.9.6.2 (cli) (built: Feb 27 2008 20:46:51)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

It's too bad that I do not get a line number in the PHP files that cause this.


Has anyone come across this one before?


-------------------------------------------------------------------
Kontorinis Vasileios
Phd student, University of California San Diego
San Diego, CA 92122
Cell. phone: (858) 717 6899
bkontorinis@gmail.com, vkontori@ucsd.edu
-------------------------------------------------------------------