httpd-dev mailing list archives

From Tony Finch <...@dotat.at>
Subject Re: filehandle caching and nfs
Date Fri, 03 May 2002 00:42:15 GMT
On Thu, May 02, 2002 at 04:49:45PM -0600, David Bishop wrote:
> 
> I have a problem with our apache webserver (v. 1.3.14), running on solaris 7.
> A lot of our directories are auto-nfs mounted (esp. the ~username stuff).
> 95% of the time it works great, however, intermittently, it will return "no
> such file or directory", for files that were there 5 minutes before, and
> still are there if you rsh to the webserver and look at the filesystem.  And,
> most importantly, refreshing the page "fixes" it (i.e., returns the correct
> page).
> 
> My theory for what's happening is that the automounter daemon unmount the fs
> after 5 minutes of inactivity, but that apache "caches" the filehandle that
> it used the last time.  Then, when you go to hit that page again, it looks at
> "/amd/u2pesfs2/blah/foo", rather than "/u/blah/foo", which (obviously)
> doesn't work as the /amd/... has been unmounted, and isn't automatically
> remounted just by referencing it (as opposed to /u/...).  It then returns a
> 404, and flushed the fh out of it's cache. Thus, the next time you request
> the page, it goes for the /u/blah/foo, the fs is automounted again, and
> everything is fine.

I have seen this problem before on a Solaris system that was doing large
scale virtual hosting, and the mapping from virtual hosts to physical
directories was done via the automounter (rather than via symlinks as one
would do on other unices). The problem is nothing to do with Apache itself,
but Apache exposes a performance problem in the Solaris automounter.
The problem became more noticeable as the load on the system increased.

In the typical case path lookups that go through an automounted directory
hit a cache in the kernel, and it happens that this cache lookup is faster
- O(1) - than a UFS directory lookup - O(N) - for large directories,
hence using it for large-scale vhosting. When the cache lookup fails,
an upcall is made to the userland automountd, which does a lookup in the
automount tables -- which in my case were large text files, and in your
case may be the password file or NIS maps. The upcall to the automountd
is expensive, and the automountd only handles one upcall at a time.

The problem is that the kernel does not do negative caching of automount
lookups, so if you repeatedly request a missing file in an automounted
directory you can overload the automountd. In this situation, you don't
block waiting for the automountd, you get an error return. I can't
remember exactly the errno value, but the result was a "403 Forbidden"
from Apache. This might have changed in more recent versions of Solaris --
we were running 2.6, and ISTR reporting the problem to Sun but before
they got anywhere near a fix (which took far longer than we could wait)
I had an adequate work-around so I didn't pursue a proper solution. They
might have just changed the errno value to ENOENT...
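
To make the missing negative cache concrete, here is a toy model in Python.
It is only a sketch of the behaviour described above, not Solaris code: the
class, the map contents and the request count are all made up for
illustration. It counts how many upcalls reach the (simulated) automountd
when every request probes a key that is not in the map.

```python
# Toy model only: names and structure are illustrative, not Solaris internals.

class AutomountCache:
    """Kernel-side lookup cache sitting in front of a slow userland daemon."""

    def __init__(self, automount_map, negative_caching=False):
        self.automount_map = automount_map   # key -> mount target (hypothetical data)
        self.negative_caching = negative_caching
        self.cache = {}                      # key -> target, or None for a cached miss
        self.upcalls = 0                     # how often the "daemon" was consulted

    def _upcall(self, name):
        # Stands in for the expensive, one-at-a-time request to automountd.
        self.upcalls += 1
        return self.automount_map.get(name)

    def lookup(self, name):
        if name in self.cache:
            return self.cache[name]
        target = self._upcall(name)
        # Hits are always cached; misses only if negative caching is on.
        if target is not None or self.negative_caching:
            self.cache[name] = target
        return target


def simulate(negative_caching, requests=1000):
    """Return the number of upcalls made while serving `requests` requests."""
    cache = AutomountCache({"alice": "server:/export/alice"}, negative_caching)
    for _ in range(requests):
        cache.lookup(".htaccess")   # missing key, probed on every request
        cache.lookup("alice")       # real entry, cached after first use
    return cache.upcalls


if __name__ == "__main__":
    print("upcalls without negative caching:", simulate(False))
    print("upcalls with negative caching:   ", simulate(True))
```

Without negative caching the upcall count grows with the number of
requests; with it, each key costs one upcall no matter how often it is
requested.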

The feature of Apache that caused this overloading of the automountd
is that it looked for .htaccess in the automounted directory on every
request, which caused a cache miss and a consequent upcall to the
automountd on every request. You can spot that this is happening if
the automountd is using an implausible proportion of the CPU. In our
case the automount maps were text files which were loaded and parsed
for every automountd request, which is what caused the unnatural CPU
usage and time delay on automountd requests; if you are using a different
source for automount maps (particularly NIS) then your failure mode will
have different details.

The solution is either to alter your Apache configuration so that it
doesn't look for .htaccess files in the automounted directory, e.g.
(I think -- try it and see) instead of

	<Directory /home/>
		AllowOverride all
	</Directory>

(which looks for /home/.htaccess on every request) do

	<Directory /home/*/>
		AllowOverride all
	</Directory>
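
The difference between the two forms can be sketched in Python. This is a
deliberately simplified model of Apache's directory walk, not its actual
code: it assumes overrides are off everywhere above the first directory
that matches the <Directory> pattern and on from there downwards, and that
"*" matches exactly one path component. The function names and the example
path are hypothetical.

```python
def matches(pattern, directory):
    """Component-wise match where '*' stands for exactly one path component."""
    p = [c for c in pattern.split("/") if c]
    d = [c for c in directory.split("/") if c]
    return len(p) == len(d) and all(pc == "*" or pc == dc for pc, dc in zip(p, d))


def htaccess_probes(file_path, override_pattern):
    """List the .htaccess paths probed for file_path in this simplified model."""
    parts = [c for c in file_path.split("/") if c][:-1]   # drop the file name
    probes = []
    enabled = False
    for depth in range(1, len(parts) + 1):
        directory = "/" + "/".join(parts[:depth])
        if matches(override_pattern, directory):
            enabled = True        # overrides apply from here downwards
        if enabled:
            probes.append(directory + "/.htaccess")
    return probes


if __name__ == "__main__":
    path = "/home/alice/public_html/index.html"   # hypothetical request target
    for pattern in ("/home/", "/home/*/"):
        print(pattern, "->", htaccess_probes(path, pattern))
```

In this model only the first pattern produces a probe for /home/.htaccess,
which is the lookup that falls through to the automountd.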

Alternatively (if you have the right kind of automounter configuration)
add a bogus .htaccess entry in your automount maps which causes the
kernel cache lookup to succeed with a reference to an empty file or
something equally innocuous.

Pointing truss (the Solaris counterpart of strace) at the automountd will
tell you if my guess is right. If it isn't, at least I got to tell an
amusing war story.

Tony.
-- 
f.a.n.finch <dot@dotat.at> http://dotat.at/
BAILEY: VARIABLE BECOMING SOUTHEASTERLY 3 OR 4, INCREASING 5 OR 6 IN WEST
LATER. SHOWERS. GOOD.
