httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Allen Pulsifer" <pulsif...@comcast.net>
Subject RE: httpd attempts to open file.html/.htaccess (is this a bug?)
Date Tue, 12 Jun 2007 17:29:08 GMT
> > When processing a "GET /.../file.html", Apache httpd briefly treats 
> > file.html as a directory and attempts to open 
> > "docroot/.../file.html/.htaccess".  The os returns ENOTDIR, 
> and then 
> > processing of the request continues.

> Yes, this is a somewhat known issue.  Previously it caused 
> issues with earlier versions of reiserfs4: 
> http://issues.apache.org/bugzilla/show_bug.cgi?id=31126
> 
> Rici explains more details here: 
> http://marc.info/?l=apache-httpd-dev&m=109470495819687&w=4


Hello Paul and Dev List,

Thanks for the reply.  I checked out the links and did some code tracing
with the debugger.  As one of the links pointed out, the problem is in the
block of code attached below from ap_directory_walk() in server/request.c

This block of code is contained in the directory walk that looks for sym
links and .htaccess files.  It is executed immediately after appending the
next path segment, which is either a subdirectory or the file name.

The "if" test at the top of the block attempts to optimize by skipping the
statements the follow it.  The comment on the "if" test states:

  * If...we knew r->filename was a file, and
  * if...we have strict (case-sensitive) filenames, or
  *      we know the canonical_filename matches to _this_ name, and
  * if...we have allowed symlinks
  * skip the lstat and dummy up an APR_DIR value for thisinfo.

The first problem with the "if" test is that it doesn't recognize when the
segment is actually the file name itself, and therefore the type is APR_REG
rather than APR_DIR.  This could easily be fixed, but there may be a few
other problems.

First, it should be mentioned that the optimization can be removed and then
httpd will behave correctly: it will not do a spurious access on
file.html/.htaccess.  However, when the optimization is removed, it will
then do a stat on each component in the file path, when it might not need
to.

Let's first look at the lines of code that follows the "if" optimization and
look at the conditions under which they are not necessary.  Before starting
though, let's note that prior to beginning the directory walk,
ap_directory_walk() does a stat on the full file name, using the
APR_FINFO_MIN parameter.

Later, without optimization, it would then do a stat on each component in
the path, as follows:

1. Do a stat on the path component, looking at the link info
(APR_FINFO_LINK) rather than the target info.

2. Test if stat returned an error.  Note that since the initial stat on the
full path did not return an error, the stat on the component will never
return as error (assuming the program logic is correct).  This can therefore
always be optimized out.

3. Fix up the path name if the actual component name info does not match.  A
mismatch is only possible with a file system that is not case sensitive, and
therefore can be optimized out if either (a) the file system is case
sensitive or (b) we already know they match; or (c) we don't care if they
match or not.

4. If the path is a link, run resolve_symlink().  This function will always
return success when OPT_SYM_LINKS (FollowSymLinks) is enabled.

5. If the path points at anything other than a directory, end processing.

So basically, these processing steps can be skipped whenever (1)
FollowSymLinks is enabled AND (2) the file system is case sensitive.

It seems to me that the optimization should actually read:

If (filesystem is case sensitive AND OPT_SYM_LINKS is enabled AND we did a
successful stat on the full file path) Then:

{	If (the path to test is the fill path AND full path points at a
regular file) Then: end processing

	Else: assume path to test is a dir and skip the stat
}

These are the two things I'm concerned about:

1. In the current optimization, the comment says:

  * if...we have strict (case-sensitive) filenames, or
  *      we know the canonical_filename matches to _this_ name, and

while the actual code says:

#ifdef CASE_BLIND_FILESYSTEM
                && (filename_len <= canonical_len)
#endif

At first examination, it looks the comment describes the correct
implementation, but how does the test for "filename_len <= canonical_len"
ensure that "canonical_filename matches to _this_ name".  Can anyone verify
this is correct?

2. When OPT_SYM_LINKS is enabled, resolve_symlink() does not test
OPT_SYM_OWNER, i.e., OPT_SYM_LINKS overrides OPT_SYM_OWNER.  The
optimization however insists that OPT_SYM_LINKS is set while OPT_SYM_OWNER
is unset.

Which of these two are correct?  Should resolve_symlink() always check
OPT_SYM_OWNER, even if OPT_SYM_LINKS is enabled, or should the optimization
only check OPT_SYM_LINKS?

Thanks,

Allen


---------

THE PROBLEMATIC BLOCK OF CODE

            /* First optimization;
             * If...we knew r->filename was a file, and
             * if...we have strict (case-sensitive) filenames, or
             *      we know the canonical_filename matches to _this_ name,
and
             * if...we have allowed symlinks
             * skip the lstat and dummy up an APR_DIR value for thisinfo.
             */
            if (r->finfo.filetype
#ifdef CASE_BLIND_FILESYSTEM
                && (filename_len <= canonical_len)
#endif
                && ((opts.opts & (OPT_SYM_OWNER | OPT_SYM_LINKS)) ==
OPT_SYM_LINKS))
            {

                thisinfo.filetype = APR_DIR;
                ++seg;
                continue;
            }

            /* We choose apr_stat with flag APR_FINFO_LINK here, rather that
             * plain apr_stat, so that we capture this path object rather
than
             * its target.  We will replace the info with our target's info
             * below.  We especially want the name of this 'link' object,
not
             * the name of its target, if we are fixing the filename
             * case/resolving aliases.
             */
            rv = apr_stat(&thisinfo, r->filename,
                          APR_FINFO_MIN | APR_FINFO_NAME | APR_FINFO_LINK,
                          r->pool);

            if (APR_STATUS_IS_ENOENT(rv)) {
                /* Nothing?  That could be nice.  But our directory
                 * walk is done.
                 */
                thisinfo.filetype = APR_NOFILE;
                break;
            }
            else if (APR_STATUS_IS_EACCES(rv)) {
                ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                              "access to %s denied", r->uri);
                return r->status = HTTP_FORBIDDEN;
            }
            else if ((rv != APR_SUCCESS && rv != APR_INCOMPLETE)
                     || !(thisinfo.valid & APR_FINFO_TYPE)) {
                /* If we hit ENOTDIR, we must have over-optimized, deny
                 * rather than assume not found.
                 */
                ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r,
                              "access to %s failed", r->uri);
                return r->status = HTTP_FORBIDDEN;
            }

            /* Fix up the path now if we have a name, and they don't agree
             */
            if ((thisinfo.valid & APR_FINFO_NAME)
                && strcmp(seg_name, thisinfo.name)) {
                /* TODO: provide users an option that an internal/external
                 * redirect is required here?  We need to walk the URI and
                 * filename in tandem to properly correlate these.
                 */
                strcpy(seg_name, thisinfo.name);
                filename_len = strlen(r->filename);
            }

            if (thisinfo.filetype == APR_LNK) {
                /* Is this a possibly acceptable symlink?
                 */
                if ((res = resolve_symlink(r->filename, &thisinfo,
                                           opts.opts, r->pool)) != OK) {
                    ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                                  "Symbolic link not allowed "
                                  "or link target not accessible: %s",
                                  r->filename);
                    return r->status = res;
                }
            }

            /* Ok, we are done with the link's info, test the real target
             */
            if (thisinfo.filetype == APR_REG ||
                thisinfo.filetype == APR_NOFILE) {
                /* That was fun, nothing left for us here
                 */
                break;
            }
            else if (thisinfo.filetype != APR_DIR) {
                ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r,
                              "Forbidden: %s doesn't point to "
                              "a file or directory",
                              r->filename);
                return r->status = HTTP_FORBIDDEN;
            }

            ++seg;
        } while (thisinfo.filetype == APR_DIR);
---------


Mime
View raw message