apr-dev mailing list archives

From Kevin Pilch-Bisson <ke...@pilch-bisson.net>
Subject Re: apr_realpath
Date Wed, 14 Mar 2001 13:48:39 GMT
Here are some things I got from this.

On Tue, Mar 13, 2001 at 01:40:02AM -0600, William A. Rowe, Jr. wrote:
> Here is what I've worked up thus far for Unix... n'er mind the _more_ complicated
> Win32 beast [all I could do to get the _root_ parsing finished tonight!]
> 
> So feel free to look at the server root parse function, and the unix implementation
> of apr_filepath_merge(), and let me know if I've gone off in the wrong direction.
> 
> Survey; is there any reason -not- to canonicalize a name, ever (/./ elimination, etc)?

Not that I can think of.
> 
> Is there a decent semantic to append a missing trailing slash for a directory vs. a file?

Not sure exactly what you are asking.
> 
> Should we have a char** input argument for addpath (who cares about realpath) so that
> we can actually return the user the existing path, while incrementing the pointer to the
> start of the non-existent path (a slow operation, by request only)?

Why?
> 
> Should we persist in forcing Win32 to use slashes, or should we return backslashes?

Subversion uses native path separators. I don't know what the rest of
APR/Apache do.
> 
> Last comment (not a q)... I will round out the Win32 code and add apr_pathname_segment()
> fn to pull apart a path, which is what I got out of kevin's request, and what I need
> for the Apache core.

Actually, I am looking for a function that portably and reliably gets an
absolute path from a given string.  I'll try to give you a use case for
it.  In Subversion, we have the idea of commits, where you can give a list of
targets on the command line.

kevin@pilchie:/home/kevin/repos$ svn commit file1 dir/file2 /home/kevin/repos/dir2/file3

From that command, we would like to be able to take the three file
arguments, and find out what is the deepest directory which is common to
all three targets, since that is where we have to start making changes
to our repository.  In the above example that would be
/home/kevin/repos.  We already have a function which takes a list of
(absolute) targets, finds the common part of the path, and converts the
other targets to relative paths from that base.  Now we need a portable
way to convert the targets given above into absolute paths.  The
problem we ran into is that while realpath exists on Unix, and _fullpath
exists on Win32, there is nothing (that we know of, anyway) on BeOS.
Also, realpath is known to be broken on some older versions of Solaris,
in that it merely canonicalizes a path if it is below the cwd.  For now,
we are just using realpath and _fullpath (and not putting our heads in
the sand with respect to Solaris and BeOS).
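
To make the request concrete, here is a minimal sketch of a purely lexical
fallback for Unix-style paths.  The name make_absolute and its signature are
hypothetical (this is not an APR or Subversion function), and unlike realpath
it does not resolve symlinks or check that anything exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of APR or Subversion.
 * Lexically joins `target` onto `cwd` (which must be absolute) and
 * removes empty, "." and ".." segments.  Symlinks are NOT resolved.
 * Returns 0 on success, -1 if the result does not fit in `out`. */
int make_absolute(const char *cwd, const char *target,
                  char *out, size_t outlen)
{
    const char *parts[2];
    size_t len = 0;
    int i;

    assert(cwd != NULL && target != NULL && out != NULL);
    assert(cwd[0] == '/');
    if (outlen < 2)
        return -1;

    /* An absolute target replaces the working directory entirely. */
    parts[0] = (target[0] == '/') ? "" : cwd;
    parts[1] = target;

    for (i = 0; i < 2; ++i) {
        const char *s = parts[i];
        while (*s) {
            size_t seg = 0;
            while (*s == '/')               /* collapse runs of '/' */
                ++s;
            while (s[seg] && s[seg] != '/') /* measure this segment */
                ++seg;
            if (seg == 0 || (seg == 1 && s[0] == '.')) {
                /* empty or "." segment: nothing to do */
            }
            else if (seg == 2 && s[0] == '.' && s[1] == '.') {
                /* "..": drop the last segment; parent of "/" is "/" */
                while (len > 0 && out[--len] != '/')
                    ;
            }
            else {
                /* room for '/', the segment, and the final NUL */
                if (len + 1 + seg + 1 > outlen)
                    return -1;
                out[len++] = '/';
                memcpy(out + len, s, seg);
                len += seg;
            }
            s += seg;
        }
    }
    if (len == 0)
        out[len++] = '/';
    out[len] = '\0';
    return 0;
}
```

With cwd set to /home/kevin/repos, the three targets from the svn commit
example above would all come out as absolute paths under /home/kevin/repos,
ready to hand to the existing common-prefix function.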

> 
> Bill
> 
> Index: file_io/unix/filepath.c
> ===================================================================
> RCS file: filepath.c
> diff -N filepath.c
> --- /dev/null Mon Mar 12 23:32:12 2001
> +++ filepath.c Mon Mar 12 23:32:23 2001
> @@ -0,0 +1,233 @@
> +/* ====================================================================
> + * The Apache Software License, Version 1.1
> + *
> + * Copyright (c) 2000-2001 The Apache Software Foundation.  All rights
> + * reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in
> + *    the documentation and/or other materials provided with the
> + *    distribution.
> + *
> + * 3. The end-user documentation included with the redistribution,
> + *    if any, must include the following acknowledgment:
> + *       "This product includes software developed by the
> + *        Apache Software Foundation (http://www.apache.org/)."
> + *    Alternately, this acknowledgment may appear in the software itself,
> + *    if and wherever such third-party acknowledgments normally appear.
> + *
> + * 4. The names "Apache" and "Apache Software Foundation" must
> + *    not be used to endorse or promote products derived from this
> + *    software without prior written permission. For written
> + *    permission, please contact apache@apache.org.
> + *
> + * 5. Products derived from this software may not be called "Apache",
> + *    nor may "Apache" appear in their name, without prior written
> + *    permission of the Apache Software Foundation.
> + *
> + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
> + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> + * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
> + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
> + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + * ====================================================================
> + *
> + * This software consists of voluntary contributions made by many
> + * individuals on behalf of the Apache Software Foundation.  For more
> + * information on the Apache Software Foundation, please see
> + * <http://www.apache.org/>.
> + */
> +
> +#include "apr.h"
> +#include "fileio.h"
> +#include "apr_file_io.h"
> +#include "apr_strings.h"
> +#if APR_HAVE_UNISTD_H
> +#include <unistd.h>
> +#endif
> +
> +#include <direct.h>
> +
> +
> +APR_DECLARE(apr_status_t) apr_filepath_root(const char **rootpath,
> +                                            const char **inpath, apr_pool_t *p)
> +{
> +    if (**inpath == '/') {
> +        *rootpath = apr_pstrdup(p, "/");
> +        ++*inpath;
> +        return APR_EABSOLUTE;
> +    }
> +
> +    *rootpath = apr_pstrdup(p, "");
> +    return APR_ERELATIVE;
> +}
> +
> +
> +APR_DECLARE(apr_status_t)
> +                apr_filepath_merge(char **newpath, const char *rootpath,
> +                                   const char *addpath, apr_int32_t flags,
> +                                   apr_pool_t *p)
> +{
> +    char path[APR_PATH_MAX];
> +    apr_size_t newseg, addseg, endseg, rootlen;
> +
> +    if (!rootpath) {
> +        /* Start with the current working path
> +         * XXX: Any kernel subject to goofy, uncanonical results
> +         * must test the cwd against the user's given flags.
> +         * Simplest would be a recursive call to apr_filepath_merge
> +         * with an empty (not null) rootpath and addpath of the cwd.
> +         */
> +        if (!(rootpath = getcwd(path, sizeof(path)))) {

Um, rootpath is const char *, should we be assigning values to it?
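
For reference on the const question: in C, `const char *rootpath` declares a
pointer to const characters, so the pointer itself may be reassigned and only
writes through it are rejected.  A minimal sketch, with a hypothetical
function name that does not appear in the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical example, unrelated to the APR API.
 * `name` is a pointer to const chars: the pointer can be reseated,
 * but the characters cannot be modified through it. */
size_t length_or_default(const char *name)
{
    if (!name)
        name = "default";   /* legal: only the local pointer changes */

    /* name[0] = 'x';          would not compile: the chars are const */

    assert(name != NULL);
    return strlen(name);
}
```

The caller's pointer is unaffected either way, since the parameter is a copy.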

> +            if (errno == ERANGE)
> +                return APR_ENAMETOOLONG;
> +            else
> +                return errno;
> +        }
> +        newseg = strlen(rootpath);
> +    }
> +    else {
> +        /* Accept the given rootpath as pre-tested and simply append to it */
> +        newseg = strlen(rootpath);
> +        if (newseg >= sizeof(path))
> +            return APR_ENAMETOOLONG;
> +        strcpy(path, rootpath);
> +    }
> +
> +    rootlen = newseg;
> +
> +    /* tack on a trailing '/' if we have anything to append */
> +    if (newseg && addpath[0] && path[newseg - 1] != '/') {

what if addpath is NULL?

> +        path[newseg++] = '/';
> +        path[newseg] = '\0';
> +    }
> +
> +    addseg = newseg;
> +
> +    /* strip down leading '/'s to a single leading '/',
> +     * entirely replacing the root path.
> +     */

Why not test this before bothering to get the rootpath in the first
place?
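
A minimal sketch of what I mean by the last two comments, assuming POSIX
getcwd.  The helper name choose_base is hypothetical and is not part of the
patch; it only shows the ordering of the checks, not the merge itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch.
 * Decides which base a merge would start from:
 *   - ""        if addpath is absolute (the root is replaced anyway),
 *   - rootpath  if the caller supplied one,
 *   - the cwd   (written into buf) otherwise.
 * Returns NULL if getcwd fails. */
const char *choose_base(const char *rootpath, const char *addpath,
                        char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);

    /* Treat a NULL addpath as "nothing to append". */
    if (addpath == NULL)
        addpath = "";

    /* Check addpath first: an absolute addpath makes getcwd pointless. */
    if (addpath[0] == '/')
        return "";

    if (rootpath != NULL)
        return rootpath;

    return getcwd(buf, buflen);
}
```

This way the getcwd call only happens when its result can actually end up in
the merged path.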

> +    if (addpath[0] == '/') {
> +        if (flags & APR_FILEPATH_SECUREROOTTEST)
> +            return APR_EABOVEROOT;
> +        newseg = 0;
> +        strcpy (path, "/");
> +        addseg = 1;
> +        while (addpath[0] == '/')
> +            ++addpath;
> +    } else if (path[0] != '/') {
> +        if (flags & APR_FILEPATH_NOTRELATIVE)
> +            return APR_ERELATIVE;
> +    }
> +
> +    if (path[0] == '/' && (flags & APR_FILEPATH_NOTABSOLUTE))
> +        return APR_EABSOLUTE;
> +
> +    while (*addpath) {
> +        /* Parse each segment, find the closing '/' */
> +        endseg = 0;
> +        while (addpath[endseg] && addpath[endseg] != '/')
> +            ++endseg;
> +
> +        if (endseg == 0 || (endseg == 1 && addpath[0] == '.')) {
> +            /* noop segment (/ or ./) so skip it */
> +        }
> +        else if (endseg == 2 && addpath[0] == '.' && addpath[1] == '.') {
> +            /* backpath (../) */
> +            if (addseg == 1 && path[0] == '/') {
> +                /* above root?  die if we are fixated on security */
> +                if (flags & APR_FILEPATH_SECUREROOTTEST)
> +                    return APR_EABOVEROOT;
> +                /* otherwise a noop, above root is ... root */
> +                newseg = 0;
> +            }
> +            else if (addseg == 0 || (addseg >= 3
> +                                  && strcmp(path + addseg - 3, "../") == 0)) {
> +                /* already backpathed or empty, append a backpath unless
> +                 * the user insisted never to continue above the root.
> +                 */
> +                if (flags & APR_FILEPATH_SECUREROOTTEST)
> +                    return APR_EABOVEROOT;
> +                if (addseg + 3 >= sizeof(path))
> +                    return APR_ENAMETOOLONG;
> +                strcpy(path + addseg, "../");
> +                addseg += 3;
> +            }
> +            else {
> +                /* otherwise crop prior segment */
> +                do {
> +                    --addseg;
> +                } while (addseg && path[addseg - 1] != '/');
> +                path[addseg] = '\0';
> +            }
> +
> +            /* Now test if we are above where we started */
> +            if (addseg < newseg) {
> +                if (flags & APR_FILEPATH_SECUREROOTTEST)
> +                    return APR_EABOVEROOT;
> +                newseg = addseg;
> +            }
> +        }
> +        else {
> +            /* actual segment, append */
> +            apr_size_t i = (addpath[endseg] != '\0');
> +            if (addseg + endseg + i >= sizeof(path))
> +                return APR_ENAMETOOLONG;
> +            strncpy(path + addseg, addpath, endseg + i);
> +            addseg += endseg + i;
> +        }
> +
> +        /* skip over trailing slash */
> +        if (addpath[endseg])
> +            ++endseg;
> +
> +        addpath += endseg;
> +    }
> +
> +    if ((flags & APR_FILEPATH_NOTABOVEROOT) && newseg < rootlen) {
> +        /* If newseg moved back, the end result string must be tested
> +         * that it is still within the root.  Only the _SECUREROOT
> +         * option prohibits the user from 'testing' the absolute
> +         * location of a file by backing over and readding it.
> +         */
> +        if (strncmp(rootpath, path, rootlen))
> +            return APR_EABOVEROOT;
> +        if (rootpath[rootlen - 1] != '/'
> +                && path[rootlen] && path[rootlen] != '/')
> +            return APR_EABOVEROOT;
> +    }
> +
> +#if 0
> +    /* Just an idea - don't know where it's headed yet */
> +    if (addpath && addpath[endseg - 1] != '/'
> +                && (flags & APR_FILEPATH_TRUECASE)) {
> +        apr_finfo_t finfo;
> +        if (apr_stat(&finfo, path, APR_FINFO_TYPE, p) == APR_SUCCESS) {
> +            if (addpath[endseg - 1] != finfo.filetype == APR_DIR) {
> +                if (endseg + 1 >= sizeof(path))
> +                    return APR_ENAMETOOLONG;
> +                path[endseg++] = '/';
> +                path[endseg] = '\0';
> +            }
> +        }
> +    }
> +#endif
> +
> +    /* Whew... */
> +    *newpath = apr_pstrdup(p, path);
> +    return (*newpath ? APR_SUCCESS : APR_ENOMEM);
> +}
> Index: file_io/win32/filepath.c
> ===================================================================
> RCS file: filepath.c
> diff -N filepath.c
> --- /dev/null Mon Mar 12 23:32:12 2001
> +++ filepath.c Mon Mar 12 23:32:25 2001
> @@ -0,0 +1,357 @@
> +/* ====================================================================
> + * The Apache Software License, Version 1.1
> + *
> + * Copyright (c) 2000-2001 The Apache Software Foundation.  All rights
> + * reserved.
> + *
> + * Redistribution and use in source and binary forms, with or without
> + * modification, are permitted provided that the following conditions
> + * are met:
> + *
> + * 1. Redistributions of source code must retain the above copyright
> + *    notice, this list of conditions and the following disclaimer.
> + *
> + * 2. Redistributions in binary form must reproduce the above copyright
> + *    notice, this list of conditions and the following disclaimer in
> + *    the documentation and/or other materials provided with the
> + *    distribution.
> + *
> + * 3. The end-user documentation included with the redistribution,
> + *    if any, must include the following acknowledgment:
> + *       "This product includes software developed by the
> + *        Apache Software Foundation (http://www.apache.org/)."
> + *    Alternately, this acknowledgment may appear in the software itself,
> + *    if and wherever such third-party acknowledgments normally appear.
> + *
> + * 4. The names "Apache" and "Apache Software Foundation" must
> + *    not be used to endorse or promote products derived from this
> + *    software without prior written permission. For written
> + *    permission, please contact apache@apache.org.
> + *
> + * 5. Products derived from this software may not be called "Apache",
> + *    nor may "Apache" appear in their name, without prior written
> + *    permission of the Apache Software Foundation.
> + *
> + * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
> + * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
> + * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> + * DISCLAIMED.  IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR
> + * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
> + * USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
> + * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> + * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
> + * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
> + * SUCH DAMAGE.
> + * ====================================================================
> + *
> + * This software consists of voluntary contributions made by many
> + * individuals on behalf of the Apache Software Foundation.  For more
> + * information on the Apache Software Foundation, please see
> + * <http://www.apache.org/>.
> + */
> +
> +#include "apr.h"
> +#include "fileio.h"
> +#include "apr_file_io.h"
> +#include "apr_strings.h"
> +
> +
> +/* Win32 Exceptions:
> + *
> + * Note that trailing spaces and trailing periods are never recorded
> + * in the file system, except by a very obscure bug where any file
> + * that is created with a trailing space or period, followed by the
> + * ':' stream designator on an NTFS volume can never be accessed again.
> + * In other words, don't ever accept them if designating a stream!
> + *
> + * An interesting side effect is that two or three periods are both
> + * treated as the parent directory, although the fourth and on are
> + * not [strongly suggest all trailing periods are trimmed off, or
> + * down to two if there are no other characters.]
> + *
> + * Leading spaces and periods are accepted, however.
> + */
> +static int is_fnchar(char ch)
> +{
> +    /* No control code between 0 and 31 is allowed
> +     * The * < > ? codes all have wildcard effects
> +     * The " / \ : are exclusively separator tokens
> +     * The system doesn't accept | for any purpose.
> +     * Oddly, \x7f is acceptable.
> +     */
> +    if (ch >= 0 && ch < 32)
> +        return 0;
> +
> +    if (ch == '\"' || ch ==  '*' || ch == '/'
> +     || ch ==  ':' || ch ==  '<' || ch == '>'
> +     || ch ==  '?' || ch == '\\' || ch == '|')
> +        return 0;
> +
> +    return 1;
> +}
> +
> +
> +APR_DECLARE(apr_status_t) apr_filepath_root(const char **rootpath,
> +                                            const char **inpath, apr_pool_t *p)
> +{
> +    if ((*inpath)[0] == '/' || (*inpath)[0] == '\\') {
> +        if ((*inpath)[1] == '/' || (*inpath)[1] == '\\') {

What if you get /\blah, or \/blah?
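
One way to make the mixed cases fall out naturally is to test every leading
character against both separators.  This is only a sketch with hypothetical
helper names (is_sep, leading_seps); it says nothing about what Win32 itself
does with such paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers, not part of the patch. */

/* True for either separator character. */
static int is_sep(char ch)
{
    return ch == '/' || ch == '\\';
}

/* Count the leading separators of `path`, in any mixture. */
size_t leading_seps(const char *path)
{
    size_t n = 0;

    assert(path != NULL);
    while (is_sep(path[n]))
        ++n;
    return n;
}
```

Here "/\blah" and "\/blah" both report two leading separators, the same as
"//blah" and "\\blah", so the root parser would see all four the same way.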


All in all, I don't want to review this, since I don't know enough about
win32 path semantics.

[snip rest of win32 code]
> Index: include/apr_errno.h
> ===================================================================
> RCS file: /home/cvs/apr/include/apr_errno.h,v
> retrieving revision 1.54
> diff -u -r1.54 apr_errno.h
> --- include/apr_errno.h 2001/02/16 04:15:42 1.54
> +++ include/apr_errno.h 2001/03/13 07:32:36
> @@ -205,6 +205,10 @@
>   *                    platform, either because nobody has gotten to it yet,
>   *                    or the function is impossible on this platform.
>   * APR_EMISMATCH      Two passwords do not match.
> + * APR_EABSOLUTE      The given path was absolute.
> + * APR_ERELATIVE      The given path was relative.
> + * APR_EINCOMPLETE    The given path was neither relative nor absolute.
> + * APR_EABOVEROOT     The given path was above the root path.
>   * </PRE>
>   *
>   * @param status The APR_status code to check.
> @@ -236,6 +240,10 @@
>  /* empty slot: +17 */
>  /* empty slot: +18 */
>  #define APR_EDSOOPEN       (APR_OS_START_ERROR + 19)
> +#define APR_EABSOLUTE      (APR_OS_START_ERROR + 20)
> +#define APR_ERELATIVE      (APR_OS_START_ERROR + 21)
> +#define APR_EINCOMPLETE    (APR_OS_START_ERROR + 22)
> +#define APR_EABOVEROOT     (APR_OS_START_ERROR + 23)
> 
> 
>  /* APR ERROR VALUE TESTS */
> @@ -258,6 +266,10 @@
>  /* empty slot: +17 */
>  /* empty slot: +18 */
>  #define APR_STATUS_IS_EDSOOPEN(s)       ((s) == APR_EDSOOPEN)
> +#define APR_STATUS_IS_EABSOLUTE(s)      ((s) == APR_EABSOLUTE)
> +#define APR_STATUS_IS_ERELATIVE(s)      ((s) == APR_ERELATIVE)
> +#define APR_STATUS_IS_EINCOMPLETE(s)    ((s) == APR_EINCOMPLETE)
> +#define APR_STATUS_IS_EABOVEROOT(s)     ((s) == APR_EABOVEROOT)
> 
> 
>  /* APR STATUS VALUES */
> Index: include/apr_file_info.h
> ===================================================================
> RCS file: /home/cvs/apr/include/apr_file_info.h,v
> retrieving revision 1.13
> diff -u -r1.13 apr_file_info.h
> --- include/apr_file_info.h 2001/02/16 04:15:43 1.13
> +++ include/apr_file_info.h 2001/03/13 07:32:36
> @@ -246,7 +246,9 @@
> 
>  /**
>   * Read the next entry from the specified directory.
> - * @param thedir the directory descriptor to read from, and fill out.
> + * @param finfo the file info structure to be filled in by apr_dir_read
> + * @param wanted The desired apr_finfo_t fields, as a bit flag of APR_FINFO_ values
> + * @param thedir the directory descriptor returned from apr_dir_open
>   * @tip All systems return . and .. as the first two files.
>   * @deffunc apr_status_t apr_dir_read(apr_finfo_t *finfo, apr_int32_t wanted, apr_dir_t *thedir)
>   */
> @@ -260,6 +262,49 @@
>   */
>  APR_DECLARE(apr_status_t) apr_dir_rewind(apr_dir_t *thedir);
> 
> +/* apr_filepath flags
> + */
> +
> +/* Cause apr_filepath_merge to fail if addpath is above rootpath */
> +#define APR_FILEPATH_NOTABOVEROOT   0x01
> +
> +/* internal: Only meaningful with APR_FILEPATH_NOTABOVEROOT */
> +#define APR_FILEPATH_SECUREROOTTEST 0x02
> +
> +/* Cause apr_filepath_merge to fail if addpath is above rootpath,
> + * even given a rootpath /foo/bar and an addpath ../bar/bash
> + */
> +#define APR_FILEPATH_SECUREROOT     0x03
> +
> +/* Fail apr_filepath_merge if the merged path is relative */
> +#define APR_FILEPATH_NOTRELATIVE    0x04
> +
> +/* Fail apr_filepath_merge if the merged path is absolute */
> +#define APR_FILEPATH_NOTABSOLUTE    0x08
> +
> +/* Cleans all ambiguous /./ or // segments
> + * if the target is a directory */
> +#define APR_FILEPATH_CANONICAL      0x10
> +
> +/* Resolve the true case of existing directories and file elements
> + * of addpath, and append a proper trailing slash if a directory
> + */
> +#define APR_FILEPATH_TRUECASE       0x20
> +
> +
> +/**
> + * Merge additional file path onto the previously processed rootpath
> + * @param newpath the merged paths returned
> + * @param rootpath the root file path (NULL uses the current working path)
> + * @param addpath the path to add to the root path
> + * @param flags the desired APR_FILEPATH_ rules to apply when merging
> + * @param p the pool to allocate the new path string from
> + * @deffunc apr_status_t apr_filepath_merge(char **newpath, const char *rootpath, const char *addpath, apr_int32_t flags, apr_pool_t *p)
> + */
> +APR_DECLARE(apr_status_t)
> +                apr_filepath_merge(char **newpath, const char *rootpath,
> +                                   const char *addpath, apr_int32_t flags,
> +                                   apr_pool_t *p);
> 
>  #ifdef __cplusplus
>  }
> 
> 
> 
> 
> 

Have fun with this :)

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson                    http://www.pilch-bisson.net
     "Historically speaking, the presences of wheels in Unix
     has never precluded their reinvention." - Larry Wall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
