incubator-heraldry-commits mailing list archives

From ket...@apache.org
Subject svn commit: r463033 - in /incubator/heraldry/libraries/python/urljr/trunk: ./ admin/ doc/ urljr/ urljr/test/
Date Wed, 11 Oct 2006 23:08:08 GMT
Author: keturn
Date: Wed Oct 11 16:08:07 2006
New Revision: 463033

URL: http://svn.apache.org/viewvc?view=rev&rev=463033
Log:
Initial import of Python urljr libraries from JanRain.
(urljr was a support package for OpenID and Yadis in 1.x versions 
of the library.)

Previous development history for this library may be found in the 
darcs repository at
http://www.openidenabled.com/resources/repos/python/urljr/

Added:
    incubator/heraldry/libraries/python/urljr/trunk/COPYING
    incubator/heraldry/libraries/python/urljr/trunk/MANIFEST.in
    incubator/heraldry/libraries/python/urljr/trunk/README
    incubator/heraldry/libraries/python/urljr/trunk/admin/
    incubator/heraldry/libraries/python/urljr/trunk/admin/makedist   (with props)
    incubator/heraldry/libraries/python/urljr/trunk/admin/runtests   (with props)
    incubator/heraldry/libraries/python/urljr/trunk/admin/setversion   (with props)
    incubator/heraldry/libraries/python/urljr/trunk/doc/
    incubator/heraldry/libraries/python/urljr/trunk/setup.cfg
    incubator/heraldry/libraries/python/urljr/trunk/setup.py   (with props)
    incubator/heraldry/libraries/python/urljr/trunk/urljr/
    incubator/heraldry/libraries/python/urljr/trunk/urljr/__init__.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/fetchers.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/__init__.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/helper.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_all.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_fetchers.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_urinorm.py
    incubator/heraldry/libraries/python/urljr/trunk/urljr/test/urinorm.txt
    incubator/heraldry/libraries/python/urljr/trunk/urljr/urinorm.py

Added: incubator/heraldry/libraries/python/urljr/trunk/COPYING
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/COPYING?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/COPYING (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/COPYING Wed Oct 11 16:08:07 2006
@@ -0,0 +1,20 @@
+urljr - URL related utilities
+
+Copyright (C) 2006 JanRain, Inc.
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
+
+More info about Python OpenID:
+dev@lists.openidenabled.com

Added: incubator/heraldry/libraries/python/urljr/trunk/MANIFEST.in
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/MANIFEST.in?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/MANIFEST.in (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/MANIFEST.in Wed Oct 11 16:08:07 2006
@@ -0,0 +1,3 @@
+graft admin
+graft urljr/test
+include COPYING

Added: incubator/heraldry/libraries/python/urljr/trunk/README
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/README?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/README (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/README Wed Oct 11 16:08:07 2006
@@ -0,0 +1,24 @@
+urljr
+-----
+
+URL-related utilities from JanRain, Inc.
+
+This package contains the "fetchers" module, which provides a common
+interface to urllib2 and curl for making HTTP requests.
+
+Dependencies:
+
+ * python (version 2.2 or better)
+
+
+Recommended:
+
+ * PycURL - http://pycurl.sourceforge.net/
+   
+   PycURL offers more protection when fetching URLs from untrusted sources,
+   e.g. checking the server's SSL certificate, limiting the number of
+   redirects it will follow, etc.
+
+
+Please send us your questions, comments, patches, and bug reports at
+dev@lists.openidenabled.com.

Added: incubator/heraldry/libraries/python/urljr/trunk/admin/makedist
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/admin/makedist?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/admin/makedist (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/admin/makedist Wed Oct 11 16:08:07 2006
@@ -0,0 +1,40 @@
+#!/bin/bash
+
+set -e
+
+if [ -z "$1" ] ; then
+   echo Usage: $0 version
+   exit 1
+fi
+
+DIST_VERSION=$1
+ADMINDIR=$(dirname $0)
+YDIR=${ADMINDIR}/..
+QUIET="-q"
+
+PYTHON=python
+
+WORKDIR=$(mktemp -t -d python-urljr.XXXXXX) || exit 1
+
+darcs_export () {
+    DARCS_DIR=$1
+    TARGET_DIR=$2
+    # We can't darcs get straight into the mktemp dir, because the mktemp
+    # dir already exists and "darcs get" kindly does a sanity check to make
+    # sure it doesn't clobber an already existing directory.
+    TMPREPO=$2/foo.$$
+    darcs get $DARCS_DIR ${TMPREPO}
+    rm -r ${TMPREPO}/_darcs
+    mv --target-directory=${TARGET_DIR} ${TMPREPO}/*
+    rmdir $TMPREPO
+}
+
+darcs_export ${YDIR} ${WORKDIR}
+pushd $WORKDIR
+admin/setversion ${DIST_VERSION}
+admin/runtests || exit 1
+$PYTHON setup.py ${QUIET} sdist
+popd
+
+cp -v --interactive ${WORKDIR}/dist/* .
+rm -rf $WORKDIR

Propchange: incubator/heraldry/libraries/python/urljr/trunk/admin/makedist
------------------------------------------------------------------------------
    svn:executable = *

Added: incubator/heraldry/libraries/python/urljr/trunk/admin/runtests
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/admin/runtests?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/admin/runtests (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/admin/runtests Wed Oct 11 16:08:07 2006
@@ -0,0 +1,26 @@
+#!/usr/bin/env python
+
+import sys
+import os.path
+
+def fixpath():
+    try:
+        this_file = __file__
+    except NameError:
+        this_file = sys.argv[0]
+
+    this_dir = os.path.dirname(
+        os.path.realpath(
+        os.path.normpath(this_file)
+        ))
+
+    repo_root_dir = os.path.dirname(this_dir)
+
+    if repo_root_dir not in sys.path:
+        print "putting %s in sys.path" % (repo_root_dir,)
+        sys.path.insert(0, repo_root_dir)
+
+if __name__ == '__main__':
+    fixpath()
+    from urljr.test import helper, test_all
+    helper.runAsMain(test_all)

Propchange: incubator/heraldry/libraries/python/urljr/trunk/admin/runtests
------------------------------------------------------------------------------
    svn:executable = *

Added: incubator/heraldry/libraries/python/urljr/trunk/admin/setversion
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/admin/setversion?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/admin/setversion (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/admin/setversion Wed Oct 11 16:08:07 2006
@@ -0,0 +1,7 @@
+#!/usr/bin/env bash
+
+cat <<EOF | \
+    xargs sed -i 's/\[library version:[^]]*\]/[library version:'"$1"']/'
+setup.py
+urljr/__init__.py
+EOF

Propchange: incubator/heraldry/libraries/python/urljr/trunk/admin/setversion
------------------------------------------------------------------------------
    svn:executable = *
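
The setversion script above uses sed to rewrite the "[library version:...]" marker in setup.py and urljr/__init__.py. As a rough illustration of that substitution, here is a minimal Python sketch. The name set_version_marker is hypothetical and not part of this commit, and the pattern escapes the closing bracket where the sed expression relies on its position in the set.

```python
import re

# Matches "[library version:" followed by anything up to the closing bracket,
# like the sed pattern in admin/setversion.
_MARKER = re.compile(r"\[library version:[^\]]*\]")


def set_version_marker(text, version):
    """Return text with every version marker rewritten to `version`.

    Hypothetical helper for illustration only; not part of the commit.
    """
    # A function replacement keeps backslashes in `version` from being
    # interpreted as escape sequences by re.sub.
    return _MARKER.sub(lambda match: "[library version:%s]" % version, text)


line = "version = '[library version:1.0.1]'[17:-1]"
print(set_version_marker(line, "1.0.2"))
```

Because the marker keeps its brackets after substitution, the script can be run again for the next release without touching the surrounding code.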

Added: incubator/heraldry/libraries/python/urljr/trunk/setup.cfg
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/setup.cfg?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/setup.cfg (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/setup.cfg Wed Oct 11 16:08:07 2006
@@ -0,0 +1,3 @@
+[sdist]
+force_manifest=1
+formats=gztar,zip

Added: incubator/heraldry/libraries/python/urljr/trunk/setup.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/setup.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/setup.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/setup.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,41 @@
+#!/usr/bin/env python
+
+from distutils.core import setup
+import sys
+
+# patch distutils if it can't cope with the "classifiers" or
+# "download_url" keywords
+if sys.version < '2.2.3':
+    from distutils.dist import DistributionMetadata
+    DistributionMetadata.classifiers = None
+    DistributionMetadata.download_url = None
+
+version = '[library version:1.0.1]'[17:-1]
+
+kwargs = {
+    'name': "python-urljr",
+    'version': version,
+    'url': "http://www.openidenabled.com/openid/libraries/python/",
+    'download_url': "http://www.openidenabled.com/resources/downloads/python-openid/python-urljr-%s.tar.gz" % (version,),
+    'author': "JanRain, Inc.",
+    'author_email': "openid@janrain.com",
+    'description': "JanRain's URL Utilities",
+    # FIXME: 'long_description': "",
+    'packages': ['urljr',
+                 ],
+    'license': "LGPL",
+    'classifiers': [
+    "Development Status :: 5 - Production/Stable",
+    "Environment :: Web Environment",
+    "Intended Audience :: Developers",
+    "License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)",
+    "Operating System :: POSIX",
+    "Programming Language :: Python",
+    "Topic :: Internet :: WWW/HTTP",
+    "Topic :: Internet :: WWW/HTTP :: Dynamic Content :: CGI Tools/Libraries",
+    "Topic :: Software Development :: Libraries :: Python Modules",
+    ]
+    }
+
+
+setup(**kwargs)

Propchange: incubator/heraldry/libraries/python/urljr/trunk/setup.py
------------------------------------------------------------------------------
    svn:executable = *

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/__init__.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/__init__.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/__init__.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/__init__.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,15 @@
+"""URL related utilities.
+"""
+
+__version__ = '[library version:1.0.1]'[17:-1]
+
+# Parse the version info
+try:
+    version_info = map(int, __version__.split('.'))
+except ValueError:
+    version_info = (None, None, None)
+else:
+    if len(version_info) != 3:
+        version_info = (None, None, None)
+    else:
+        version_info = tuple(version_info)
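
The slice [17:-1] above drops the 17-character prefix "[library version:" and the trailing "]", leaving the bare version string. The committed code targets Python 2, where map() returns a list, so len() works on the result. The sketch below restates the same parsing for Python 3. The function name parse_version_info is an assumption made for illustration.

```python
def parse_version_info(marker):
    """Extract a 3-tuple of ints from a '[library version:X.Y.Z]' marker.

    Hypothetical helper mirroring the module-level logic in
    urljr/__init__.py; returns (None, None, None) when parsing fails.
    """
    version = marker[17:-1]  # drop '[library version:' and the trailing ']'
    try:
        parts = tuple(int(piece) for piece in version.split('.'))
    except ValueError:
        return (None, None, None)
    if len(parts) != 3:
        return (None, None, None)
    return parts


print(parse_version_info('[library version:1.0.1]'))  # -> (1, 0, 1)
```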

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/fetchers.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/fetchers.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/fetchers.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/fetchers.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,318 @@
+# -*- test-case-name: urljr.test.test_fetchers -*-
+"""
+This module contains the HTTP fetcher interface and several implementations.
+"""
+
+__all__ = ['fetch', 'getDefaultFetcher', 'setDefaultFetcher', 'HTTPResponse',
+           'HTTPFetcher', 'createHTTPFetcher', 'HTTPFetchingError', 'HTTPError']
+
+import urllib2
+import time
+import cStringIO
+import sys
+
+import urljr.urinorm
+
+# try to import pycurl, which will let us use CurlHTTPFetcher
+try:
+    import pycurl
+except ImportError:
+    pycurl = None
+
+def fetch(url, body=None, headers=None):
+    """Invoke the fetch method on the default fetcher. Most users
+    should need only this method.
+
+    @raises: any exceptions that may be raised by the default fetcher
+    """
+    fetcher = getDefaultFetcher()
+    return fetcher.fetch(url, body, headers)
+
+def createHTTPFetcher():
+    """Create a default HTTP fetcher instance
+
+    Prefers Curl to urllib2."""
+    if pycurl is None:
+        fetcher = Urllib2Fetcher()
+    else:
+        fetcher = CurlHTTPFetcher()
+
+    return fetcher
+
+# Contains the currently set HTTP fetcher. If it is set to None, the
+# library will call createHTTPFetcher() to set it. Do not access this
+# variable outside of this module.
+_default_fetcher = None
+
+def getDefaultFetcher():
+    """Return the default fetcher instance.
+    If no fetcher has been set, create a default fetcher first.
+
+    @return: the default fetcher
+    @rtype: HTTPFetcher
+    """
+    global _default_fetcher
+
+    if _default_fetcher is None:
+        setDefaultFetcher(createHTTPFetcher())
+
+    return _default_fetcher
+
+def setDefaultFetcher(fetcher, wrap_exceptions=True):
+    """Set the default fetcher
+
+    @param fetcher: The fetcher to use as the default HTTP fetcher
+    @type fetcher: HTTPFetcher
+
+    @param wrap_exceptions: Whether to wrap exceptions thrown by the
+        fetcher with HTTPFetchingError so that they may be caught
+        more easily. By default, exceptions will be wrapped. In general,
+        unwrapped fetchers are useful for debugging of fetching errors
+        or if your fetcher raises well-known exceptions that you would
+        like to catch.
+    @type wrap_exceptions: bool
+    """
+    global _default_fetcher
+    if fetcher is None or not wrap_exceptions:
+        _default_fetcher = fetcher
+    else:
+        _default_fetcher = ExceptionWrappingFetcher(fetcher)
+
+def usingCurl():
+    """Whether the currently set HTTP fetcher is a Curl HTTP fetcher."""
+    return isinstance(getDefaultFetcher(), CurlHTTPFetcher)
+
+class HTTPResponse(object):
+    """XXX document attributes"""
+    headers = None
+    status = None
+    body = None
+    final_url = None
+
+    def __init__(self, final_url=None, status=None, headers=None, body=None):
+        self.final_url = final_url
+        self.status = status
+        self.headers = headers
+        self.body = body
+
+    def __repr__(self):
+        return "<%s status %s for %s>" % (self.__class__.__name__,
+                                          self.status,
+                                          self.final_url)
+
+class HTTPFetcher(object):
+    """
+    This class is the interface for urljr HTTP fetchers.  This
+    interface is only important if you need to write a new fetcher for
+    some reason.
+    """
+
+    def fetch(self, url, body=None, headers=None):
+        """
+        This performs an HTTP POST or GET, following redirects along
+        the way. If a body is specified, then the request will be a
+        POST. Otherwise, it will be a GET.
+
+
+        @param headers: HTTP headers to include with the request
+        @type headers: {str:str}
+
+        @return: An object representing the server's HTTP response. If
+            there are network or protocol errors, an exception will be
+            raised. HTTP error responses, like 404 or 500, do not
+            cause exceptions.
+
+        @rtype: L{HTTPResponse}
+
+        @raise Exception: Different implementations will raise
+            different errors based on the underlying HTTP library.
+        """
+        raise NotImplementedError
+
+def _allowedURL(url):
+    return url.startswith('http://') or url.startswith('https://')
+
+class HTTPFetchingError(Exception):
+    """Exception that is wrapped around all exceptions that are raised
+    by the underlying fetcher when using the ExceptionWrappingFetcher
+
+    @var why: The exception that caused this exception
+    """
+    def __init__(self, why=None):
+        Exception.__init__(self, why)
+        self.why = why
+
+class ExceptionWrappingFetcher(HTTPFetcher):
+    """Fetcher that wraps another fetcher's exceptions in HTTPFetchingError
+
+    @var uncaught_exceptions: Exceptions that should be exposed to the
+        user if they are raised by the fetch call
+    """
+
+    uncaught_exceptions = (SystemExit, KeyboardInterrupt, MemoryError)
+
+    def __init__(self, fetcher):
+        self.fetcher = fetcher
+
+    def fetch(self, *args, **kwargs):
+        try:
+            return self.fetcher.fetch(*args, **kwargs)
+        except self.uncaught_exceptions:
+            raise
+        except:
+            exc_cls, exc_inst = sys.exc_info()[:2]
+            if exc_inst is None:
+                # string exceptions
+                exc_inst = exc_cls
+
+            raise HTTPFetchingError(why=exc_inst)
+
+class Urllib2Fetcher(HTTPFetcher):
+    """An C{L{HTTPFetcher}} that uses urllib2.
+    """
+    def fetch(self, url, body=None, headers=None):
+        if not _allowedURL(url):
+            raise ValueError('Bad URL scheme: %r' % (url,))
+
+        if headers is None:
+            headers = {}
+
+        req = urllib2.Request(url, data=body, headers=headers)
+        try:
+            f = urllib2.urlopen(req)
+            try:
+                return self._makeResponse(f)
+            finally:
+                f.close()
+        except urllib2.HTTPError, why:
+            try:
+                return self._makeResponse(why)
+            finally:
+                why.close()
+
+    def _makeResponse(self, urllib2_response):
+        resp = HTTPResponse()
+        resp.body = urllib2_response.read()
+        resp.final_url = urllib2_response.geturl()
+        resp.headers = dict(urllib2_response.info().items())
+
+        if hasattr(urllib2_response, 'code'):
+            resp.status = urllib2_response.code
+        else:
+            resp.status = 200
+
+        return resp
+
+class HTTPError(HTTPFetchingError):
+    """
+    This exception is raised by the C{L{CurlHTTPFetcher}} when it
+    encounters an exceptional situation fetching a URL.
+    """
+    pass
+
+# XXX: define what we mean by paranoid, and make sure it is.
+class CurlHTTPFetcher(HTTPFetcher):
+    """
+    An C{L{HTTPFetcher}} that uses pycurl for fetching.
+    See U{http://pycurl.sourceforge.net/}.
+    """
+    ALLOWED_TIME = 20 # seconds
+
+    def __init__(self):
+        HTTPFetcher.__init__(self)
+        if pycurl is None:
+            raise RuntimeError('Cannot find pycurl library')
+
+    def _parseHeaders(self, header_file):
+        header_file.seek(0)
+
+        # Remove the status line from the beginning of the input
+        unused_http_status_line = header_file.readline()
+        lines = [line.strip() for line in header_file]
+
+        # and the blank line from the end
+        empty_line = lines.pop()
+        if empty_line:
+            raise HTTPError("No blank line at end of headers: %r" % (empty_line,))
+
+        headers = {}
+        for line in lines:
+            try:
+                name, value = line.split(':', 1)
+            except ValueError:
+                raise HTTPError(
+                    "Malformed HTTP header line in response: %r" % (line,))
+
+            value = value.strip()
+
+            # HTTP headers are case-insensitive
+            name = name.lower()
+            headers[name] = value
+
+        return headers
+
+    def _checkURL(self, url):
+        # XXX: document that this can be overridden to match desired policy
+        # XXX: make sure url is well-formed and routeable
+        return _allowedURL(url)
+
+    def fetch(self, url, body=None, headers=None):
+        stop = int(time.time()) + self.ALLOWED_TIME
+        off = self.ALLOWED_TIME
+
+        header_list = []
+        if headers is not None:
+            for header_name, header_value in headers.iteritems():
+                header_list.append('%s: %s' % (header_name, header_value))
+
+        c = pycurl.Curl()
+        try:
+            c.setopt(pycurl.NOSIGNAL, 1)
+
+            if header_list:
+                c.setopt(pycurl.HTTPHEADER, header_list)
+
+            # Presence of a body indicates that we should do a POST
+            if body is not None:
+                c.setopt(pycurl.POST, 1)
+                c.setopt(pycurl.POSTFIELDS, body)
+
+            while off > 0:
+                if not self._checkURL(url):
+                    raise HTTPError("Fetching URL not allowed: %r" % (url,))
+
+                data = cStringIO.StringIO()
+                response_header_data = cStringIO.StringIO()
+                c.setopt(pycurl.WRITEFUNCTION, data.write)
+                c.setopt(pycurl.HEADERFUNCTION, response_header_data.write)
+                c.setopt(pycurl.TIMEOUT, off)
+                c.setopt(pycurl.URL, urljr.urinorm.urinorm(url))
+
+                c.perform()
+
+                response_headers = self._parseHeaders(response_header_data)
+                code = c.getinfo(pycurl.RESPONSE_CODE)
+                if code in [301, 302, 303, 307]:
+                    url = response_headers.get('location')
+                    if url is None:
+                        raise HTTPError(
+                            'Redirect (%s) returned without a location' % code)
+
+                    # Redirects are always GETs
+                    c.setopt(pycurl.POST, 0)
+
+                    # There is no way to reset POSTFIELDS to empty and
+                    # reuse the connection, but we only use it once.
+                else:
+                    resp = HTTPResponse()
+                    resp.headers = response_headers
+                    resp.status = code
+                    resp.final_url = url
+                    resp.body = data.getvalue()
+                    return resp
+
+                off = stop - int(time.time())
+
+            raise HTTPError("Timed out fetching: %r" % (url,))
+        finally:
+            c.close()
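
The exception-wrapping behaviour described in setDefaultFetcher and ExceptionWrappingFetcher above can be illustrated with a small Python 3 sketch. This is not the committed code. FetchingError, WrappingFetcher and FailingFetcher are stand-in names, and the sketch catches Exception where the original uses a bare except to cover Python 2 string exceptions.

```python
class FetchingError(Exception):
    """Stand-in for HTTPFetchingError; keeps the original exception in `why`."""

    def __init__(self, why=None):
        super().__init__(why)
        self.why = why


class WrappingFetcher:
    """Stand-in for ExceptionWrappingFetcher."""

    # These propagate unchanged, like uncaught_exceptions in the original.
    uncaught_exceptions = (SystemExit, KeyboardInterrupt, MemoryError)

    def __init__(self, fetcher):
        self.fetcher = fetcher

    def fetch(self, *args, **kwargs):
        try:
            return self.fetcher.fetch(*args, **kwargs)
        except self.uncaught_exceptions:
            raise
        except Exception as exc:
            # Callers only need to catch one exception type.
            raise FetchingError(why=exc) from exc


class FailingFetcher:
    """Hypothetical fetcher that always fails, for demonstration."""

    def fetch(self, url, body=None, headers=None):
        raise ValueError('Bad URL scheme: %r' % (url,))


try:
    WrappingFetcher(FailingFetcher()).fetch('ftp://example.invalid/')
except FetchingError as err:
    print(type(err.why).__name__)  # -> ValueError
```

Wrapping gives callers a single exception type to handle regardless of whether urllib2 or pycurl sits underneath, while the original error stays available through the why attribute for debugging.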

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/test/__init__.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/test/__init__.py?view=auto&rev=463033
==============================================================================
    (empty)

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/test/helper.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/test/helper.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/test/helper.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/test/helper.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,37 @@
+import unittest
+import sys
+from os.path import dirname, join
+
+data_dir = join(dirname(__file__), 'data')
+
+def getTestDataFilename(rel_path):
+    return join(data_dir, rel_path)
+
+def getTestData(rel_path):
+    filename = getTestDataFilename(rel_path)
+    return file(filename).read()
+
+def runModule(module):
+    suite = getTestSuite(module)
+    runner = unittest.TextTestRunner()
+    result = runner.run(suite)
+    if result.wasSuccessful():
+        return 0
+    else:
+        return 1
+
+def runAsMain(module=None):
+    if module is None:
+        import __main__
+        module = __main__
+
+    sys.exit(runModule(module))
+
+def getTestSuite(module, *args, **kwargs):
+    if hasattr(module, 'getTestSuite'):
+        return module.getTestSuite(*args, **kwargs)
+    elif hasattr(module, 'getTestCases'):
+        cases = module.getTestCases(*args, **kwargs)
+        return unittest.TestSuite(cases)
+    else:
+        return unittest.TestLoader().loadTestsFromModule(module)

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_all.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_all.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_all.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_all.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,13 @@
+import helper
+
+test_modules = ['fetchers', 'urinorm']
+
+def getTestCases():
+    tests = []
+    for module_name in test_modules:
+        module = __import__('urljr.test.test_' + module_name, {}, {}, [None])
+        tests.append(helper.getTestSuite(module))
+    return tests
+
+if __name__ == '__main__':
+    helper.runAsMain()

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_fetchers.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_fetchers.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_fetchers.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_fetchers.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,276 @@
+import unittest
+import sys
+import urllib2
+
+import helper
+from urljr import fetchers
+
+# XXX: make these separate test cases
+
+def failUnlessResponseExpected(expected, actual):
+    assert expected.final_url == actual.final_url
+    assert expected.status == actual.status
+    assert expected.body == actual.body
+    got_headers = dict(actual.headers)
+    del got_headers['date']
+    del got_headers['server']
+    assert expected.headers == got_headers
+
+def test_fetcher(fetcher, exc, server):
+    def geturl(path):
+        return 'http://%s:%s%s' % (server.server_name,
+                                   server.socket.getsockname()[1],
+                                   path)
+
+    expected_headers = {'content-type':'text/plain'}
+
+    def plain(path, code):
+        path = '/' + path
+        expected = fetchers.HTTPResponse(
+            geturl(path), code, expected_headers, path)
+        return (path, expected)
+
+    expect_success = fetchers.HTTPResponse(
+        geturl('/success'), 200, expected_headers, '/success')
+    cases = [
+        ('/success', expect_success),
+        ('/301redirect', expect_success),
+        ('/302redirect', expect_success),
+        ('/303redirect', expect_success),
+        ('/307redirect', expect_success),
+        plain('notfound', 404),
+        plain('badreq', 400),
+        plain('forbidden', 403),
+        plain('error', 500),
+        plain('server_error', 503),
+        ]
+
+    for path, expected in cases:
+        fetch_url = geturl(path)
+        try:
+            actual = fetcher.fetch(fetch_url)
+        except (SystemExit, KeyboardInterrupt):
+            raise
+        except:
+            print fetcher, fetch_url
+            raise
+        else:
+            failUnlessResponseExpected(expected, actual)
+
+    for err_url in [geturl('/closed'),
+                    'http://invalid.janrain.com/',
+                    'not:a/url',
+                    'ftp://janrain.com/pub/']:
+        try:
+            result = fetcher.fetch(err_url)
+        except (KeyboardInterrupt, SystemExit):
+            raise
+        except fetchers.HTTPError, why:
+            # This is raised by the Curl fetcher for bad cases
+            # detected by the fetchers module, but it's a subclass of
+            # HTTPFetchingError, so we have to catch it explicitly.
+            assert exc
+        except fetchers.HTTPFetchingError, why:
+            assert not exc, (fetcher, exc, server)
+        except:
+            assert exc
+        else:
+            assert result is None, (fetcher, result)
+
+def run_fetcher_tests(server):
+    exc_fetchers = [fetchers.Urllib2Fetcher(),]
+    try:
+        exc_fetchers.append(fetchers.CurlHTTPFetcher())
+    except RuntimeError, why:
+        if why[0] == 'Cannot find pycurl library':
+            try:
+                import pycurl
+            except ImportError:
+                pass
+            else:
+                assert False, 'curl present but not detected'
+        else:
+            raise
+
+    non_exc_fetchers = []
+    for f in exc_fetchers:
+        non_exc_fetchers.append(fetchers.ExceptionWrappingFetcher(f))
+
+    for f in exc_fetchers:
+        test_fetcher(f, True, server)
+
+    for f in non_exc_fetchers:
+        test_fetcher(f, False, server)
+
+from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
+
+class FetcherTestHandler(BaseHTTPRequestHandler):
+    cases = {
+        '/success':(200, None),
+        '/301redirect':(301, '/success'),
+        '/302redirect':(302, '/success'),
+        '/303redirect':(303, '/success'),
+        '/307redirect':(307, '/success'),
+        '/notfound':(404, None),
+        '/badreq':(400, None),
+        '/forbidden':(403, None),
+        '/error':(500, None),
+        '/server_error':(503, None),
+        }
+
+    def log_request(self, *args):
+        pass
+
+    def do_GET(self):
+        if self.path == '/closed':
+            self.wfile.close()
+        else:
+            try:
+                http_code, location = self.cases[self.path]
+            except KeyError:
+                self.errorResponse('Bad path')
+            else:
+                extra_headers = [('Content-type', 'text/plain')]
+                if location is not None:
+                    base = ('http://%s:%s' % self.server.server_address)
+                    location = base + location
+                    extra_headers.append(('Location', location))
+                self._respond(http_code, extra_headers, self.path)
+
+    def do_POST(self):
+        try:
+            http_code, unused_location = self.cases[self.path]
+        except KeyError:
+            self.errorResponse('Bad path')
+        else:
+            if http_code in [301, 302, 303, 307]:
+                self.errorResponse()
+            else:
+                content_type = self.headers.get('content-type', 'text/plain')
+                extra_headers = [('Content-type', content_type)]
+                content_length = int(self.headers.get('Content-length', '-1'))
+                body = self.rfile.read(content_length)
+                self._respond(http_code, extra_headers, body)
+
+    def errorResponse(self, message=None):
+        req = [
+            ('HTTP method', self.command),
+            ('path', self.path),
+            ]
+        if message:
+            req.append(('message', message))
+
+        body_parts = ['Bad request:\r\n']
+        for k, v in req:
+            body_parts.append(' %s: %s\r\n' % (k, v))
+        body = ''.join(body_parts)
+        self._respond(400, [('Content-type', 'text/plain')], body)
+
+    def _respond(self, http_code, extra_headers, body):
+        self.send_response(http_code)
+        for k, v in extra_headers:
+            self.send_header(k, v)
+        self.end_headers()
+        self.wfile.write(body)
+        self.wfile.close()
+
+    def finish(self):
+        if not self.wfile.closed:
+            self.wfile.flush()
+        self.wfile.close()
+        self.rfile.close()
+
+def test():
+    import socket
+    host = socket.getfqdn('127.0.0.1')
+    # When I use port 0 here, it works for the first fetch and the
+    # next one gets connection refused.  Bummer.  So instead, pick a
+    # port that's *probably* not in use.
+    import os
+    port = (os.getpid() % 31000) + 1024
+
+    server = HTTPServer((host, port), FetcherTestHandler)
+
+    import threading
+    server_thread = threading.Thread(target=server.serve_forever)
+    server_thread.setDaemon(True)
+    server_thread.start()
+
+    run_fetcher_tests(server)
+
+class FakeFetcher(object):
+    sentinel = object()
+
+    def fetch(self, *args, **kwargs):
+        return self.sentinel
+
+class DefaultFetcherTest(unittest.TestCase):
+    def setUp(self):
+        """reset the default fetcher to None"""
+        fetchers.setDefaultFetcher(None)
+
+    def tearDown(self):
+        """reset the default fetcher to None"""
+        fetchers.setDefaultFetcher(None)
+
+    def test_getDefaultNotNone(self):
+        """Make sure that None is never returned as a default fetcher"""
+        self.failUnless(fetchers.getDefaultFetcher() is not None)
+        fetchers.setDefaultFetcher(None)
+        self.failUnless(fetchers.getDefaultFetcher() is not None)
+
+    def test_setDefault(self):
+        """Make sure that getDefaultFetcher returns the object passed
+        to setDefaultFetcher"""
+        sentinel = object()
+        fetchers.setDefaultFetcher(sentinel, wrap_exceptions=False)
+        self.failUnless(fetchers.getDefaultFetcher() is sentinel)
+
+    def test_callFetch(self):
+        """Make sure that fetchers.fetch() uses the default fetcher
+        instance that was set."""
+        fetchers.setDefaultFetcher(FakeFetcher())
+        actual = fetchers.fetch('bad://url')
+        self.failUnless(actual is FakeFetcher.sentinel)
+
+    def test_wrappedByDefault(self):
+        """Make sure that the default fetcher instance wraps
+        exceptions by default"""
+        default_fetcher = fetchers.getDefaultFetcher()
+        self.failUnless(isinstance(default_fetcher,
+                                   fetchers.ExceptionWrappingFetcher),
+                        default_fetcher)
+
+        self.failUnlessRaises(fetchers.HTTPFetchingError,
+                              fetchers.fetch, 'http://invalid.janrain.com/')
+
+    def test_notWrapped(self):
+        """Make sure that if we set a non-wrapped fetcher as default,
+        it will not wrap exceptions."""
+        # A fetcher that will raise an exception when it encounters a
+        # host that will not resolve
+        fetcher = fetchers.Urllib2Fetcher()
+        fetchers.setDefaultFetcher(fetcher, wrap_exceptions=False)
+
+        self.failIf(isinstance(fetchers.getDefaultFetcher(),
+                               fetchers.ExceptionWrappingFetcher))
+
+        try:
+            fetchers.fetch('http://invalid.janrain.com/')
+        except fetchers.HTTPFetchingError:
+            self.fail('Should not be wrapping exception')
+        except:
+            exc = sys.exc_info()[1]
+            self.failUnless(isinstance(exc, urllib2.URLError), exc)
+            pass
+        else:
+            self.fail('Should have raised an exception')
+
+def getTestSuite():
+    case1 = unittest.FunctionTestCase(test)
+    loadTests = unittest.defaultTestLoader.loadTestsFromTestCase
+    case2 = loadTests(DefaultFetcherTest)
+    return unittest.TestSuite([case1, case2])
+
+if __name__ == '__main__':
+    helper.runAsMain()

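Note on DefaultFetcherTest: the tests above pin down three behaviours: the default fetcher is never None, it wraps exceptions unless wrap_exceptions=False is passed, and the module-level fetch() delegates to it. The fetchers.py implementation is not shown in this part of the message, so the following is only a minimal Python 3 sketch of a registry with those properties. Every name in it (FetchError, WrappingFetcher, FailingFetcher, set_default_fetcher, get_default_fetcher) is a hypothetical stand-in, not urljr's API.

```python
class FetchError(Exception):
    """Hypothetical stand-in for fetchers.HTTPFetchingError."""


class WrappingFetcher:
    """Hypothetical stand-in for fetchers.ExceptionWrappingFetcher."""

    def __init__(self, fetcher):
        self.fetcher = fetcher

    def fetch(self, *args, **kwargs):
        try:
            return self.fetcher.fetch(*args, **kwargs)
        except Exception as why:
            # Re-raise every failure as one well-known exception type.
            raise FetchError(why) from why


class FailingFetcher:
    """Placeholder for a real HTTP fetcher; it always fails to resolve."""

    def fetch(self, url):
        raise OSError('cannot resolve %r' % (url,))


_default_fetcher = None


def set_default_fetcher(fetcher, wrap_exceptions=True):
    """Install a default fetcher; None resets to 'create one lazily'."""
    global _default_fetcher
    if fetcher is None or not wrap_exceptions:
        _default_fetcher = fetcher
    else:
        _default_fetcher = WrappingFetcher(fetcher)


def get_default_fetcher():
    """Return the default fetcher, creating a wrapped one if unset."""
    if _default_fetcher is None:
        set_default_fetcher(FailingFetcher())
    return _default_fetcher


def fetch(url):
    """Module-level convenience that delegates to the default fetcher."""
    return get_default_fetcher().fetch(url)
```

Creating the fetcher lazily in get_default_fetcher() is what lets setUp and tearDown reset the state by passing None.
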
Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_urinorm.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_urinorm.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_urinorm.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/test/test_urinorm.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,56 @@
+import os
+import unittest
+import urljr.urinorm
+import helper
+
+class UrinormTest(unittest.TestCase):
+    def __init__(self, desc, case, expected):
+        unittest.TestCase.__init__(self)
+        self.desc = desc
+        self.case = case
+        self.expected = expected
+
+    def shortDescription(self):
+        return self.desc
+
+    def runTest(self):
+        try:
+            actual = urljr.urinorm.urinorm(self.case)
+        except ValueError:
+            self.assertEqual(self.expected, 'fail')
+        else:
+            self.assertEqual(actual, self.expected)
+
+    def parse(cls, full_case):
+        desc, case, expected = full_case.split('\n')
+        case = unicode(case, 'utf-8')
+        
+        return cls(desc, case, expected)
+
+    parse = classmethod(parse)
+
+
+def parseTests(test_data):
+    result = []
+
+    cases = test_data.split('\n\n')
+    for case in cases:
+        case = case.strip()
+
+        if case:
+            result.append(UrinormTest.parse(case))
+
+    return result
+
+def getTestSuite():
+    here = os.path.dirname(os.path.abspath(__file__))
+    test_data_file_name = os.path.join(here, 'urinorm.txt')
+    test_data_file = file(test_data_file_name)
+    test_data = test_data_file.read()
+    test_data_file.close()
+
+    tests = parseTests(test_data)
+    return unittest.TestSuite(tests)
+
+if __name__ == '__main__':
+    helper.runAsMain()

Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/test/urinorm.txt
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/test/urinorm.txt?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/test/urinorm.txt (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/test/urinorm.txt Wed Oct 11 16:08:07 2006
@@ -0,0 +1,79 @@
+Already normal form
+http://example.com/
+http://example.com/
+
+Add a trailing slash
+http://example.com
+http://example.com/
+
+Remove an empty port segment
+http://example.com:/
+http://example.com/
+
+Remove a default port segment
+http://example.com:80/
+http://example.com/
+
+Capitalization in host names
+http://wWw.exaMPLE.COm/
+http://www.example.com/
+
+Capitalization in scheme names
+htTP://example.com/
+http://example.com/
+
+Capitalization in percent-escaped reserved characters
+http://example.com/foo%2cbar
+http://example.com/foo%2Cbar
+
+Unescape percent-encoded unreserved characters
+http://example.com/foo%2Dbar%2dbaz
+http://example.com/foo-bar-baz
+
+remove_dot_segments example 1
+http://example.com/a/b/c/./../../g
+http://example.com/a/g
+
+remove_dot_segments example 2
+http://example.com/mid/content=5/../6
+http://example.com/mid/6
+
+remove_dot_segments: single-dot
+http://example.com/a/./b
+http://example.com/a/b
+
+remove_dot_segments: double-dot
+http://example.com/a/../b
+http://example.com/b
+
+remove_dot_segments: leading double-dot
+http://example.com/../b
+http://example.com/b
+
+remove_dot_segments: trailing single-dot
+http://example.com/a/.
+http://example.com/a/
+
+remove_dot_segments: trailing double-dot
+http://example.com/a/..
+http://example.com/
+
+remove_dot_segments: trailing single-dot-slash
+http://example.com/a/./
+http://example.com/a/
+
+remove_dot_segments: trailing double-dot-slash
+http://example.com/a/../
+http://example.com/
+
+Test of all kinds of syntax-based normalization
+hTTPS://a/./b/../b/%63/%7bfoo%7d
+https://a/b/c/%7Bfoo%7D
+
+Unsupported scheme
+ftp://example.com/
+fail
+
+Non-absolute URI
+http:/foo
+fail
\ No newline at end of file

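Note on the urinorm.txt format: each case is three lines (description, input URI, expected result or the literal word "fail"), and cases are separated by a blank line. This is a small Python 3 sketch of reading that layout into tuples. The function name parse_cases and the inline sample are illustrative only; the committed parser is parseTests in test_urinorm.py.

```python
def parse_cases(test_data):
    """Split case-file text into (description, input, expected) triples."""
    cases = []
    for block in test_data.split('\n\n'):
        block = block.strip()
        if block:
            # Exactly three lines per case; anything else raises ValueError.
            desc, case, expected = block.split('\n')
            cases.append((desc, case, expected))
    return cases


# Two cases copied from the file above, in the same layout.
sample = (
    'Add a trailing slash\n'
    'http://example.com\n'
    'http://example.com/\n'
    '\n'
    'Unsupported scheme\n'
    'ftp://example.com/\n'
    'fail'
)

cases = parse_cases(sample)
```

An expected value of "fail" means the normalizer should raise ValueError for that input.
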
Added: incubator/heraldry/libraries/python/urljr/trunk/urljr/urinorm.py
URL: http://svn.apache.org/viewvc/incubator/heraldry/libraries/python/urljr/trunk/urljr/urinorm.py?view=auto&rev=463033
==============================================================================
--- incubator/heraldry/libraries/python/urljr/trunk/urljr/urinorm.py (added)
+++ incubator/heraldry/libraries/python/urljr/trunk/urljr/urinorm.py Wed Oct 11 16:08:07 2006
@@ -0,0 +1,188 @@
+import re
+
+# from appendix B of rfc 3986 (http://www.ietf.org/rfc/rfc3986.txt)
+uri_pattern = r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?'
+uri_re = re.compile(uri_pattern)
+
+
+authority_pattern = r'^([^@]*@)?([^:]*)(:.*)?'
+authority_re = re.compile(authority_pattern)
+
+
+pct_encoded_pattern = r'%([0-9A-Fa-f]{2})'
+pct_encoded_re = re.compile(pct_encoded_pattern)
+
+try:
+    unichr(0x10000)
+except ValueError:
+    # narrow python build
+    UCSCHAR = [
+        (0xA0, 0xD7FF),
+        (0xF900, 0xFDCF),
+        (0xFDF0, 0xFFEF),
+        ]
+
+    IPRIVATE = [
+        (0xE000, 0xF8FF),
+        ]
+else:
+    UCSCHAR = [
+        (0xA0, 0xD7FF),
+        (0xF900, 0xFDCF),
+        (0xFDF0, 0xFFEF),
+        (0x10000, 0x1FFFD),
+        (0x20000, 0x2FFFD),
+        (0x30000, 0x3FFFD),
+        (0x40000, 0x4FFFD),
+        (0x50000, 0x5FFFD),
+        (0x60000, 0x6FFFD),
+        (0x70000, 0x7FFFD),
+        (0x80000, 0x8FFFD),
+        (0x90000, 0x9FFFD),
+        (0xA0000, 0xAFFFD),
+        (0xB0000, 0xBFFFD),
+        (0xC0000, 0xCFFFD),
+        (0xD0000, 0xDFFFD),
+        (0xE1000, 0xEFFFD),
+        ]
+
+    IPRIVATE = [
+        (0xE000, 0xF8FF),
+        (0xF0000, 0xFFFFD),
+        (0x100000, 0x10FFFD),
+        ]
+
+
+_unreserved = [False] * 256
+for _ in range(ord('A'), ord('Z') + 1): _unreserved[_] = True
+for _ in range(ord('0'), ord('9') + 1): _unreserved[_] = True
+for _ in range(ord('a'), ord('z') + 1): _unreserved[_] = True
+_unreserved[ord('-')] = True
+_unreserved[ord('.')] = True
+_unreserved[ord('_')] = True
+_unreserved[ord('~')] = True
+
+
+_escapeme_re = re.compile('[%s]' % (''.join(
+    map(lambda (m, n): u'%s-%s' % (unichr(m), unichr(n)),
+        UCSCHAR + IPRIVATE)),))
+
+
+def _pct_escape_unicode(char_match):
+    c = char_match.group()
+    return ''.join(['%%%02X' % (ord(octet),) for octet in c.encode('utf-8')])
+
+
+def _pct_encoded_replace_unreserved(mo):
+    try:
+        i = int(mo.group(1), 16)
+        if _unreserved[i]:
+            return chr(i)
+        else:
+            return mo.group().upper()
+
+    except ValueError:
+        return mo.group()
+
+
+def _pct_encoded_replace(mo):
+    try:
+        return chr(int(mo.group(1), 16))
+    except ValueError:
+        return mo.group()
+
+
+def remove_dot_segments(path):
+    result_segments = []
+    
+    while path:
+        if path.startswith('../'):
+            path = path[3:]
+        elif path.startswith('./'):
+            path = path[2:]
+        elif path.startswith('/./'):
+            path = path[2:]
+        elif path == '/.':
+            path = '/'
+        elif path.startswith('/../'):
+            path = path[3:]
+            if result_segments:
+                result_segments.pop()
+        elif path == '/..':
+            path = '/'
+            if result_segments:
+                result_segments.pop()
+        elif path == '..' or path == '.':
+            path = ''
+        else:
+            i = 0
+            if path[0] == '/':
+                i = 1
+            i = path.find('/', i)
+            if i == -1:
+                i = len(path)
+            result_segments.append(path[:i])
+            path = path[i:]
+            
+    return ''.join(result_segments)
+
+
+def urinorm(uri):
+    if isinstance(uri, unicode):
+        uri = _escapeme_re.sub(_pct_escape_unicode, uri).encode('ascii')
+
+    uri_mo = uri_re.match(uri)
+
+    scheme = uri_mo.group(2)
+    if scheme is None:
+        raise ValueError('No scheme specified')
+
+    scheme = scheme.lower()
+    if scheme not in ('http', 'https'):
+        raise ValueError('Not an absolute HTTP or HTTPS URI: %r' % (uri,))
+
+    authority = uri_mo.group(4)
+    if authority is None:
+        raise ValueError('Not an absolute URI: %r' % (uri,))
+
+    authority_mo = authority_re.match(authority)
+    if authority_mo is None:
+        raise ValueError('URI does not have a valid authority: %r' % (uri,))
+
+    userinfo, host, port = authority_mo.groups()
+
+    if userinfo is None:
+        userinfo = ''
+
+    if '%' in host:
+        host = host.lower()
+        host = pct_encoded_re.sub(_pct_encoded_replace, host)
+        host = unicode(host, 'utf-8').encode('idna')
+    else:
+        host = host.lower()
+
+    if port:
+        if (port == ':' or
+            (scheme == 'http' and port == ':80') or
+            (scheme == 'https' and port == ':443')):
+            port = ''
+    else:
+        port = ''
+
+    authority = userinfo + host + port
+
+    path = uri_mo.group(5)
+    path = pct_encoded_re.sub(_pct_encoded_replace_unreserved, path)
+    path = remove_dot_segments(path)
+    if not path:
+        path = '/'
+
+    query = uri_mo.group(6)
+    if query is None:
+        query = ''
+
+    fragment = uri_mo.group(8)
+    if fragment is None:
+        fragment = ''
+
+    return scheme + '://' + authority + path + query + fragment

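Note on remove_dot_segments: the committed function follows the string-rewriting loop from RFC 3986 section 5.2.4. As a cross-check, this is a stack-based Python 3 variant. It is a different technique from the one in the commit, and it assumes the path is non-empty and starts with "/", which holds for the http and https URIs in urinorm.txt that carry a path. The name remove_dot_segments_stack is hypothetical.

```python
def remove_dot_segments_stack(path):
    """Resolve '.' and '..' segments in an absolute path using a stack."""
    segments = path.split('/')
    output = []
    # segments[0] is the empty string before the leading '/'.
    for seg in segments[1:]:
        if seg == '.':
            continue
        elif seg == '..':
            # Going above the root is silently ignored.
            if output:
                output.pop()
        else:
            output.append(seg)
    result = '/' + '/'.join(output)
    # A trailing '.' or '..' refers to a directory, so keep a final slash.
    if segments[-1] in ('.', '..') and not result.endswith('/'):
        result += '/'
    return result


# Path portions of the remove_dot_segments cases in urinorm.txt.
examples = {
    '/a/b/c/./../../g': '/a/g',
    '/mid/content=5/../6': '/mid/6',
    '/a/./b': '/a/b',
    '/a/../b': '/b',
    '/../b': '/b',
    '/a/.': '/a/',
    '/a/..': '/',
    '/a/./': '/a/',
    '/a/../': '/',
}
results = {p: remove_dot_segments_stack(p) for p in examples}
```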

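Note on percent-encoding in the path: urinorm() decodes escapes of unreserved characters (letters, digits, "-", ".", "_", "~") and uppercases the hex digits of every other escape. This is a Python 3 sketch of that rule, checked against the path portions of three cases in urinorm.txt. The names UNRESERVED, PCT_ENCODED_RE and normalize_pct_encoding are illustrative, not part of urljr.

```python
import re
import string

# Unreserved characters as defined in RFC 3986 section 2.3.
UNRESERVED = set(string.ascii_letters + string.digits + '-._~')
PCT_ENCODED_RE = re.compile(r'%([0-9A-Fa-f]{2})')


def _replace(match):
    char = chr(int(match.group(1), 16))
    if char in UNRESERVED:
        # Unreserved characters never need to be escaped.
        return char
    # Leave everything else escaped, with uppercase hex digits.
    return match.group().upper()


def normalize_pct_encoding(path):
    """Normalize the percent-escapes in a URI path."""
    return PCT_ENCODED_RE.sub(_replace, path)


reserved = normalize_pct_encoding('/foo%2cbar')
unreserved = normalize_pct_encoding('/foo%2Dbar%2dbaz')
mixed = normalize_pct_encoding('/b/%63/%7bfoo%7d')
```

In the committed urinorm() this substitution runs before dot-segment removal.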
