quetz-mod_python-dev mailing list archives

From Sébastien Arnaud <arnau...@emedialibrary.org>
Subject Re: Regex based publisher proposal
Date Fri, 08 Sep 2006 04:53:47 GMT
Thank you all for taking the time to read my code. I apologize for
not including a better description of what I intended to do, but I
was reluctant to start with a very long message to the list, so I am
going to blame my poor editing skills and the late hour at which I
sent this email ;)

In short, Nicolas did read my mind correctly with regard to what I am
attempting to do here. Like many Python developers, I have searched
and searched for a proper web framework, and I settled on mod_python
about 2 years ago. Afterwards I tried the new "popular" ones such as
Ruby on Rails, TurboGears and Django, but did not really like them,
mostly because of their integration with SQL Objects type DB
abstraction, which I never bought into, and their dependence on
specific templating systems. Nevertheless, they did introduce some
interesting concepts (such as regex URL mapping), which I am
attempting here (poorly, it seems) to implement directly in mod_python.

To put things back into context, the code I submitted was a simple
draft written on a passionate impulse over the Labor Day weekend. It
was never (in my mind) a solid implementation, but rather something to
illustrate what could maybe be done and to open a discussion.

To quickly go over the code comments, though: the mapper_cache object
is not thread safe yet. I only later added the ability to define
your regex as a PythonOption argument, and I definitely forgot to
consider what could happen when two different parts of the URL
namespace use different regexes and don't execute in different Python
interpreter instances.
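
As an illustration only (this is not the code from the draft), here is
a minimal sketch of how both problems could be addressed: compiled
patterns are cached per pattern string, so two parts of the URL
namespace with different regexes do not overwrite each other, and a
lock guards the cache. The class name MapperCache is made up, the code
is written for current Python, and any custom pattern is assumed to
define the same three named groups as the default one.

```python
import re
import threading

# Default pattern copied from the draft's Mapper class.
DEFAULT_REGEX = (
    r"(?P<controller>[\w]+)?(\.(?P<extension>[\w]+))?(/(?P<action>[^/]+))?(\?$)?"
)


class MapperCache:
    """Cache compiled regexes per pattern string, guarded by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._compiled = {}

    def _get(self, pattern):
        # The lock makes the check-then-insert step atomic across threads.
        with self._lock:
            reobj = self._compiled.get(pattern)
            if reobj is None:
                reobj = re.compile(pattern)
                self._compiled[pattern] = reobj
            return reobj

    def __call__(self, uri, pattern=None):
        # Assumption: every pattern defines the groups 'controller',
        # 'extension' and 'action', like the default one does.
        m = self._get(pattern or DEFAULT_REGEX).match(uri)
        if m:
            return (m.group('controller'), m.group('extension'), m.group('action'))
        return (None, None, None)


mapper_cache = MapperCache()
print(mapper_cache("login.html/show"))  # ('login', 'html', 'show')
```

Keying the cache on the pattern string avoids the single shared
regex attribute that made the draft's Mapper depend on which request
happened to arrive first.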

In regards to:

    path,module_name =  os.path.split(req.filename)

    # Trimming the front part of req.uri
    if module_name=='':
        req_url = ''
    else:
        req_url = req.uri[req.uri.index(module_name):]

This brings up a question I have had for a long time about mod_python.
Is there a way to accurately identify the relative virtual path of a
request other than by comparing it to the physical path, especially
when using Apache directives such as VirtualHost and Location? For
example, let's say you define something like:
<Location /myapp/admin/>
	AddHandler mod_python .py .html
	PythonHandler mod_python.publisher
</Location>

And there comes a request such as http://myserver/myapp/admin/login.html

How do you accurately determine that the /myapp/admin/ part of the
path is in fact the virtual root of your application?

Also, a little bit trickier when you get something like: http:// 

1) Is the only way to pass along all requests (not filtered by
extension) to your mod_python handler to use something like:
	SetHandler python-program
	PythonHandler myhandler

2) In this situation, determining the virtual root of your application
is crucial to be able to separate the regular path from the potential
parameters/actions, so how would you go about it? In the past I have
used the cheap trick of passing the virtual path of my app along in a
PythonOption, but there must be a better way. I have so far failed
to find in the mod_python docs a way to read the properties of
an Apache <Location> or <VirtualHost> directive in order to
automatically determine the proper virtual root of my handler.
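
For reference, the PythonOption trick mentioned above can be sketched
as a plain function, shown here without any mod_python dependency so it
can be run on its own. The function name split_virtual_root and the
option name myapp.virtual_root are made up for this example. Unlike
req.uri.index(module_name), it anchors on the start of the URI instead
of on the first occurrence of a file name somewhere inside it.

```python
def split_virtual_root(uri, virtual_root):
    """Return the part of uri below virtual_root, or None if uri is outside it."""
    # Normalise so that '/myapp/admin' and '/myapp/admin/' behave the same.
    root = virtual_root.rstrip('/') + '/'
    if uri == root[:-1]:
        # The request is for the root itself, without a trailing slash.
        return ''
    if not uri.startswith(root):
        return None
    return uri[len(root):]


# In a handler the two inputs might come from something like
#   req.uri  and  req.get_options()['myapp.virtual_root']
# where 'myapp.virtual_root' is a hypothetical PythonOption name.
print(split_virtual_root('/myapp/admin/login.html', '/myapp/admin/'))  # login.html
```

This still requires the virtual root to be repeated by hand in the
configuration, which is exactly the duplication the question above
hopes to avoid.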

Graham, I chose to re-embed the code of mod_python.publisher
mainly because I figured I would have to modify some other parts if
I want to publish not only a function but also a callable
class (which would derive from a future mywebframework.action class).
Correct me if I am wrong, but right now it is not possible to publish
a callable class in a module with mod_python.publisher.
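
To make the goal concrete, here is a rough sketch of what publishing a
callable class could look like. Everything in it is hypothetical: the
Action base class stands in for the future mywebframework.action class,
and publish() is a simplified dispatcher written for this example, not
the logic of mod_python.publisher.

```python
class Action:
    """Hypothetical base class: one instance is created per request."""

    def __init__(self, req):
        self.req = req

    def __call__(self, **params):
        raise NotImplementedError


class Login(Action):
    """Example action; 'user' would come from the request parameters."""

    def __call__(self, user='anonymous'):
        return 'Hello, %s' % user


def publish(namespace, controller, req, params):
    """Resolve controller in namespace and call it with the parameters."""
    target = namespace.get(controller)
    if isinstance(target, type) and issubclass(target, Action):
        # Callable class: instantiate it for this request, then call it.
        return target(req)(**params)
    if callable(target):
        # Plain function: call it with the request and the parameters.
        return target(req, **params)
    raise LookupError(controller)


# req is None here only because the sketch runs outside Apache.
print(publish({'Login': Login}, 'Login', None, {'user': 'seb'}))  # Hello, seb
```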

Jorey, I dig your implementation without regexes, but here I am
really trying to use them. I also thought that performance would be
an issue, but in some of my early benchmarks I did not get any worse
performance than with the current mod_python.publisher method. So I
think that, if implemented properly, performance might not even be a
drawback of using regexes.

Finally, all I am really trying to build here is a drop-in
replacement for mod_python.publisher that provides the same
functionality by default, but offers the flexibility found in newer
web frameworks via regex mapping, and eventually implements a reverse
URL mapping method, just like Nicolas described.
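
As a sketch of what reverse mapping means for the default pattern (an
assumption-laden illustration, not a design): forward() applies the
draft's default regex, and reverse() rebuilds a relative URL from the
same three parts. Both function names are made up, and reverse() only
round-trips when the controller and extension consist of word
characters and the action contains no slash.

```python
import re

# Default pattern copied from the draft's Mapper class.
PATTERN = re.compile(
    r"(?P<controller>[\w]+)?(\.(?P<extension>[\w]+))?(/(?P<action>[^/]+))?(\?$)?"
)


def forward(url):
    """Map a relative URL to a (controller, extension, action) triple."""
    # Every part of the pattern is optional, so match() always succeeds.
    m = PATTERN.match(url)
    return (m.group('controller'), m.group('extension'), m.group('action'))


def reverse(controller, extension=None, action=None):
    """Build a relative URL that forward() maps back to the same triple."""
    url = controller
    if extension:
        url += '.' + extension
    if action:
        url += '/' + action
    return url


print(reverse('login', 'html', 'show'))  # login.html/show
```

For a user-supplied regex the reverse direction cannot be derived this
simply, which is one reason a reverse mapping method needs its own
design.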

I have looked closely at Routes, but I did not want all the features
it implements and did not like the idea of adding an external
dependency just for the routing/mapping process.

I strongly believe in a simple mod_python web framework that would  
arm developers with essential tools such as:
* regex based routing with a reverse mapping method
* web object oriented (a request maps to a class instantiated for the call)
* regex based parameter validation (to ensure the data passed along is
well formed)
* DB connection pooling
* Generic rendering template system (empowering the developer to use
virtually ANY templating system available in Python)
* Remote DB Sessions (for scalability)
* Advanced Debugging
* Close integration of apache features exposed by mod_python
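
To give an idea of the regex based parameter validation item, here is
a small sketch. The rule table, the parameter names and the validate()
function are all invented for this example; a parameter is rejected
when it has no rule or when its value does not match the rule in full.

```python
import re

# Hypothetical rules: one compiled pattern per accepted parameter name.
RULES = {
    'user': re.compile(r'[A-Za-z0-9_]{1,32}'),
    'page': re.compile(r'[0-9]{1,4}'),
}


def validate(params, rules=RULES):
    """Return the sorted names of parameters that are unknown or malformed."""
    bad = []
    for name, value in params.items():
        rule = rules.get(name)
        # fullmatch() requires the whole value to match, not just a prefix.
        if rule is None or rule.fullmatch(value) is None:
            bad.append(name)
    return sorted(bad)


print(validate({'user': 'seb', 'page': '12'}))              # []
print(validate({'user': 'seb!', 'page': '12', 'x': '1'}))   # ['user', 'x']
```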



On Sep 7, 2006, at 1:17 AM, Graham Dumpleton wrote:

> On 07/09/2006, at 2:59 PM, Sébastien Arnaud wrote:
>> Anyway, please share your comments and feedback to make sure I am  
>> headed in the right direction by keeping in mind that my first  
>> goal is to be able to publish using a defined regex url grammar a  
>> callable class within a module. I believe that once this first  
>> step is accomplished the real design of the web framework can begin.
> A few comments while I work out what your code actually does.
> class Mapper:
> 	""" This is the object to cache the regex engine """
> 	regex = "(?P<controller>[\w]+)?(\.(?P<extension>[\w]+))?(/(?P<action>[^/]+))?(\?$)?"
> 	regex_compared = 0
> 	def __init__(self):
> 		self.reobj = re.compile(self.regex)
> 	def __call__(self, uri, cre):
> 		if(cre!=None and not self.regex_compared and cre!=self.regex):
> 			self.regex = cre
> 			self.reobj = re.compile(self.regex)
> 			self.regex_compared = 1
> 		m = self.reobj.match(uri)
> 		if m:
> 			return (m.group('controller'), m.group('extension'), m.group('action'))
> 		else:
> 			return (None, None, None)
> mapper_cache = Mapper()
> This is not thread safe and use in a multithreaded MPM, ie., winnt
> and worker, may result in failure. I also suspect if you would have
> problems where two different parts of the URL namespace use different
> regex's and they aren't executing in different Python interpreter
> instances.
>     path,module_name =  os.path.split(req.filename)
>     # Trimming the front part of req.uri
>     if module_name=='':
>         req_url = ''
>     else:
>         req_url = req.uri[req.uri.index(module_name):]
> This is not a very robust way of doing this and technically could
> fail in certain cases.
>     # Now determine the actual Python module code file
>     # to load. This will first try looking for the file
>     # '/path/<module_name>.py'.
>     req.filename = path + '/' + controller + '.py'
>     if not exists(req.filename):
>         raise apache.SERVER_RETURN, apache.HTTP_NOT_FOUND
> I am not really sure why you go to all this trouble. For the way the
> default regex is written, this could possibly just as easily be
> achieved using standard mod_python.publisher, using subdirectories in
> document tree and use of MultiViews matching in an appropriate way.
> In other words, am not convinced that your code is required at all
> and you may be able to achieve the same thing as default regex using
> standard mod_python.publisher. At worst case, you might need to use
> Apache RewriteRule. In mod_python 3.3, you could probably do all this
> with a very simple fixup handler as well.
> Thus, as already requested, can you actually supply some examples of
> how this is used in practice.
> BTW, you could also have done this by using a wrapper handler around
> the existing mod_python.publisher handler as well, thereby avoiding
> having to cut and paste all the code.
> Graham
