incubator-couchdb-dev mailing list archives

From Antony Blakey
Subject Re: slash escaping (was 0.9.0 Release)
Date Thu, 11 Dec 2008 20:28:25 GMT

On 12/12/2008, at 6:22 AM, Damien Katz wrote:

> On Dec 11, 2008, at 2:39 PM, Chris Anderson wrote:
>> On Wed, Dec 3, 2008 at 3:59 PM, Antony Blakey wrote:
>>> On 04/12/2008, at 9:55 AM, Chris Anderson wrote:
>>>> On Wed, Dec 3, 2008 at 6:09 AM, Adam Kocoloski wrote:
>>>>> 2) The "/" in the _design doc ID is confusing.
>>>> Oh someone, please make it easy! (and correct)
>>> Someone please make it absolutely, 100%, correct.
>> The more I program against Couch, especially in a browser, the more I
>> run into issues where different parts of the toolchain tend toward
>> auto-unescaping %2F. It's hard to be certain that I've got something
>> absolutely, 100% correct, but we'll never get there if we don't start.
>> Here are some examples which assume that docid's slashes will be
>> urlencoded (unless the docid starts with '_'). This is the current
>> rule (roughly). Each example has 2 urls with attachments that have no
>> slashes in the name, followed by a url with an attachment with
>> multiple slashes. I think it is feasible to allow this sort of thing
>> to happen, by putting a little bit of special-case logic in the
>> routing code. I don't think doing so breaks anything fundamental about
>> CouchDB.
>> regular docs:
>> /db/docid
>> /db/docid/afile
>> /db/docid/afile/with/nested/slashes
>> design docs:
>> /db/_design/name
>> /db/_design/name/afile
>> /db/_design/name/afile/with/nested/slashes
>> If your docid does not start with '_' (eg not a local or design doc)
>> then any slashes in the docid would have to be escaped. This is so we
>> can know when attachment addressing begins. Also, design docs with
>> slashes after the initial one (slashes in the name) would have to
>> escape them.
>> regular doc with slashes in id:
>> /db/docid%2Fwith%2Fslashes
>> /db/docid%2Fwith%2Fslashes/afile
>> /db/docid%2Fwith%2Fslashes/afile/with/nested/slashes
>> design doc with slashes in name:
>> /db/_design/name%2Fwith%2Fslashes
>> /db/_design/name%2Fwith%2Fslashes/afile
>> /db/_design/name%2Fwith%2Fslashes/afile/with/nested/slashes
>>> Special names, special paths, sometimes encoding, sometimes not. Such
>>> magic is evil because it always comes back to bite your arse.
>> I think I may have this correct - eg non arse biting. But I'm posting
>> to the dev list because y'all might see what I don't.
>> I plan to put this into trunk before 1.0 (I think it will be backwards
>> compatible). Comments?
>> Chris
>> -- 
>> Chris Anderson
> I agree with everything but slashes in design doc names.

So the guidance is that users must not use document names starting  
with '_' if they want to avoid astonishment?
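
To make the proposed rule concrete, here is a minimal Python sketch of the routing logic as I read it from Chris's examples. It is not CouchDB's actual routing code. The function name split_doc_path and the choice of '_design' and '_local' as the only special prefixes are assumptions made for illustration.

```python
from urllib.parse import unquote

# Assumption: only these prefixes make the doc id span two path segments.
SPECIAL_PREFIXES = ("_design", "_local")

def split_doc_path(path):
    """Split '/db/...' into (db, docid, attachment name or None)."""
    segments = path.lstrip("/").split("/")
    db, rest = segments[0], segments[1:]
    if not rest:
        return unquote(db), None, None
    if rest[0] in SPECIAL_PREFIXES and len(rest) > 1:
        # Special docs: the id is '<prefix>/<name>', with an unescaped slash.
        docid = rest[0] + "/" + unquote(rest[1])
        attachment = rest[2:]
    else:
        # Regular docs: the id is one segment, so its slashes must be %2F.
        docid = unquote(rest[0])
        attachment = rest[1:]
    # Everything left over is the attachment name, slashes included.
    name = "/".join(unquote(s) for s in attachment) if attachment else None
    return unquote(db), docid, name
```

In this sketch, /db/_design/x always resolves to the design doc '_design/x', so a regular document whose id is literally '_design' cannot have an attachment 'x' addressed this way. That is one form of the astonishment in question.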

The other alternative is to always require the component after the db to  
be 'special' i.e. document URLs could be


No special rules required. IMO this example makes clear the cause of  
the issue.

> I think we probably shouldn't support design docs with slashes, and  
> maybe all other weird characters.

I think all document names should be Unicode.

> For one thing, we use the design doc name as the file name for the  
> view index file for the views. This is an issue that can prove  
> problematic on certain platforms and not others.

The file name can be escaped. There are also limitations on the length  
of the filename depending on the platform. I suggest using an escaped  
form of some initial segment of the name, concatenated with an escaped  
form of some final segment of the name, concatenated with a hash of the  
full name.

If the name is less than a certain length, then just escape the full  
name.
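
A minimal Python sketch of that scheme, under stated assumptions: percent-escaping as the escape function, SHA-1 as the hash, '-' as the separator, and arbitrary length limits. The names index_filename, MAX_ESCAPED and SEGMENT are hypothetical, and the length check here is applied to the escaped form rather than the raw name.

```python
import hashlib
from urllib.parse import quote

MAX_ESCAPED = 100  # hypothetical limit on a fully escaped name
SEGMENT = 8        # hypothetical number of characters kept from each end

def index_filename(name):
    """Map an arbitrary Unicode name to a filesystem-safe file name."""
    escaped = quote(name, safe="")  # safe="" so that '/' becomes %2F
    if len(escaped) <= MAX_ESCAPED:
        # Short names: just escape the full name, so it stays readable.
        return escaped
    # Long names: escaped head + escaped tail + hash of the full name.
    # Worst case is 8 characters * 12 escaped chars per end, plus 40 hex digits.
    head = quote(name[:SEGMENT], safe="")
    tail = quote(name[-SEGMENT:], safe="")
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    return head + "-" + tail + "-" + digest
```

The head and tail keep the file recognisable from the command line, while the hash distinguishes long names that share both ends.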

Also, provide a handler that returns a json document associating  
filenames with the original name. This exposes the mapping  
implementation in a way that can be used by developers. Maybe also a  
handler to map from an arbitrary string to a filename, using couch's  
mapping function. Useful for plugin/_external authors who want to use  
local files.
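
As a rough illustration of what such a handler might return, here is a Python sketch that builds the JSON document from a list of names. The function names are hypothetical, and to_filename is only a stand-in (plain percent-escaping) for whatever mapping function couch would actually use.

```python
import json
from urllib.parse import quote

def to_filename(name):
    # Stand-in mapping function: percent-escape every reserved character.
    return quote(name, safe="")

def filename_map(names, mapper=to_filename):
    """Return a JSON document associating each file name with its original name."""
    return json.dumps({mapper(n): n for n in names}, sort_keys=True)
```

Passing the mapping function as a parameter keeps the JSON handler independent of how names are actually turned into file names.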

IMO, limiting the names of things because of filesystem limitations is  
a bad example of abstraction leakage.

> If the design doc has weird characters that aren't supported in the  
> file system, we can't make the index file. If we hash the filename,  
> then it's impossible for an admin to figure out which files are  
> which from the command line. So maybe we should url escape the name  
> for the file system too. Or just not support weird characters at all.

Antony Blakey
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

If at first you don’t succeed, try, try again. Then quit. No use being  
a damn fool about it
   -- W.C. Fields
