incubator-couchdb-dev mailing list archives

From Antony Blakey <antony.bla...@gmail.com>
Subject Re: slash escaping (was 0.9.0 Release)
Date Thu, 11 Dec 2008 20:28:25 GMT

On 12/12/2008, at 6:22 AM, Damien Katz wrote:

>
> On Dec 11, 2008, at 2:39 PM, Chris Anderson wrote:
>
>> On Wed, Dec 3, 2008 at 3:59 PM, Antony Blakey <antony.blakey@gmail.com> wrote:
>>>
>>> On 04/12/2008, at 9:55 AM, Chris Anderson wrote:
>>>
>>>> On Wed, Dec 3, 2008 at 6:09 AM, Adam Kocoloski <adam.kocoloski@gmail.com> wrote:
>>>>>
>>>>> 2) The "/" in the _design doc ID is confusing.
>>>>
>>>> Oh someone, please make it easy! (and correct)
>>>
>>> Someone please make it absolutely, 100%, correct.
>>>
>>
>> The more I program against Couch, especially in a browser, the more I
>> run into issues where different parts of the toolchain tend toward
>> auto-unescaping %2F. It's hard to be certain that I've got something
>> absolutely, 100% correct, but we'll never get there if we don't start.
>>
>> Here are some examples which assume that docid's slashes will be
>> urlencoded (unless the docid starts with '_'). This is the current
>> rule (roughly). Each example has 2 urls with attachments that have no
>> slashes in the name, followed by a url with an attachment with
>> multiple slashes. I think it is feasible to allow this sort of thing
>> to happen, by putting a little bit of special-case logic in the
>> routing code. I don't think doing so breaks anything fundamental about
>> CouchDB.
>>
>> regular docs:
>>
>> /db/docid
>> /db/docid/afile
>> /db/docid/afile/with/nested/slashes
>>
>> design docs:
>>
>> /db/_design/name
>> /db/_design/name/afile
>> /db/_design/name/afile/with/nested/slashes
>>
>>
>> If your docid does not start with '_' (i.e. it is not a local or design
>> doc) then any slashes in the docid would have to be escaped. This is so
>> we can know where attachment addressing begins. Also, design docs with
>> slashes after the initial one (slashes in the name) would have to
>> escape them.
>>
>> regular doc with slashes in id:
>>
>> /db/docid%2Fwith%2Fslashes
>> /db/docid%2Fwith%2Fslashes/afile
>> /db/docid%2Fwith%2Fslashes/afile/with/nested/slashes
>>
>> design doc with slashes in name:
>>
>> /db/_design/name%2Fwith%2Fslashes
>> /db/_design/name%2Fwith%2Fslashes/afile
>> /db/_design/name%2Fwith%2Fslashes/afile/with/nested/slashes
>>
>>> Special names, special paths, sometimes encoding, sometimes not. Such
>>> magic is evil because it always comes back to bite your arse.
>>>
>>
>> I think I may have this correct, i.e. non-arse-biting. But I'm posting
>> to the dev list because y'all might see what I don't.
>>
>> I plan to put this into trunk before 1.0 (I think it will be backwards
>> compatible). Comments?
>>
>> Chris
>>
>> -- 
>> Chris Anderson
>> http://jchris.mfdz.com
>
>
> I agree with everything but slashes in design doc names.

So the guidance is that users must not use document names starting  
with '_' if they want to avoid astonishment?

The other alternative is to always require the component after the db to
be 'special', i.e. document URLs could be

   /db/_/docid%2Fwith%2Fslashes/afile/with/nested/slashes

No special rules required. IMO this example makes clear the cause of  
the issue.
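
To make the contrast concrete, here is a minimal sketch of the two routing
schemes discussed in this thread. It is not CouchDB's routing code. The
function names are hypothetical, and treating both '_design' and '_local'
as two-segment ids is an assumption based on Chris's description above.

```python
from urllib.parse import unquote

# Assumption: these are the id prefixes that span two path segments.
TWO_SEGMENT_PREFIXES = ("_design", "_local")


def _decode(segments):
    """Unescape each path segment and rejoin; None if there are no segments."""
    return "/".join(unquote(s) for s in segments) or None


def split_current_rule(path):
    """Rule as Chris describes it: special-case ids that start with '_'."""
    segments = path.lstrip("/").split("/")
    db, rest = segments[0], segments[1:]
    # The special case: '_design/name' occupies two segments, not one.
    width = 2 if rest and rest[0] in TWO_SEGMENT_PREFIXES else 1
    return unquote(db), _decode(rest[:width]), _decode(rest[width:])


def split_uniform_rule(path):
    """Alternative: /db/_/<escaped docid>/<attachment...>, no special cases."""
    segments = path.lstrip("/").split("/")
    if len(segments) < 3 or segments[1] != "_":
        raise ValueError("expected /db/_/docid[/attachment...]")
    return unquote(segments[0]), unquote(segments[2]), _decode(segments[3:])


print(split_current_rule("/db/_design/name/afile/with/nested/slashes"))
print(split_uniform_rule("/db/_/docid%2Fwith%2Fslashes/afile/with/nested/slashes"))
```

Under the uniform scheme a design doc would be addressed as
/db/_/_design%2Fname, so the router never needs to inspect the id.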

> I think we probably shouldn't support design docs with slashes, and  
> maybe all other weird characters.

I think all document names should be Unicode.

> For one thing, we use the design doc name as the file name for the  
> view index file for the views. This is an issue that can prove  
> problematic on certain platforms and not others.

The file name can be escaped. There are also limitations on the length  
of the filename depending on the platform. I suggest using an escaped  
form of some initial segment of the name, concatenated with an escaped  
form of some final segment of the name, concatenated with a hash of the
full name.

If the name is less than a certain length, then just escape the full  
name.
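
A rough sketch of that mapping, under stated assumptions: the length
threshold, the number of characters kept from each end, the '~' separator
and the choice of SHA-1 are all hypothetical, and index_filename is not an
existing CouchDB function.

```python
import hashlib
from urllib.parse import quote

MAX_PLAIN = 64  # hypothetical: longest escaped name stored as-is
KEEP = 16       # hypothetical: characters kept from each end of a long name


def index_filename(name):
    """Map an arbitrary Unicode name to a filesystem-friendly string."""
    # safe="" makes quote() escape '/' as well as other reserved characters.
    escaped = quote(name, safe="")
    # Design choice in this sketch: the threshold applies to the escaped form.
    if len(escaped) <= MAX_PLAIN:
        return escaped
    head = quote(name[:KEEP], safe="")
    tail = quote(name[-KEEP:], safe="")
    # The hash of the full name keeps long names with equal ends distinct.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    return f"{head}~{tail}~{digest}"


print(index_filename("_design/name"))
print(index_filename("_design/" + "a" * 100 + "/views"))
```

The readable head and tail let an admin recognise most files from the
command line, while the hash only matters for disambiguation. The sketch
does not bound the result by encoded byte length, and it does not handle
names such as '.' or '..'; a real implementation would need both.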

Also, provide a handler that returns a json document associating  
filenames with the original name. This exposes the mapping  
implementation in a way that can be used by developers. Maybe also a
handler to map from an arbitrary string to a filename, using couch's  
mapping function. Useful for plugin/_external authors who want to use  
local files.
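
As an illustration only, the body such a handler might return could be
built like this. The function names are hypothetical, and plain URL
escaping stands in for whatever mapping function couch would actually use.

```python
import json
from urllib.parse import quote


def escape_name(name):
    """Stand-in mapping function (assumption): URL-escape the whole name."""
    return quote(name, safe="")


def filename_map_document(names, to_filename=escape_name):
    """Return a JSON document associating each filename with its original name."""
    mapping = {to_filename(name): name for name in names}
    return json.dumps(mapping, sort_keys=True, ensure_ascii=False)


print(filename_map_document(["_design/a b", "_design/reports"]))
```

Passing the mapping function in as a parameter means the same code could
serve the second handler as well, by mapping a single arbitrary string.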

IMO, limiting the names of things because of filesystem limitations is  
a bad example of abstraction leakage.

> If the design doc has weird characters that aren't supported in the  
> file system, we can't make the index file. If we hash the filename,  
> then it's impossible for an admin to figure out which files are  
> which from the command line. So maybe we should url escape the name  
> for the file system too. Or just not support weird characters at all.

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

If at first you don’t succeed, try, try again. Then quit. No use being  
a damn fool about it
   -- W.C. Fields
