cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [RT] Moving towards a new documentation system
Date Sat, 11 Oct 2003 15:19:38 GMT

On Saturday, Oct 11, 2003, at 16:20 Europe/Rome, Bertrand Delacretaz 
wrote:

> On Saturday, Oct 11, 2003, at 15:33 Europe/Zurich, Stefano Mazzocchi 
> wrote:
>
>>
>> On Saturday, Oct 11, 2003, at 14:58 Europe/Rome, Bertrand Delacretaz 
>> wrote:
>>> ...How about naming files like
>>>
>>>   3948494-some-descriptive-name-for-humans-here.xml
>>
>> It's like suggesting to have a BugID "39484-my-file-can't-be-found" 
>> as the primary key of the bug table in mysql, just because people 
>> might want to edit bugs by hand inside the database!!
>>
>> When you have bug emails, you get the bug ID which is unique and 
>> semanticless....
>
> You usually get the bug ID *and* the title, which lets you decide 
> whether you're interested in it or not without having to look up the 
> ID.

yeah, but my point is that we should make it hard for people to edit 
stuff in the repository directly. Accessing a WebDAV repository 
directly should be considered a side access for administration 
purposes, not a direct interface (unless there is a webdav-app in 
between, but that's another story)

just like you use the bugzilla frontend to edit the bugs, you don't do 
it by hand by editing tables because you don't know how many things 
could be triggered in the database by changing one table.

a repository is just like a database: editing a database with 
hand-written SQL is silly. Today it doesn't look silly because 
repositories are *much* less functional than a database, but when you 
have a *serious* repository (for example, one that can extract 
properties from an image and provide an RDF representation for it), 
editing it *by hand* would be silly.

In this context, having a file with a numerical name is just like 
having a node in a JSR 170 repository with a UUID... which is the basis 
for making the node linkable.

but from a higher point of view, an LO is actually identified by a 
number (or the timestamp of creation, anything unique that can last 
forever)... if you add a semantically meaningful name, this means that 
you have to rely on that name to still be semantically meaningful in 
the future... and different enough to allow thousands of documents 
without running into name collisions.

I think it's just easier to use a number.

>> ...Think about TCP/IP: instead of placing a human identifier at the 
>> IP level, they used a lookup mechanism. This is exactly the paradigm 
>> that we should follow, IMO.
>
> Agreed, provided the usability of this lookup is good enough to:
> -Easily find out what learning object a CVS (or other "change event") 
> message is about

not harder than finding out what bug report bugid 23494 is about. it 
would be enough to access the URI of the LO as a URL, thus clicking on 
the URI would yield a browsable view of the LO. I can't think of 
anything simpler than this, not even if it had a semantic name.
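
A minimal sketch of this lookup idea, under stated assumptions: the 
`Registry` class, the `http://example.org/lo/` URI prefix and the 
sequential integer IDs are all hypothetical, standing in for whatever 
the real repository would provide. It only illustrates that the ID (and 
so the URI) stays fixed while the title is looked up and free to change.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class LearningObject:
    loid: int    # opaque identifier, never changes
    title: str   # human-readable, free to change


class Registry:
    """Hypothetical in-memory stand-in for the repository's lookup service."""

    BASE_URI = "http://example.org/lo/"  # assumed URI prefix

    def __init__(self):
        self._next_id = count(1)
        self._objects = {}

    def create(self, title):
        # The system hands out the ID; the author only supplies the title.
        lo = LearningObject(next(self._next_id), title)
        self._objects[lo.loid] = lo
        return lo

    def uri(self, loid):
        # The URI is built from the ID alone, so it survives renames.
        return f"{self.BASE_URI}{loid}"

    def resolve(self, uri):
        # "Clicking on the URI": map it back to the learning object.
        loid = int(uri.rsplit("/", 1)[1])
        return self._objects[loid]

    def rename(self, loid, new_title):
        self._objects[loid].title = new_title
```

Two objects may share a title without colliding, and a link taken 
before a rename still resolves afterwards, because nothing in the URI 
depends on the title.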

> -Easily select a learning object for editing, review, etc, without 
> needing complex tools

a "search by LOID" should be enough.

> Also, the comparison with TCP/IP brings another idea, instead of using 
> big numbers for IDs they could be split like IP addresses to make them 
> more readable.
>
> Some people have a hard time reading long chains of numbers, I am one 
> of these and for me
>
>   2003.332.221
>
> is much easier to read (and to spell out) than
>
>   2003332221

do you seriously think we'll get to 2 billion learning objects? that's 
wishful thinking ;-)

> Where I have a hard time figuring out the number of 3's and 2's 
> (you'll see when you are my age ;-)

I have no problem with whatever ID scheme we use, even something 
similar to ISBN or UUID or the dotted-number format of IP addresses, 
anything is good as long as it doesn't mix the concerns of 
identification and titling.
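
One way to keep those concerns apart is to treat the dots as pure 
presentation. The sketch below assumes the ID is stored as a plain 
integer and that the display form is a four-digit leading group 
followed by groups of three, matching the example quoted above; the 
function names and the grouping are illustrative assumptions, not an 
agreed scheme.

```python
def format_loid(loid: int, head: int = 4, group: int = 3) -> str:
    """Render an integer ID in a dotted, easier-to-read form."""
    digits = str(loid)
    head_part, rest = digits[:head], digits[head:]
    # Split whatever follows the leading group into fixed-size chunks.
    chunks = [rest[i:i + group] for i in range(0, len(rest), group)]
    return ".".join([head_part] + chunks)


def parse_loid(text: str) -> int:
    """Accept either the dotted or the plain form and return the integer ID."""
    return int(text.replace(".", ""))
```

With these, `format_loid(2003332221)` gives `2003.332.221` and 
`parse_loid` maps both spellings back to the same integer, so the 
readable form never has to be what the repository stores.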

> I'm using 2003 in front as starting with the year in which the ID was 
> assigned gives some useful context. Mixing concerns, I know, but also 
> makes for an easy way of splitting LOs in subdirectories for storage, 
> to avoid having millions of files in a single directory.

nah, this is not a problem for future repositories like JSR170, don't 
worry.

please people, *STOP* thinking of those things as files, and of a 
repository (CVS, WebDAV, whatever) as a thin layer on top of a file 
system... these are just implementation details.

>> ...Messy. How would something like this behave?
>>
>>  22003-this-is-first-doc.xml
>>  22003-this-is-second-doc.xml
>> ...
>
> that's what I meant by the system having to ensure the uniqueness of 
> IDs. It is certainly problematic.

yep
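
To make the problem concrete, here is a small sketch of the check a 
system would need if names carried both an ID and a description. It 
assumes the hypothetical `<id>-<description>.xml` naming from the 
example above and simply reports IDs claimed by more than one name.

```python
from collections import defaultdict


def find_id_collisions(filenames):
    """Return {id: [names...]} for every ID used by more than one name."""
    by_id = defaultdict(list)
    for name in filenames:
        # Everything before the first "-" is taken to be the ID.
        loid, _, _ = name.partition("-")
        by_id[loid].append(name)
    return {loid: names for loid, names in by_id.items() if len(names) > 1}
```

With pure IDs as names this check is unnecessary, since two objects 
with the same ID would be the same name.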

> I agree that a pure ID for naming pieces of content might be better, 
> provided lookup is super-easy and doesn't get in the way of editing, 
> keeping track of changes etc., and the IDs stay readable and 
> "communicable".

+1 to these goals.

--
Stefano.

