Mailing-List: contact dev-help@cocoon.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cocoon.apache.org
Date: Sat, 11 Oct 2003 17:19:38 +0200
Subject: Re: [RT] Moving towards a new documentation system
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Mime-Version: 1.0 (Apple Message framework v552)
From: Stefano Mazzocchi <stefano@apache.org>
To: dev@cocoon.apache.org
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <159AFE5C-FBF6-11D7-AD04-000393CFE402@apache.org>
Message-Id: <54483F20-FBFE-11D7-8572-000393D2CB02@apache.org>


On Saturday, Oct 11, 2003, at 16:20 Europe/Rome, Bertrand Delacretaz
wrote:

> On Saturday, Oct 11, 2003, at 15:33 Europe/Zurich, Stefano Mazzocchi
> wrote:
>
>>
>> On Saturday, Oct 11, 2003, at 14:58 Europe/Rome, Bertrand Delacretaz
>> wrote:
>>> ...How about naming files like
>>>
>>>   3948494-some-descriptive-name-for-humans-here.xml
>>
>> It's like suggesting to have a BugID "39484-my-file-can't-be-found"
>> as the primary key of the bug table in mysql, just because people
>> might want to edit bugs by hand inside the database!!
>>
>> When you have bug emails, you get the bug ID which is unique and
>> semanticless....
>
> You usually get the bug ID *and* the title, which lets you decide
> whether you're interested in it or not without having to look up the
> ID.

yeah, but my point is that we should make it hard for people to edit
stuff in the repository directly. Accessing a WebDAV repository
directly should be considered a side access for administration
purposes, not a direct interface (unless there is a webdav-app in
between, but that's another story)

just like you use the bugzilla frontend to edit the bugs, you don't do
it by hand by editing tables because you don't know how many things
could be triggered in the database by changing one table.

a repository is just like a database: editing a database with direct SQL
statements is silly. Today it doesn't look silly because repositories
are *much* less functional than a database, but when you have a
*serious* repository (for example, one that can extract properties from
an image and provide an RDF representation for it), editing it *by
hand* would be silly.

In this context, having a file with a numerical name is just like
having a node in a JSR 170 repository with a UUID... which is
the basis for having the node linkable.

but from a higher point of view, an LO is actually identified by a
number (or the timestamp of creation, anything that is unique and can
last forever)... if you add a semantically meaningful name, this means
that you have to rely on that name to still be semantically meaningful
in the future... and different enough to allow thousands of
documents without running into name collisions.

I think it's just easier to use a number.
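
A minimal sketch of the separation argued for here, assuming a
hypothetical in-memory Registry standing in for the repository's lookup
layer (the class and method names are illustrative, not part of any
existing system). The ID is an opaque number that never changes; the
title is ordinary metadata that can change or collide freely.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class LearningObject:
    loid: int   # opaque identifier: assigned once, carries no meaning
    title: str  # human-readable metadata: free to change or collide


class Registry:
    """Hypothetical in-memory stand-in for the repository's lookup layer."""

    def __init__(self):
        self._next_id = count(1)
        self._by_id = {}

    def create(self, title):
        # The system, not the author, picks the identifier.
        lo = LearningObject(next(self._next_id), title)
        self._by_id[lo.loid] = lo
        return lo

    def lookup(self, loid):
        # "Search by LOID": a plain key lookup.
        return self._by_id[loid]

    def retitle(self, loid, new_title):
        # Changing the title never touches the identifier.
        self._by_id[loid].title = new_title

    def search(self, text):
        # Title search is a convenience layered on top of the IDs.
        needle = text.lower()
        return [lo for lo in self._by_id.values() if needle in lo.title.lower()]


registry = Registry()
a = registry.create("Getting started")
b = registry.create("Getting started")    # same title, no collision
registry.retitle(a.loid, "Installation")  # links to a.loid stay valid
```

Anything that links to a.loid keeps working after the retitle, which is
the property a name-based key cannot guarantee.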

>> ...Think about TCP/IP: instead of placing a human identifier at the
>> IP level, they used a lookup mechanism. This is exactly the paradigm
>> that we should follow, IMO.
>
> Agreed, provided the usability of this lookup is good enough to:
> -Easily find out what learning object a CVS (or other "change event")
> message is about

not harder than finding out what bug report bugid 23494 is about. it
would be enough to access the URI of the LO as a URL, thus clicking on
the URI would yield a browsable view of the LO. I can't think of
anything simpler than this, not even if it had a semantic name.

> -Easily select a learning object for editing, review, etc, without
> needing complex tools

a "search by LOID" should be enough.

> Also, the comparison with TCP/IP brings another idea, instead of using
> big numbers for IDs they could be split like IP addresses to make them
> more readable.
>
> Some people have a hard time reading long chains of numbers, I am one
> of these and for me
>
>   2003.332.221
>
> is much easier to read (and to spell out) than
>
>   2003332221

do you seriously think we'll get to 2 billion learning objects? that's
wishful thinking ;-)

> Where I have a hard time figuring out the number of 3's and 2's
> (you'll see when you are my age ;-)

I have no problems with whatever ID scheme we use, even something similar
to ISBN or UUID or the three-numbers-dot format of IP addresses, anything
is good as long as it doesn't overlap concerns about identification and
titling.
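
One way to get readable IDs without mixing concerns is to keep the
stored ID a plain integer and apply the dotted grouping only when
displaying it. The sketch below assumes non-negative integer IDs and
groups of three digits counted from the right; format_loid and
parse_loid are hypothetical helper names.

```python
def format_loid(loid, group=3, sep="."):
    """Render a numeric ID in separator-delimited groups, counted from the right."""
    digits = str(loid)
    head = len(digits) % group
    # A short leading group holds whatever digits do not fill a full group.
    parts = [digits[:head]] if head else []
    parts += [digits[i:i + group] for i in range(head, len(digits), group)]
    return sep.join(parts)


def parse_loid(text, sep="."):
    """Recover the stored integer from any grouped display form."""
    return int(text.replace(sep, ""))
```

Because parsing simply drops the separators, the year-first form
2003.332.221 quoted above and the right-grouped form produced by
format_loid both resolve to the same stored number.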

> I'm using 2003 in front as starting with the year in which the ID was
> assigned gives some useful context. Mixing concerns, I know, but also
> makes for an easy way of splitting LOs in subdirectories for storage,
> to avoid having millions of files in a single directory.

nah, this is not a problem for future repositories like JSR170, don't
worry.

please people, *STOP* thinking of those things as files and of a
repository (CVS, WebDAV, whatever) as a thin layer on top of a file
system... these are just implementation details.

>> ...Messy. how would something like this behave?
>>
>>  22003-this-is-first-doc.xml
>>  22003-this-is-second-doc.xml
>> ...
>
> that's what I meant by the system having to ensure the uniqueness of
> IDs. It is certainly problematic.

yep
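
To make the problem concrete: with "ID-plus-name" filenames, nothing in
the naming scheme itself stops two files from sharing an ID, so
uniqueness would have to be checked separately. A rough sketch of such a
check, assuming filenames of the form <digits>-<name>
(find_id_collisions is a hypothetical helper):

```python
import re
from collections import defaultdict

# Leading run of digits followed by a dash, e.g. "22003-".
ID_PREFIX = re.compile(r"^(\d+)-")


def find_id_collisions(filenames):
    """Return {id: [filenames]} for every ID prefix used by more than one file."""
    by_id = defaultdict(list)
    for name in filenames:
        match = ID_PREFIX.match(name)
        if match:
            by_id[match.group(1)].append(name)
    return {loid: names for loid, names in by_id.items() if len(names) > 1}


collisions = find_id_collisions([
    "22003-this-is-first-doc.xml",
    "22003-this-is-second-doc.xml",
    "3948494-some-descriptive-name-for-humans-here.xml",
])
```

With a pure numeric name the check is unnecessary, since two objects
with the same ID would simply be the same name.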

> I agree that a pure ID for naming pieces of content might be better,
> provided lookup is super-easy and doesn't get in the way of editing,
> keeping track of changes etc., and the IDs stay readable and
> "communicable".

+1 to these goals.

--
Stefano.