incubator-jspwiki-dev mailing list archives

From Murray Altheim <murra...@altheim.com>
Subject Re: Metadata in 3.0 [Was: JSPWiki 3 design notes]
Date Tue, 05 Feb 2008 21:03:28 GMT
Janne Jalkanen wrote:
>> I don't think it will. There's a core set of fields but their names
>> should probably be abstractions. I'm trying to think through how this
>> might work without loads of problems. There are so many applications
>> for JSPWiki (in terms of how it might fit into other applications)
>> that we'll need to fit into others' metadata schemes. What I'm
>> talking about are really surface names for things.
> 
> Yes, it will.  If the provider has to figure out mapping between 
> different concepts in the database, it'll create problems.

Not if the provider is using the same identifiers as everything else,
with all this determined prior to anything firing up. Basically the
idea is that there's an abstract metadata schema, with a reference
implementation. People could add additional fields, but not remove the core.
There'd be no confusion for a given installation.
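
A minimal sketch of that "add but never remove the core" rule, in Java. Everything here is hypothetical: the class name MetadataSchema, its methods, and the choice of core fields are assumptions for illustration, not anything from JSPWiki.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch; class and method names are assumptions, not JSPWiki API.
final class MetadataSchema {
    // Assumed core fields (only these two names come up in this thread).
    private static final Set<String> CORE = Collections.unmodifiableSet(
            new LinkedHashSet<>(Arrays.asList("wiki:author", "wiki:contentType")));

    // Every schema instance starts out with the full core set.
    private final Set<String> fields = new LinkedHashSet<>(CORE);

    // An installation may add fields of its own.
    void add(String name) {
        fields.add(name);
    }

    // Extensions can be removed again; core fields cannot.
    void remove(String name) {
        if (CORE.contains(name)) {
            throw new IllegalArgumentException("core field cannot be removed: " + name);
        }
        fields.remove(name);
    }

    boolean contains(String name) {
        return fields.contains(name);
    }
}
```

With this shape, a given installation can extend the schema freely, while any attempt to drop wiki:author fails immediately.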

> This is exactly why namespaces were invented, and this is also why it 
> would probably be a better idea NOT to reuse Dublin Core, but to stick 
> to our own schema.

Oh, I agree -- I think all of the fields in the abstract or API schema
would be wiki:, such that if a given implementor (such as me) wants to
use DC, any reference to say, dc:creator is automatically mapped to
wiki:author (or whatever it is) in the backend. I'm not suggesting that
a transformation happen on the fly, as if one could change the schema
of an existing installation. This would be a configuration issue, and
I've already got a pretty good idea how to do it.

>> Well, yes, but also having the field names match a given schema. Maybe
>> some kind of transformation feature, dunno.
> 
> I think namespaces are quite enough for us.  I don't really want to code 
> for the case where someone wants to use "wiki:author" for some other 
> purpose.

They wouldn't be able to since the backend uses wiki:author. But if they
provided a mapping so that dc:creator was considered the same as wiki:author,
they could use (potentially) either.
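
One way such a configured mapping might look, sketched in Java. The class MetadataAliases and its resolve method are hypothetical names, and the map passed to the constructor stands in for whatever configuration format would actually be used. The mapping is fixed at construction time, so nothing is transformed on the fly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from JSPWiki itself.
final class MetadataAliases {
    private final Map<String, String> aliasToCore = new HashMap<>();

    // Built once from configuration, before anything else starts up.
    MetadataAliases(Map<String, String> configured) {
        for (Map.Entry<String, String> e : configured.entrySet()) {
            String alias = e.getKey();
            String core = e.getValue();
            // A wiki: name is reserved for the backend and cannot be repurposed.
            if (alias.startsWith("wiki:")) {
                throw new IllegalArgumentException("cannot redefine core name: " + alias);
            }
            // Aliases may only point at names in the wiki: namespace.
            if (!core.startsWith("wiki:")) {
                throw new IllegalArgumentException("alias target must be a wiki: name: " + core);
            }
            aliasToCore.put(alias, core);
        }
    }

    // Returns the backend name; unmapped names pass through unchanged.
    String resolve(String name) {
        return aliasToCore.getOrDefault(name, name);
    }
}
```

Given a configuration that maps dc:creator to wiki:author, both resolve("dc:creator") and resolve("wiki:author") yield wiki:author, so either surface name reaches the same backend field.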

> If people want, they *can* rewrite their own backend in such a way that 
> it converts everything into paper notes stuck onto a donkey glued to a 
> wall somewhere in Pakistan with the word "CUCKOO" written on the 
> backside - but after the JCR interface, I don't really care what 
> transformations you do.

Again, the transformations aren't dynamic, they're part of config.

>>>> Well, I also mentioned that I really doubt that I'd be using 
>>>> dc:identifier
>> for those purposes within the JSPWiki metadata profile. I can also see
>> creating a suitable ID within our own namespace, but I really think
>> dc:identifier would suit fine. We'd not be abusing it at all.
> 
> Ah yes, now I found it.  From RFC 5013:
> 
> <snip>
> [...]
> </snip>
> 
> I like atom:id much more than the dc:identifier, because
> a) [...]
> Since atom:id is a machine-processable entity, having clear, 
> machine-understandable rules as to what it really is, is very, very 
> important.  For dc:identifier, it's pretty much handwaving.

No, it's not. That's what you use an application profile to define.
DC is designed for broad interoperability across thousands of systems,
but if you want to constrain things or use particular encodings,
you do that in an application profile. The profile has to be in accord
with DC, IOW it is stricter than DC. DC has to be lenient by design.

>> Not that I'm aware of. DC doesn't get into that kind of thing much
>> except when you get to things like dates.
> 
> I would actually like to use the atom:person construct here, since it 
> has better semantics (it adds an IRI to a name, which can be useful in 
> figuring out across wikis who actually authored what).  But it might be 
> easier just to store a local identifier, in which case dc is as good 
> as any.
> 
>> It certainly suits the role of both dc:creator, editor, translator,
>> etc. (i.e., very general purpose), anyone who contributes to the
>> resource.
> 
> But again, the definition is a bit handwavy.

Again, this is the place for an application profile.

>>>> Recommendation: Use DCTERMS.format. This is the term used to contain
>>>> a format identifier.  While I recognise that these discussions tend to
>>> I would need to check if it's okay.
>>
>> That one is pretty common.
> 
> Unfortunately, it just says that the "best practice" is to use something 
> like MIME.  Now the problem is that in order to consider e.g. data 
> portability, there's no way to say that "this dcterms:format" means a 
> MIME type.  So again, a system processing the information needs to 
> resort to context-sensitive processing (e.g. "ok, so this comes from 
> jspwiki, so it's always a MIME type").    Which isn't really very good.  
> This is why I would like to have an unambiguous "wiki:contentType" 
> definition, which can also be reflected in a non-modifiable 
> pseudoproperty "dcterms:format".
> 
> E.g. "wiki:contentType contains a STRING, which denotes the MIME content 
> type of the content as defined in RFC XXXX [MIME]."
> 
> For example, if it's just defined as a String, how do you define 
> equivalence rules?  Is it okay to put in IMAGE/JPG, or ImAgE/jpG, or 
> image/jpg? If you do not know that these are MIME types, and RFC XXXX 
> defines MIME comparison as case-insensitive, then your application might 
> be functioning wrong.
> 
> This is really my gripe with Dublin Core - it leaves too much up for 
> interpretation.  Which makes it really good for people, but cumbersome 
> for computers.

Again, application profile.
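
Whichever document ends up holding the rule (a wiki: definition or an application profile), the equivalence question raised above could be pinned down along these lines. This is a Java sketch under stated assumptions: the class ContentTypes is hypothetical, type and subtype are taken to compare case-insensitively as the quoted text supposes, and parameters after a ';' are simply ignored, which is a simplification.

```java
import java.util.Locale;

// Hypothetical helper; not part of JSPWiki.
final class ContentTypes {
    private ContentTypes() {}

    // Reduces a content type to lower-case "type/subtype".
    // Assumption: anything after ';' (parameters) is dropped for comparison.
    static String normalize(String mime) {
        int semi = mime.indexOf(';');
        String base = (semi >= 0 ? mime.substring(0, semi) : mime).trim();
        // Require a non-empty type and a non-empty subtype around the '/'.
        if (base.indexOf('/') <= 0 || base.endsWith("/")) {
            throw new IllegalArgumentException("not a type/subtype value: " + mime);
        }
        return base.toLowerCase(Locale.ROOT);
    }

    // Two values are equivalent when their normalized forms match.
    static boolean sameType(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

Under these rules IMAGE/JPG, ImAgE/jpG and image/jpg all compare as equal, which is exactly the behaviour a bare string definition leaves open.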

>> It's a Big Deal for a lot of people, I probably don't care much either.
>> I use 'text/wiki' for general purpose wiki text and the application
>> one above to specifically tag JSPWiki wiki text.
> 
> I don't think you can use text/wiki - it's missing the "x-" ;-)

Oops. My fault. There should have been an "x-" in there.

To reiterate, what I'd suggest is that we define an abstract metadata
schema in our own namespace, with as tight a set of constraints and
definitions as we need to function. That's what the system uses. I
can *still* write that up as an application profile for DC, using DC
terms where they make sense and putting anything else in as extensions.
The actual namespace of the definitions would be wiki:, but it would
be dc: compatible under the covers. Then, the reference implementation
would be wiki: but permit mapping to whatever a user required for their
own systems. In my case I'd use the reference implementation for most
of my projects, but in cases where I've got to work within an existing
(e.g., library) CMS or other system, I'd configure that wiki instance
to use a dc: set of terms instead of wiki:. I'm pretty sure this is
workable.

Murray

...........................................................................
Murray Altheim <murray07 at altheim.com>                           ===  = =
http://www.altheim.com/murray/                                     = =  ===
SGML Grease Monkey, Banjo Player, Wantanabe Zen Monk               = =  = =

       Boundless wind and moon - the eye within eyes,
       Inexhaustible heaven and earth - the light beyond light,
       The willow dark, the flower bright - ten thousand houses,
       Knock at any door - there's one who will respond.
                                       -- The Blue Cliff Record
