forrest-dev mailing list archives

From Nicola Ken Barozzi <nicola...@apache.org>
Subject Re: [RT] Per document skinconf (was Re: coloring table cells [from the user list])
Date Fri, 25 Feb 2005 10:33:42 GMT
Ross Gardler wrote:
> Nicola Ken Barozzi wrote:
...
>> Why not insert the metadata in the file itself like I proposed? I 
>> prefer to keep faith with the one file -> one output rule.
> 
> I guess our different views here are because of different use cases. You 
> seem to be assuming that you only ever want the meta-data and the 
> content together. That is not the case for me.
> 
> Meta-data is often processed independently of actual data. For example, 
> a meta-data harvester is not interested in content. Of course, this 
> would still be possible if it were all in the same file, but performance 
> would suffer.

So it's a technical and not a design issue. I tend to optimize later 
(and sometimes get bitten ;-)

> In some use cases this performance bottleneck will become an issue. For 
> example, I have Learning Objects that consist of over 600 pages, each 
> with an average of around 4000 characters. Each page has an additional 
> 1000 characters of meta data (not including XML elements). The nature of 
> XML is that you need to process the whole file even if you only want one 
> element. This means we are processing 5000 characters of data instead of 
> 1000. When harvesting 600 pages, that is 2,400,000 characters of data 
> that we don't want (and we are still ignoring the XML elements).

Well, that's not true: a streaming parser can stop processing at any 
time. This is how we get the doctype.
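
To illustrate the early stop, here is a minimal sketch in Python (not Forrest or Cocoon code). It assumes a hypothetical layout where a <meta> block sits at the top of the document; the element names and the helper read_meta are made up for the example. The handler raises an exception as soon as </meta> is seen, so the body is never handed to it.

```python
import xml.sax


class StopParsing(Exception):
    """Raised by the handler to abandon the parse early."""


class MetaHandler(xml.sax.ContentHandler):
    """Collects the children of a hypothetical <meta> element, then stops."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.elements_seen = 0   # how many start tags reached the handler
        self._in_meta = False
        self._name = None
        self._text = []

    def startElement(self, name, attrs):
        self.elements_seen += 1
        if name == "meta":
            self._in_meta = True
        elif self._in_meta:
            self._name = name
            self._text = []

    def characters(self, content):
        if self._name is not None:
            self._text.append(content)

    def endElement(self, name):
        if name == "meta":
            # Everything we want has been read: stop here.
            raise StopParsing()
        if self._in_meta and name == self._name:
            self.meta[name] = "".join(self._text).strip()
            self._name = None


def read_meta(xml_text):
    """Parse only as far as the closing </meta> tag."""
    handler = MetaHandler()
    try:
        xml.sax.parseString(xml_text.encode("utf-8"), handler)
    except StopParsing:
        pass
    return handler
```

With a document such as <document><meta>...</meta><body>...</body></document>, elements_seen only counts the tags up to and including the children of <meta>. The parser may still have read the bytes into its buffer, so this saves processing rather than I/O.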

> Furthermore, meta-data tends to be highly structured and would therefore 
> benefit from being stored in a relational database rather than an XML 
> one. 

Hmmm, this is an interesting point.

My use case for metadata is title, author, etc. You have a much more 
complex use case. I now start to understand more of your POV.
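
To make the relational idea concrete for myself, a rough sketch in Python with the standard sqlite3 module. The table layout (page, name, value) and the function names are assumptions for illustration only, not anything Forrest provides.

```python
import sqlite3


def build_index(records):
    """Load (page, name, value) triples into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metadata (page TEXT, name TEXT, value TEXT)")
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", records)
    return conn


def pages_matching(conn, name, value):
    """Return the pages whose metadata field `name` equals `value`."""
    rows = conn.execute(
        "SELECT page FROM metadata WHERE name = ? AND value = ? ORDER BY page",
        (name, value),
    )
    return [row[0] for row in rows]
```

A query like pages_matching(conn, "author", "Ross") then never touches the content files at all.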

> Meta-data is often the subject of complex queries and the speed of 
> a relational database is useful here. If the meta-data is stored inside 
> the content file then this can only be done by duplicating the data 
> across two locations. I don't want to do that, I only want to store it 
> once. This would force the meta-data to be separate from the content.
> 
> Further still, it is also common, in some use cases (e.g. editing 
> Learning Objects), for a meta-data editor to be working on the meta-data 
> at the same time as the content author. Having the files separate is 
> useful in the absence of Version Control that does not require technical 
> knowledge (most of my users can barely use an email client).

I understand now.

> All that being said, Forrest could be made to support both a separate 
> file or embedded data (there are use cases where the simplistic solution 
> is the best one). The problem with this is that we will have two 
> locations for storing the same data - could be confusing for users.

IMHO confusing users is the least of our worries. I have seen that 
when there is a simple way and a more complete way, users do not get 
confused.

The only confusion would come out of using both methods at the same 
time, with clashing metadata values. Ouch!

I'll browse the web for 'rdf in html', 'xhtml metadata' etc. to see how 
this is defined elsewhere. I want to reinvent the wheel as little as 
possible.

>> Also, what is the relationship between skinconf.xml, 
>> pdf-output.conf.xml and metadata.xml?
> 
> The files in FORREST-INF are the defaults. So skinconf.xml contains the 
> site wide defaults for presentation elements that are core to Forrest. 
> pdf-output.conf.xml contains the defaults for PDF config (button on or 
> off, page size etc.). metadata.xml contains site wide meta data 
> (generated-by, site title etc.).

Why are skinconf.xml and pdf-output.conf.xml separate?

> As with your RT you can then have versions of these files in subsites 
> that will override these defaults within the scope of the subsite.

Yup.
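
As a rough sketch of how I picture the override order (Python, with plain dicts standing in for the XML files; the keys, paths and the function effective_config are hypothetical): start from the site-wide defaults, then apply each enclosing subsite from the outermost to the innermost, so the closest one wins.

```python
from pathlib import PurePosixPath


def effective_config(doc_path, defaults, overrides):
    """Merge site defaults with the overrides of every enclosing subsite.

    `overrides` maps a subsite directory (e.g. "manual/print") to the
    settings it redefines. Deeper directories are applied last.
    """
    config = dict(defaults)
    parent = PurePosixPath(doc_path).parent
    # Deepest directory first, site root (".") last.
    scopes = [parent, *parent.parents]
    for scope in reversed(scopes):
        config.update(overrides.get(str(scope), {}))
    return config
```

So a document in manual/print/ picks up the settings of manual/ and then those of manual/print/, while a document at the site root only sees the defaults.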

It seems we are in violent agreement on the concept; we just need to 
iron out some smaller details.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)