forrest-dev mailing list archives

From Ross Gardler <rgard...@apache.org>
Subject Re: [RT] Per document skinconf (was Re: coloring table cells [from the user list])
Date Thu, 24 Feb 2005 11:04:34 GMT
Nicola Ken Barozzi wrote:
> Ross Gardler wrote:
> 
>> Nicola Ken Barozzi wrote:
>>
>>> Ross Gardler wrote:
>>> ...
>>>

> 
>> Handling Meta Data and Presentation Data
>> ----------------------------------------
>>
>> Meta-data is generated on a per file basis, with only a minority of 
>> information coming from a central location. On the other hand, 
>> presentation data has the majority of information coming from a 
>> central location with only a minority being defined on a per file 
>> granularity.
> 
> 
> Not necessarily. If a Forrest provider makes a site for multiple people, 
> each would want its own presentation metadata...

Exactly, if I reuse your content I will want your meta-data but not your 
presentation data. Therefore we need to keep it separate.

> 
>> This means that to be "able to define metadata nearer to the file that 
>> needs it" we need to store the different types of data in different 
>> locations. Meta-data needs to be closer to the file with an 
>> opportunity to centrally define defaults (as per your RT) and 
>> presentation data needs to be centralised with an opportunity to 
>> override locally to the file (as per my RT).
> 
> 
> Again, since there are both local and central locations, where is the 
> divide?

You are absolutely correct. I spotted this mistake after posting, but 
thought I'd wait for your comments. Although we need two files (see 
above) there need only be one access mechanism within Forrest.

>> Directory Structure
>> ===================
>>
>> How does this affect the proposed directory structure in your RT?
>>
>> You propose:
>>
>> my-project/
>>          forrest.xml
>>          documentation/
>>               status.xml
>>                ** all other content
>>               FORREST-INF/
>>                 skinconf.xml
>>                 site.xml
>>                 tabs.xml
>>
>> This would become (arrows indicate changed/added lines):
>>
> (inserted correct version)
> ...
> 
>> my-project/
>>         forrest.xml
>>         documentation/
>>              status.xml
>> -->          content.xml
>> -->          content.meta.xml
>>               ** all other content
>>              FORREST-INF/
>>                skinconf.xml
>> -->            pdf-output.conf.xml
>>                site.xml
>>                tabs.xml
>> -->            metadata.xml
>> Note that with the introduction of a meta-data plugin all the 
>> meta-data files would be optional since the plugin would provide 
>> defaults. Similarly the plugin conf.xml files are optional. So 
>> although this looks like adding many new files, they are only present 
>> if actually needed by a user.
> 
> 
> I don't understand what the files you add contain.

Sorry, I'll try to explain.

> I suppose that content.xml is a normal xml file, and content.meta.xml 
> its metadata.

Yes

> Why not insert the metadata in the file itself like I 
> proposed? I prefer to keep faith to the 1 file -> one output rule.

I guess our different views here are because of different use cases. You 
seem to be assuming that you only ever want the meta-data and the 
content together. That is not the case for me.

Meta-data is often processed independently of actual data. For example, 
a meta-data harvester is not interested in content. Of course, this 
would still be possible if it were all in the same file, but performance 
would suffer.

In some use cases this performance bottleneck will become an issue. For 
example, I have Learning Objects that consist of over 600 pages, each 
with an average of around 4000 characters. Each page has an additional 
1000 characters of meta-data (not including XML elements). The nature of 
XML is that you need to process the whole file even if you only want one 
element. This means we are processing 5000 characters of data per page 
instead of 1000. When harvesting 600 pages, that is 2,400,000 characters 
of data that we don't want (and we are still ignoring the XML elements).
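
As a rough illustration of that arithmetic, here is a small Python 
sketch. The function name and dictionary keys are made up for this 
example; the figures are the ones quoted above (600 pages, roughly 4000 
characters of content and 1000 of meta-data per page), and XML markup is 
still ignored.

```python
def harvest_cost(pages, content_chars, meta_chars):
    """Compare characters read when meta-data is embedded vs. stored separately."""
    # Embedded: the whole file (content + meta-data) has to be parsed.
    embedded = pages * (content_chars + meta_chars)
    # Separate: only the meta-data file is read.
    separate = pages * meta_chars
    return {
        "embedded": embedded,
        "separate": separate,
        "unwanted": embedded - separate,
    }


# Hypothetical harvest run using the figures from this message.
cost = harvest_cost(pages=600, content_chars=4000, meta_chars=1000)
print(cost)
```

With these inputs the "unwanted" figure is simply the content volume 
(600 x 4000 characters), which a harvester working on separate meta-data 
files would never have to touch.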

Furthermore, meta-data tends to be highly structured and would therefore 
benefit from being stored in a relational database rather than an XML 
one. Meta-data is often the subject of complex queries and the speed of 
a relational database is useful here. If the meta-data is stored inside 
the content file then this can only be done by duplicating the data 
across two locations. I don't want to do that; I only want to store it 
once. This would force the meta-data to be separate from the content.
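
To make the relational idea a little more concrete, here is a minimal 
sketch using Python's built-in sqlite3 module. The table layout, the 
field names and the sample rows are all invented for illustration; 
nothing like this exists in Forrest today.

```python
import sqlite3

# Hypothetical schema: one row per (document, field, value) triple.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (doc TEXT, field TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO meta VALUES (?, ?, ?)",
    [
        ("intro.xml", "author", "A. Author"),
        ("intro.xml", "keyword", "forrest"),
        ("setup.xml", "author", "B. Writer"),
        ("setup.xml", "keyword", "forrest"),
        ("setup.xml", "keyword", "install"),
    ],
)


def docs_with(conn, field, value):
    """Return the documents whose meta-data has the given field/value pair."""
    rows = conn.execute(
        "SELECT DISTINCT doc FROM meta WHERE field = ? AND value = ? ORDER BY doc",
        (field, value),
    )
    return [row[0] for row in rows]


forrest_docs = docs_with(conn, "keyword", "forrest")
install_docs = docs_with(conn, "keyword", "install")
print(forrest_docs, install_docs)
```

The point is only that such a query never opens a content file; whether 
a triple table or a column per field is the better layout is a separate 
question.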

Further still, it is also common in some use cases (e.g. editing 
Learning Objects) for a meta-data editor to be working on the meta-data 
at the same time as the content author. Having the files separate is 
useful when there is no version control system that can be used without 
technical knowledge (most of my users can barely use an email client).

All that being said, Forrest could be made to support either a separate 
file or embedded data (there are use cases where the simplistic solution 
is the best one). The problem with this is that we will have two 
locations for storing the same data, which could be confusing for users.

> Also, what is the relationship between skinconf.xml, pdf-output.conf.xml 
> and metadata.xml?

The files in FORREST-INF are the defaults. So skinconf.xml contains the 
site-wide defaults for presentation elements that are core to Forrest. 
pdf-output.conf.xml contains the defaults for PDF config (button on or 
off, page size etc.). metadata.xml contains site-wide meta-data 
(generated-by, site title etc.).

As with your RT you can then have versions of these files in subsites 
that will override these defaults within the scope of the subsite.
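
A minimal sketch of how such a lookup might behave, in Python. This is 
not how Forrest resolves configuration; the function, the setting names 
("pdf-button", "page-size") and the directory names are assumptions made 
up for the example. Settings are merged from the site root down to the 
document's directory, so the nearest definition wins.

```python
import posixpath


def resolve(defaults_by_dir, doc_path):
    """Merge settings from the site root down to the document's directory."""
    directory = posixpath.dirname(doc_path)
    # Build the chain "", "a", "a/b", ... so nearer directories are applied last.
    chain = [""]
    current = ""
    for part in [p for p in directory.split("/") if p]:
        current = posixpath.join(current, part)
        chain.append(current)
    merged = {}
    for d in chain:
        merged.update(defaults_by_dir.get(d, {}))
    return merged


# Hypothetical settings: "" stands for the site-wide FORREST-INF defaults.
settings = {
    "": {"pdf-button": "on", "page-size": "A4"},
    "subsite": {"pdf-button": "off"},
}
root_doc = resolve(settings, "index.xml")
sub_doc = resolve(settings, "subsite/guide/intro.xml")
print(root_doc, sub_doc)
```

Here a document under "subsite" sees the overridden PDF button setting 
but still inherits the site-wide page size.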

Ross
> 
> TIA
> 

