incubator-odf-dev mailing list archives

From Svante Schubert <svante.schub...@gmail.com>
Subject Re: [GSoC 2012] Add ODF 1.2 RDF metadata support to ODF Toolkit (Tao Lin)
Date Sun, 18 Mar 2012 15:15:07 GMT
Hello Tao Lin,

Very pleased to meet you. You have done impressive research and raised
good questions.

Please find my answers below.

On 18.03.2012 12:02, Tao Lin wrote:
> Dear Sir/Madam,
>
> My name is Tao Lin, a third year undergraduate student from China. I'm
> very interested in GSoC 2012 project: Add ODF 1.2 RDF metadata support
> to ODF Toolkit. I have good knowledge of semantic technologies, such
> as RDF, OWL, SPARQL. I'm also familiar with the mainstream Java based
> RDF/OWL processing tools like Jena, Sesame, AllegroGraph. I have
> strong Java coding skills with good knowledge of the software
> design patterns. Last year, I was accepted by GSoC 2011 and
> successfully completed a project for LanguageTool [6]. This summer,
> I'd like to contribute to ODF community in this "RDF metadata support"
> project, because I find my abilities match the project requirements
> very well.
>
> I just studied the provided documents [1] [2], and the OWL file [5]. I
> also found some slides [3] and a document [4] demonstrating some
> examples. However, not all of the documents are up-to-date: [4] was
> composed in 2007, and [3] was published last year. I can understand
> most of the specification, but I'm quite confused by some parts
> because of the inconsistency among the documents. Could you help me
> with the following questions?
In 2007 the specification was still changing (under construction),
hence the differences and the confusion. I should have stated this more
clearly in my presentation, which you referenced as [3].
A key event happened later, in October 2008, when I gave a presentation
to the W3C Semantic Interest Group at their TPAC
(http://www.w3.org/2008/10/TPAC/) so that they could review the metadata
work.
A major change followed. Before that, the manifest.rdf contained a
mapping between ODF content (by its ID) and a URN that was assigned to
it for the RDF graph.
That mapping had been initiated by an RDFa expert in the OASIS
sub-committee, who argued that identification (identifying) is not the
same as localization (finding).
The W3C group, especially Sir Tim Berners-Lee, gave me the feedback that
this is wrong: the URN would have been an ill invention, and
identification and localization should be treated as the same, otherwise
the Internet would not have worked. Since then, we refer directly with
relative URLs from the manifest.rdf to metadata in the content/package.
Sorry for the confusion.
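
To make the relative-URL approach concrete, here is a minimal sketch in
Python (standard library only, this is not the ODF Toolkit API). The
sample manifest.rdf is an assumption modelled on what ODF 1.2 editors
typically write, and the function names read_manifest_rdf and
split_target are hypothetical. It only handles the simple
rdf:Description form shown here, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed sample: subjects and objects are relative URLs into the package,
# with no URN indirection in between.
SAMPLE_MANIFEST_RDF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:pkg="http://docs.oasis-open.org/ns/office/1.2/meta/pkg#">
  <rdf:Description rdf:about="">
    <rdf:type rdf:resource="http://docs.oasis-open.org/ns/office/1.2/meta/pkg#Document"/>
    <pkg:hasPart rdf:resource="content.xml"/>
  </rdf:Description>
  <rdf:Description rdf:about="content.xml">
    <rdf:type rdf:resource="http://docs.oasis-open.org/ns/office/1.2/meta/odf#ContentFile"/>
  </rdf:Description>
</rdf:RDF>"""


def read_manifest_rdf(xml_text):
    """Return (subject, predicate, object) triples for the resource-valued
    properties of top-level rdf:Description elements."""
    triples = []
    root = ET.fromstring(xml_text)
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about", "")
        for prop in desc:
            obj = prop.get(f"{{{RDF}}}resource")
            if obj is not None:
                # ElementTree tags look like "{namespace}local"; join them
                # back into a single predicate IRI.
                predicate = prop.tag[1:].replace("}", "")
                triples.append((subject, predicate, obj))
    return triples


def split_target(relative_url):
    """Split a relative URL such as 'content.xml#id1' into the file inside
    the package and the xml:id inside that file."""
    result = urldefrag(relative_url)
    return result.url, result.fragment
```

The point of the sketch is that the same relative URL both identifies
the resource in the graph and tells you where to find it in the package.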

>
> (1) As shown in [2], RDF Metadata are of two types:
> 4.2.1 In Content Metadata (RDFa)
> 4.2.2 manifest.rdf
> Are both of them within the scope of this GSoC project? Or just the second one?
Both (or, more precisely, all possible metadata), but generic handling
should be possible. For instance, RDFa would be accessed via the ODF
Toolkit API. This is likely to be added to ODFDOM, perhaps even accessed
through generated functionality (more about generation later).
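
As a rough illustration of what reading in-content metadata involves,
here is a sketch in Python (standard library only, not the ODF Toolkit
API). The sample content.xml fragment, the "#report" subject and the
function name find_rdfa_statements are assumptions for illustration.
CURIE expansion (dc:title to a full IRI) is deliberately left out.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
XML = "http://www.w3.org/XML/1998/namespace"

# Assumed sample: one paragraph carries RDFa attributes, one does not.
SAMPLE_CONTENT = """<office:document-content
    xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <office:body><office:text>
    <text:p xml:id="p1" xhtml:about="#report" xhtml:property="dc:title">Annual Report</text:p>
    <text:p>No metadata here.</text:p>
  </office:text></office:body>
</office:document-content>"""


def find_rdfa_statements(xml_text):
    """Return (xml_id, subject, predicate, literal) for every element that
    carries an xhtml:property attribute."""
    statements = []
    root = ET.fromstring(xml_text)
    for element in root.iter():
        predicate = element.get(f"{{{XHTML}}}property")
        if predicate is None:
            continue
        subject = element.get(f"{{{XHTML}}}about", "")
        # An explicit xhtml:content attribute wins over the element's text.
        literal = element.get(f"{{{XHTML}}}content")
        if literal is None:
            literal = "".join(element.itertext())
        statements.append((element.get(f"{{{XML}}}id"), subject, predicate, literal))
    return statements
```

Because the function walks every element instead of a fixed list of
names, it stays generic in the sense described above.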
>
> (2) On page 12 of [3], is the old OWL class "pkg:Package" replaced by
> "pkg:Document"? I cannot find "pkg:Package" in [1], [2] or [5].
The spec - as usual - is correct. The OWL class pkg:Package was
dropped for now to avoid boilerplate.
Some basic information on the ODF 1.2 specification:
Part 3 of the spec is the package format and might be reused by other
formats (unfortunately EPUB did not reuse the package standard in
their latest spec and reinvented the wheel).
Part 1 of the spec is the ODF XML format and is based on part 3 (yes -
the numbering is sometimes confusing; I would usually expect "1" to be
the ground layer, but the numbers were obviously chosen by different
criteria, and in the end it is not important).

From the package view, a document is nothing more than a directory with
a mime type in the package - identified by information in the
/META-INF/manifest.xml file.
There is always a root document in an ODF package, but there might be
sub documents as well - like a chart document embedded within a
spreadsheet, see
http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#General
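
The package view can be sketched in a few lines of Python (standard
library only, not the ODF Toolkit API). The sample manifest.xml and the
function name list_documents are assumptions for illustration. The rule
"a path ending in '/' with a non-empty media type is a document" is my
simplification of the description above.

```python
import xml.etree.ElementTree as ET

MANIFEST = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"

# Assumed sample: a spreadsheet as root document with one embedded chart.
SAMPLE_MANIFEST_XML = """<manifest:manifest
    xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"
    manifest:version="1.2">
  <manifest:file-entry manifest:full-path="/"
      manifest:media-type="application/vnd.oasis.opendocument.spreadsheet"/>
  <manifest:file-entry manifest:full-path="content.xml"
      manifest:media-type="text/xml"/>
  <manifest:file-entry manifest:full-path="Object 1/"
      manifest:media-type="application/vnd.oasis.opendocument.chart"/>
  <manifest:file-entry manifest:full-path="Object 1/content.xml"
      manifest:media-type="text/xml"/>
</manifest:manifest>"""


def list_documents(manifest_xml):
    """Map each document directory in the package to its media type.
    "/" is the root document; other directories are sub documents."""
    documents = {}
    root = ET.fromstring(manifest_xml)
    for entry in root.iter(f"{{{MANIFEST}}}file-entry"):
        path = entry.get(f"{{{MANIFEST}}}full-path", "")
        media_type = entry.get(f"{{{MANIFEST}}}media-type", "")
        # Simplified rule: a directory entry with a media type is a document.
        if path.endswith("/") and media_type:
            documents[path] = media_type
    return documents
```

For a real package you would first read META-INF/manifest.xml out of the
ZIP, for example with the zipfile module, and pass its text to this
function.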

Therefore, to answer a question asked later: the difference between the
pkg: and odf: prefixes is the layer. A pkg:file can be any arbitrary file
within an ODF package, while an odf:file can only be a file defined in
part 1 (e.g. content.xml, styles.xml, etc.). BTW you forgot one reference
to an OWL (the one of part 3); I have added it as [5a].
>
> (3) On page 15 of [3], it uses "pkg:idref". But on page 7 of [4], it
> shows "odf:idref". I cannot find the definition in [1], [2], or [5].
> Which one is correct?
Neither. This was one of the things corrected after the W3C comments
(see the introductory text at the beginning).
>
> (4) For "In Content Metadata", besides the 5 supported elements shown
> on page 16 of [3], the additional 6th one is
> "<table:covered-table-cell>" according to [2]. Is that true?
The spec is like a blueprint and overrules everything. Especially as the
spec is from 2011, it overrules my presentation from 2007.
Nevertheless, I am happy if you question the spec as well, because
even the spec is created by humans and errors are still possible.

ODF Toolkit allows the generation of sources by relying on the (Sun)
Multi-Schema Validator to parse the XML and on the Apache Velocity
template engine for text templates that have access to a Java context,
see
http://svn.apache.org/viewvc/incubator/odf/trunk/generator/
Either generate the latest JavaDoc, or download an older one from
https://oss.sonatype.org/content/groups/public/org/odftoolkit/schema2template/0.8.7/schema2template-0.8.7-javadoc.jar
and call "jar -xvf schema2template-0.8.7-javadoc.jar"; the index.html
will then give you a good, detailed overview of the generator.
>
> (5) As shown on page 17, for "In Content Metadata", we don't use
> manifest.rdf to map the "xml:id" to RDF IRI, do we? I think, there are
> no "In Content Metadata" information in manifest.rdf. Is that true?
Manifest.rdf is an RDF file, as the suffix indicates. Its purpose is to
provide a single point of information about the metadata of the ODF
package, as described in
http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#Metadata
There is no RDFa in the manifest.rdf, if this was your question.

>
> (6) On page 18 of [3], we have <text:meta-field>. What are the
> differences between <text:meta-field> and <text:meta> (on page 16 of [3])?
> Are they visible or invisible to users?
Both contain text with metadata.
text:meta is similar to a text:span with metadata.
text:meta-field is like text that was generated from metadata. Think of
citations that are generated in a certain way by your metadata-based
citation plugin. For instance, the text is regenerated whenever you
choose a new citation layout required by a different magazine you would
like to send the document to, see
http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#element-text_meta-field
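
The difference can be sketched like this in Python (standard library
only, not the ODF Toolkit API). The sample paragraph, the xml:id values
and the function name regenerate_fields are assumptions for
illustration. The sketch replaces the text of every text:meta-field and
leaves text:meta untouched.

```python
import xml.etree.ElementTree as ET

TEXT = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
XML = "http://www.w3.org/XML/1998/namespace"

# Assumed sample: a generated citation field next to annotated plain text.
SAMPLE_PARAGRAPH = (
    '<text:p xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0">'
    'See <text:meta-field xml:id="cite1">[1]</text:meta-field> and '
    '<text:meta xml:id="m1">the annotated text</text:meta>.'
    '</text:p>'
)


def regenerate_fields(root, generate):
    """Replace the content of each text:meta-field with generate(xml_id).
    text:meta elements are not touched, their text belongs to the author."""
    # Materialize the list first so the tree is not modified while iterating.
    for field in list(root.iter(f"{{{TEXT}}}meta-field")):
        for child in list(field):
            field.remove(child)
        field.text = generate(field.get(f"{{{XML}}}id"))


def author_year_style(xml_id):
    """Hypothetical citation layout; a real plugin would consult the RDF
    metadata attached to the field."""
    return "(Lin 2012)"
```

Calling regenerate_fields(ET.fromstring(SAMPLE_PARAGRAPH),
author_year_style) swaps "[1]" for "(Lin 2012)" while the text:meta
content and the surrounding text stay as they were.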

>
> (7) How to use "odf:prefix" and "odf:suffix" for <text:meta-field>?
> Can you show me some examples?
I added the latest example document to the reference list as [4b] -
http://www.oasis-open.org/committees/document.php?document_id=34796&wg_abbrev=office-metadata
Unfortunately there is no example of it; in any case, it was required to
define the prefix and suffix of a field that is being generated. Use
cases state that fields very often have such a prefix and suffix.
Never mind, it is not up to you to create text-field functionality like
a citation application. You only have to add access (read, write,
deletion) to these fields in the ODF Toolkit.

>
> (8) In [5], we have "odf:Element" and "pkg:Element". What're the
> differences? I'm also confused about the namespaces of "odf" and
> "pkg". Sometimes we use "odf" (e.g. odf:ContentFile), while others are
> "pkg" (e.g. pkg:MetadataFile). Why?
It is because of modularity. As already described under (2), pkg is for
every application that reuses the ODF package; pkg:Element, for example,
is an XML element within an XML file in an ODF package (spec part 3).
odf:Element is likewise used within an ODF package, but it additionally
uses ODF XML (spec part 1).
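
If it helps, the layering can be written down as a tiny Python sketch.
The two namespace IRIs are the ones behind the pkg: and odf: prefixes;
the function name layer_of and the returned labels are hypothetical.

```python
PKG_NS = "http://docs.oasis-open.org/ns/office/1.2/meta/pkg#"
ODF_NS = "http://docs.oasis-open.org/ns/office/1.2/meta/odf#"


def layer_of(iri):
    """Tell which layer of the spec a class or property IRI belongs to."""
    if iri.startswith(PKG_NS):
        # Package layer: usable by any format that reuses the ODF package.
        return "package (part 3)"
    if iri.startswith(ODF_NS):
        # ODF XML layer: only meaningful for ODF XML documents.
        return "ODF XML (part 1)"
    return "other vocabulary"
```

For example, layer_of(PKG_NS + "MetadataFile") gives the package layer
and layer_of(ODF_NS + "ContentFile") gives the ODF XML layer.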
> Looking forward to hearing from you!
>
> [1] http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part3.html#Metadata_Manifest_Files
> [2] http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-part1.html#a4Metadata
> [3] http://www.slideshare.net/jza/the-openofficeorg-odf-toolkit-project
> [4] http://www.oasis-open.org/committees/download.php/25054/07-08-22-MetaData-Examples.odt
[4b] http://www.oasis-open.org/committees/document.php?document_id=34796&wg_abbrev=office-metadata
> [5] http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-metadata.owl
[5a] http://docs.oasis-open.org/office/v1.2/os/OpenDocument-v1.2-os-package-metadata.owl
> [6] http://www.languagetool.org/gsoc2011/
>
> Yours faithfully,
> Tao Lin
If you have any further questions, or if I did not explain something
clearly enough, do not hesitate to ask again.

Best regards,
Svante
