incubator-odf-dev mailing list archives

From Devin Han <devin...@apache.org>
Subject Re: Blog post on our release
Date Thu, 02 Feb 2012 08:27:19 GMT
2012/2/1 Rob Weir <robweir@apache.org>

> On Tue, Jan 31, 2012 at 3:49 AM, Oliver Rau <olira@apache.org> wrote:
> > Heise Germany finally posted the news as well:
> >
> >
> http://www.heise.de/open/meldung/Erstes-Apache-Release-des-ODF-Toolkit-1424915.html
> >
>
>
> Another blog post:
>
> http://fileformats.wordpress.com/2012/01/30/odf-toolkit/
>

His project JHOVE sounds interesting:
*JHOVE provides functions to perform format-specific identification,
validation, and characterization of digital objects.

    Format identification is the process of determining the format to which
a digital object conforms; in other words, it answers the question: "I have
a digital object; what format is it?"

    Format validation is the process of determining the level of compliance
of a digital object to the specification for its purported format, e.g.: "I
have an object purportedly of format F; is it?"

    Format validation conformance is determined at two levels:
well-formedness and validity.
        A digital object is well-formed if it meets the purely syntactic
requirements for its format.
        An object is valid if it is well-formed and it meets additional
semantic-level requirements.

    For example, a TIFF object is well-formed if it starts with an 8 byte
header followed by a sequence of Image File Directories (IFDs), each
composed of a 2 byte entry count and a series of 8 byte tagged entries. The
object is valid if it meets certain additional semantic-level rules, such
as that an RGB file must have at least three sample values per pixel.

    Format characterization is the process of determining the
format-specific significant properties of an object of a given format,
e.g.: "I have an object of format F; what are its salient properties?" *


This is a collaboration between JSTOR <http://www.jstor.org/> and the
Harvard University Library <http://hul.harvard.edu/>. I think we can help
them with the ODF module; that would be a good use case for our toolkit,
much like what we want to do for Tika.
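
To make the "format identification" step from the JHOVE description concrete for ODF, here is a minimal sketch. It relies only on the packaging convention that an ODF document is a ZIP file whose first entry is named "mimetype" and holds the media type string. It uses plain java.util.zip rather than the ODF Toolkit or JHOVE APIs, and the class and method names (OdfSniffer, identify) are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of the ODF Toolkit or JHOVE.
final class OdfSniffer {

    private static final String ODF_PREFIX = "application/vnd.oasis.opendocument.";

    /**
     * Answers "I have a digital object; what format is it?" for ODF only.
     * Returns the media type declared in the package, or null if the stream
     * does not look like an ODF package.
     */
    static String identify(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            // The "mimetype" entry is expected to be the first entry in the ZIP.
            ZipEntry first = zip.getNextEntry();
            if (first == null || !"mimetype".equals(first.getName())) {
                return null;
            }
            // Read the content of the current entry only.
            ByteArrayOutputStream content = new ByteArrayOutputStream();
            byte[] chunk = new byte[256];
            int n;
            while ((n = zip.read(chunk)) != -1) {
                content.write(chunk, 0, n);
            }
            String type = new String(content.toByteArray(), StandardCharsets.US_ASCII);
            return type.startsWith(ODF_PREFIX) ? type : null;
        }
    }
}
```

This covers identification only. Checking well-formedness and validity in the JHOVE sense would need much more, for example verifying the package manifest and validating the XML streams against the ODF schemas.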

BTW: the link to the ODF Toolkit website in this article doesn't work...


> -Rob
>
>
> > Regards
> >
> > Oliver
> >
> > On Fri, Jan 27, 2012 at 9:44 PM, Rob Weir <robweir@apache.org> wrote:
> >> On Thu, Jan 26, 2012 at 1:16 PM, Rob Weir <robweir@apache.org> wrote:
> >>> FYI.  I did a short blog post on our recent ODF Toolkit release:
> >>>
> >>> http://www.robweir.com/blog/2012/01/apache-odf-toolkit-release.html
> >>>
> >>> Hopefully others can spread the word on their own, via Twitter,
> >>> Google+ or whatever.
> >>>
> >>
> >> Some good coverage on Heise Online:
> >>
> >>
> http://www.h-online.com/open/news/item/ODF-Toolkit-gets-first-Apache-release-1423805.html
> >>
> >>
> >>> -Rob
>



-- 
-Devin
