incubator-general mailing list archives

From jan i <>
Subject [PROPOSAL] [DISCUSSION] Corinthia for Incubation.
Date Mon, 24 Nov 2014 19:18:53 GMT
Hi all,

On behalf of the project team I would like to propose Corinthia as an
Apache Incubator project. The complete proposal can be found at:

We will move the proposal to the Incubator wiki if it gets accepted and all
project members have r/w access.

I would like to point out that we would really like a second mentor who
is familiar with all the "paperwork" and can give a second opinion
from the "outside", since I am also involved in the project.

The project team hopes for a good discussion and plenty of feedback.

Remark: I have added all initial committers to this mail, since not all are
subscribed to general@.

Please find the complete text below:

On behalf of the Corinthia team
jan I.

====== COPY =======
## Proposal
The goal of Corinthia is to provide a responsive design editor (converter?
Or complete program? I suggest just using, “program”, a term used next
paragraph) as well as a toolkit that enacts a defined conversion between
different office document formats. Responsive design fits the layout as
needed, tablet or desktop. The editor is a lightweight editor - an
extension and not a replacement for the desktop editor.

Many office document programs claim to read/write to the ISO open standards
for office documents, OpenDocument Format (ODF) and Office Open XML
(OOXML), but do not document which parts are left unimplemented.
Furthermore, the standards have a large number of "implementation defined"
parts, making real-world congruence chancy. The Corinthia toolkit wants to
put this unacknowledged aspect into the open and provide "compliance
sheets" for document formats, as known from industry computer protocols.

Corinthia aims at generating a large set of test documents, which can be
used to verify the "compliance sheets". The code can work as test case for
other applications (or entities tendering for OOXML/ODF based systems) as

The base of Corinthia and its toolkit is the library DocFormats, which
converts between different office document file formats. Currently it
supports .docx (part of the OOXML specification), HTML, and LaTeX

The design of DocFormats is based on the idea of bidirectional
transformation (BDT), in which a specific document (the original file in
its source format) is converted into an abstract document (in the
destination format). A modified version of the abstract document can then
be used to update the specific document in a non-destructive manner,
keeping intact all parts of the file which are not supported in the
abstract format by modifying the original file rather than replacing it.
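
The round trip described above can be illustrated with a minimal sketch.
This is not the DocFormats API (which is written in C); the names
`ConcreteParagraph`, `get`, and `put` are hypothetical and chosen only to
mirror the "get/put" vocabulary of the lens literature cited below. It
assumes a toy concrete format whose paragraphs carry extra data that the
abstract view does not understand.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ConcreteParagraph:
    """A paragraph in a hypothetical concrete (source) format."""
    text: str
    # Data the abstract view does not understand, e.g. tab stops.
    unsupported: dict = field(default_factory=dict)


def get(concrete):
    """Forward direction: project the concrete document to an abstract view."""
    return [paragraph.text for paragraph in concrete]


def put(concrete, abstract):
    """Backward direction: merge an edited abstract view into the original.

    Paragraphs are matched by position, which is a simplification; a real
    implementation would need a more robust alignment strategy.
    """
    updated = []
    for index, text in enumerate(abstract):
        if index < len(concrete):
            # Existing paragraph: change the text, keep the unsupported data.
            updated.append(replace(concrete[index], text=text))
        else:
            # Paragraph that was newly created in the abstract view.
            updated.append(ConcreteParagraph(text=text))
    return updated


original = [
    ConcreteParagraph("Hello", {"tab_stops": [720, 1440]}),
    ConcreteParagraph("World", {"revision_id": "r42"}),
]
view = get(original)             # edit only what the abstract view exposes
view[0] = "Hello, Corinthia"
view.append("A new paragraph")
result = put(original, view)     # unsupported data of the original survives
```

In this sketch the edited first paragraph keeps its `tab_stops` entry even
though the abstract view never saw it, which is the non-destructive
property the paragraph above describes.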

Descriptions of BDT can be found in:

Aaron Bohannon, J. Nathan Foster, Benjamin C. Pierce et al. Boomerang:
Resourceful Lenses for String Data. Technical Report MS-CIS-07-15
Department of Computer and Information Science University of Pennsylvania.
November 2007. [boomerang.pdf](

Benjamin Pierce. Foundations for Bidirectional Programming. ICMT2009 -
International Conference on Model Transformation. June 2009.

The short term goal of the project is to have an easy-to-integrate library
that any application can use to embed support for a range of different file
formats, and use the parsing, serialisation, and conversion facilities for
various purposes. These include editors, batch conversion tools, web
publishing systems, document analysis tools, and content management
systems. By abstracting over different file formats and using HTML as a
common intermediate format, one can just code an application to that end,
and let DocFormats take care of conversion to other formats.
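
As a rough illustration of why a common intermediate format helps, the
sketch below registers each format once as a pair of functions to and from
HTML, so that any source/target combination works without a dedicated
converter per pair. The two toy formats (`lines` and `blocks`), the
`FILTERS` table, and `convert` are assumptions made for this example and
do not reflect how DocFormats is actually structured.

```python
import html
import re


def html_to_paragraphs(markup):
    """Extract the text of every <p> element from the toy HTML subset."""
    found = re.findall(r"<p>(.*?)</p>", markup, flags=re.S)
    return [html.unescape(text) for text in found]


def lines_to_html(data):
    """Toy format 'lines': one paragraph per non-empty line."""
    return "".join(
        f"<p>{html.escape(line)}</p>" for line in data.splitlines() if line
    )


def html_to_lines(markup):
    return "\n".join(html_to_paragraphs(markup))


def blocks_to_html(data):
    """Toy format 'blocks': paragraphs separated by a blank line."""
    blocks = [block.strip() for block in data.split("\n\n")]
    return "".join(f"<p>{html.escape(block)}</p>" for block in blocks if block)


def html_to_blocks(markup):
    return "\n\n".join(html_to_paragraphs(markup))


# Each format is registered once: (to HTML, from HTML).
FILTERS = {
    "lines": (lines_to_html, html_to_lines),
    "blocks": (blocks_to_html, html_to_blocks),
}


def convert(data, source, target):
    """Convert between any two registered formats by going through HTML."""
    to_html, _ = FILTERS[source]
    _, from_html = FILTERS[target]
    return from_html(to_html(data))


converted = convert("a < b\nsecond", "lines", "blocks")
```

Adding a further format in this scheme means writing one more pair of
functions rather than one converter for every existing format.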

The medium term goal of the project is to have a series of end-user
applications (separate from the library itself), including an editor and
file conversion tool. These will serve as examples of how the libraries can
be used.

The ultimate goal is to have a touch-based UI for office documents.

It is also a goal to cooperate with other open source projects, in terms of
getting input from them as well as providing APIs for their use. Corinthia
is meant to be easy to understand and work with, making it more
approachable for a range of projects.

## Background
DocFormats has been shipping as part of UX Write on the iOS app store since
February 2013. From this perspective, it is a stable, mature library that
works for the most commonly-used features of .docx formats. As an open
source project, it is completely new, and from this perspective is very
much in its early stages. We are currently exploring the best way to
leverage the existing work that has been done to make it easier to
integrate in other projects, as well as support more file formats.

## Rationale
Apache's mission to produce software for the public good fits with
Corinthia's idea of providing an editor and thoroughly documented
conversion of office documents, thereby hopefully showing that
implementations can and should be documented, especially where the
standards offer options, which will help to ensure interoperability.

We strongly believe the project has the potential to grow by cooperating
with other projects, and that it offers something mature projects cannot
offer: a chance to take advantage of new architectures and design
philosophies, as well as to learn from, and not just reproduce, history.

We have found that Apache committers are loyal to Apache, and more likely
to take part in Corinthia.incubator than Corinthia.GitHub.

## Initial Goals
The initial and most important goal is to enlarge the community consisting
of developers, testers, and people who know the standards in depth.

Technically there are four goals:
- Cleanup core, to make it easy to add filters (format converters)
- Complete the ODF filter
- Add spreadsheet/drawing formats to the ODF and OOXML filters
- Produce an editor based on JavaScript & HTML which can be embedded in
mobile apps or used in a Web browser

Our initial goals might not be big visions, but we prefer something
reachable, and then make bigger goals as we grow.

## Current Status
### Meritocracy
Some of the initial committers are already part of Apache, and those who
are not are getting used to working "the Apache way".

### Community
Our community could be larger, and committers from AOO and others have
shown interest in the project, but we have preferred to stay a stable but
very active group until we are part of the Incubator.

Apache/Incubator provides a lot of tools (e.g., mailing lists) and
community practices (the Apache Way) that enable community engagement and

### Core Developers
Peter Kelly,<br>
Jan Iversen, ASF member++<br>
Svante Schubert, ASF committer

Remark: Louis and Dorte do not develop, but help with non-coding tasks.
(Note: Louis is a committer with AOO. Does that matter? Also, I do know my
ODF, but not as much as Svante, of course.)

### Alignment
Corinthia has commonalities with AOO, but is not competing. AOO is a
desktop product and Corinthia is a lightweight editor and a developer
product (library).

Corinthia has a document API like POI, but the focus is different.
Corinthia targets a conversion library and an editor. POI is a handling API.

Sharing test documents with projects like AOO and POI seems to make a lot
of sense.

## Known Risks
The biggest risk Corinthia faces is failing to attract a larger community
(not only developers but also testers and documenters). A number of actions
have already been taken to minimize the risk:
- Contact to student projects (in particular Capstone)
- Talks at ApacheConEU

The project uses existing technologies, so there are no real technological

There is of course a risk that nobody wants to use the project, but the fun
of building the community and the project makes this risk bearable.

### Orphaned products

### Inexperience with Open Source
All initial committers have worked several years with open source.

### Homogenous Developers
The initial committers are geographically distributed across the world.
Half of the initial developers are experienced Apache committers and all
have experience in working in distributed development communities.

The original source has already been partly refactored by other developers
to make sure knowledge is spread among multiple people.

### Reliance on Salaried Developers
No committers are being paid to participate.

However, it should be mentioned that Peter Kelly and Louis Suárez-Potts
have a company that has added a commercial editor for iPhone on top of the
library. The issue has been discussed.

### Relationships with Other Apache Products
Corinthia has/will have a relation to at least the following projects:
- **AOO**, core developers have been told on dev@ that for targeting mobile
platforms, a rewrite of AOO would be better than building on top of the
existing sources. It is our hope to have long and beneficial interaction
with AOO.
- **Httpd**, we would like to make a module that presents ODF/OOXML
documents as pure HTML on the fly.
- **POI**, Corinthia library is similar to POI, but simpler, more generic,
and written in C. We hope to be able to share know-how as well as test

Corinthia is based on document standards which are used by numerous
high-profile projects. We would like to cooperate with the projects to
exchange knowledge.

## Documentation
The current documentation can be found at [github](<br>
The project is aware that this is work in progress and there is special
attention on this task.

## Initial Source
The initial source was developed as closed source, solely by Peter Kelly.
The source has moved to GitHub as open source, and all files have been
changed to ASLv2 to signal the grant.

Remark: the source was part of the UX Write product. The editor of UX Write
is currently not part of the grant.

## Source and Intellectual Property Submission Plan
Source code will be moved from the GitHub uxproductivity organisation space
to the Apache space as defined by incubator.<br>
Peter Kelly will grant the repo to ASF.

## External Dependencies
The current source includes two third-party libraries (see
to which minor modifications have been made.
- **minizip**, a layer on top of zlib
- **w3c-tidy-html5**, an HTML5 manipulation library.

Remark: the changes made to the original sources are currently
undocumented. The plan is to have the unmodified (1-1) sources in the repo
(or have users download them), and then apply a patch.

Furthermore, Corinthia depends on
- libxml2
- zlib

## Cryptography
Corinthia does not use cryptography nor are there plans to use it.<br>
In the medium term we would like to use digital signing of the convenience

## Required Resources
### Mailing lists
We would like to have: for general discussions<br> for private discussions

### Subversion Directory
We prefer not to use SVN.

### Git Repository
Our current git repository is [github](

We would like the repo to be moved to the Apache organisation on GitHub
under the name "corinthia" or "corinthia.incubator".

### Issue Tracking
We are currently using GitHub issues, which work fine, but would like to
change to Jira.

### Other Resources
- Wiki: We are currently using the GitHub wiki; we would like to move to an
Apache-supported wiki (preferably MediaWiki) before our documentation
gets too complicated to move.

- Buildbot: We would like to be able to build/test on OSX, Windows and

- Web: We would like, if possible, to have the home page (just raw html please).

- Blog: We would like to have a blog, preferably WordPress.

## Initial Committers
Our initial list of committers is not as long as we would have liked it to
be, but we have not pushed for a larger community before becoming part of
ASF.

Dorte Fjalland, (ICLA mailed)<br>
Jan Iversen, (ASF member++)<br>
Louis Suárez-Potts, (AOO PMC)<br>
Peter Kelly, (ICLA mailed)<br>
Svante Schubert, (AOO committer)

## Affiliations

## Sponsors
### Champion
Jan Iversen

### Nominated Mentors
Jan Iversen.

Since Jan is involved in the project, it would be beneficial to have at
least a second mentor with an "outside" view, who can help focus on the
administrative logistics.

### Sponsoring Entity
Incubator IPMC.

