incubator-general mailing list archives

From "Ross Gardler (MS OPEN TECH)" <Ross.Gard...@microsoft.com>
Subject RE: [PROPOSAL] [DISCUSSION] Corinthia for Incubation.
Date Mon, 24 Nov 2014 21:14:42 GMT
There are some passing similarities with Forrest here. I'm not suggesting any changes to the
proposal, just flagging that the intermediate format approach described here is the approach
Forrest takes. The focus for Forrest was not on document format conversion but there is a
lot of experience there in using intermediate formats.

Microsoft Open Technologies, Inc.
A subsidiary of Microsoft Corporation

-----Original Message-----
From: jan i [mailto:jani@apache.org] 
Sent: Monday, November 24, 2014 12:37 PM
To: jan i
Cc: general@incubator.apache.org
Subject: Re: [PROPOSAL] [DISCUSSION] Corinthia for Incubation.

Hi.

Sorry, somebody hit me with the big GIT hammer !! I did something with git that should not
be done.

The correct text is:
================= COPY ==============
## Proposal
The goal of Corinthia is to provide a responsive design editor as well as a toolkit that enacts
a defined conversion between different office document formats. Responsive design fits the
layout as needed, tablet or desktop.
The editor is a lightweight editor - an extension and not a replacement for the desktop editor.

Many office document programs claim to read/write the ISO open standards for office documents,
OpenDocument Format (ODF) and Office Open XML (OOXML), but do not document which parts are
left unimplemented.
Furthermore, the standards have a large number of "implementation defined"
parts, making real-world congruence chancy. The Corinthia toolkit wants to put this unacknowledged
aspect into the open and provide "compliance sheets" for document formats, as known from industry
computer protocols.

Corinthia aims to generate a large set of test documents, which can be used to verify the
"compliance sheets". The code can also serve as a test case for other applications (or entities
tendering for OOXML/ODF based systems).

The base of Corinthia and its toolkit is the library DocFormats, which converts between different
office document file formats. Currently it supports .docx (part of the OOXML specification),
HTML, and LaTeX (export-only).

The design of DocFormats is based on the idea of bidirectional transformation (BDT), in
which a specific document (the original file in its source format) is converted into an abstract
document (in the destination format). A modified version of the abstract document can then
be used to update the specific document in a non-destructive manner, keeping intact all parts
of the file which are not supported in the abstract format by modifying the original file
rather than replacing it.
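
A minimal sketch of the get/put idea behind BDT follows. It is illustrative only and is not DocFormats code: the dictionary layout, the field names `revision_id` and `custom_xml`, and the positional matching of paragraphs are all assumptions made for this example.

```python
from copy import deepcopy


def get(concrete):
    """Forward direction: extract the abstract view (plain paragraph texts)."""
    return [p["text"] for p in concrete["paragraphs"]]


def put(abstract, concrete):
    """Backward direction: write edited texts into a copy of the original,
    leaving everything the abstract view cannot express untouched."""
    updated = deepcopy(concrete)
    old = updated["paragraphs"]
    new = []
    for i, text in enumerate(abstract):
        # Reuse the original paragraph (with its unsupported properties) when
        # one exists at this position; otherwise start from a bare paragraph.
        # Positional matching is a simplification chosen for this sketch.
        para = old[i] if i < len(old) else {}
        para["text"] = text
        new.append(para)
    updated["paragraphs"] = new
    return updated


# Hypothetical "specific document": it carries data the abstract view ignores.
original = {
    "paragraphs": [
        {"text": "Hello", "revision_id": "r17"},
        {"text": "World", "revision_id": "r18"},
    ],
    "custom_xml": "<vendor:data/>",
}

view = get(original)           # abstract document
view[1] = "Corinthia"          # edit made on the abstract side
result = put(view, original)   # specific document updated in place of a rewrite
```

Here `result` keeps `revision_id` and `custom_xml` from the original even though the abstract view never saw them, which is the non-destructive property described above.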

Descriptions of BDT can be found in:

Aaron Bohannon, J. Nathan Foster, Benjamin C. Pierce et al. Boomerang:
Resourceful Lenses for String Data. Technical Report MS-CIS-07-15, Department of Computer and
Information Science, University of Pennsylvania. November 2007.
[boomerang.pdf](http://www.cis.upenn.edu/~bcpierce/papers/boomerang.pdf)

Benjamin Pierce. Foundations for Bidirectional Programming. ICMT2009 - International Conference
on Model Transformation. June 2009.
[icmt-2009-slides.pdf](http://www.cis.upenn.edu/~bcpierce/papers/icmt-2009-slides.pdf)

The short term goal of the project is to have an easy-to-integrate library that any application
can use to embed support for a range of different file formats, and use the parsing, serialisation,
and conversion facilities for various purposes. These include editors, batch conversion tools,
web publishing systems, document analysis tools, and content management systems. By abstracting
over different file formats and using HTML as a common intermediate format, one can just code
an application to that end, and let DocFormats take care of conversion to other formats.
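
As a rough illustration of why a common intermediate format helps, the sketch below routes every conversion through HTML, so each format needs only one reader and/or one writer instead of a converter per format pair. The function names, the registries, and the toy "txt" and "latex" handlers are hypothetical and do not reflect the DocFormats API.

```python
import html
import re


def text_to_html(data):
    """Reader: plain text to HTML, one <p> per non-empty line."""
    return "".join(f"<p>{html.escape(line)}</p>" for line in data.splitlines() if line)


def _paragraphs(doc):
    """Pull paragraph texts back out of the toy HTML produced above."""
    return [html.unescape(p) for p in re.findall(r"<p>(.*?)</p>", doc)]


def html_to_text(doc):
    """Writer: HTML to plain text, one line per paragraph."""
    return "\n".join(_paragraphs(doc))


def html_to_latex(doc):
    """Writer: HTML to LaTeX body text, paragraphs separated by blank lines.
    Escaping of LaTeX special characters is omitted in this sketch."""
    return "\n\n".join(_paragraphs(doc))


# One entry per format and direction; adding a format never touches the others.
READERS = {"txt": text_to_html}
WRITERS = {"txt": html_to_text, "latex": html_to_latex}


def convert(data, src, dst):
    """Convert via the HTML hub: source format -> HTML -> destination format."""
    return WRITERS[dst](READERS[src](data))
```

With this shape, an application written against the HTML form gains every registered format for free; for example, `convert("Hello\nWorld", "txt", "latex")` passes through HTML without the caller handling it.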

The medium term goal of the project is to have a series of end-user applications (separate
from the library itself), including an editor and file conversion tool. These will serve as
examples of how the libraries can be used.

The ultimate goal is to have a touch-based UI for office documents.

It is also a goal to cooperate with other open source projects, in terms of getting input
from them as well as providing APIs for their use. Corinthia is meant to be easy to understand
and work with, making it more approachable for a range of projects.

## Background
DocFormats has been shipping as part of UX Write on the iOS app store since February 2013.
From this perspective, it is a stable, mature library that works for the most commonly-used
features of .docx formats. As an open source project, it is completely new, and from this
perspective is very much in its early stages. We are currently exploring the best way to leverage
the existing work that has been done to make it easier to integrate in other projects, as
well as support more file formats.

## Rationale
Apache's mission to produce software for the public good fits with Corinthia's idea of providing
an editor and thoroughly documented conversion of office documents. We hope thereby to show
that implementations can and should be documented, especially where the standards offer options,
which will help to ensure interoperability.

We strongly believe the project has the potential to grow by cooperating with other projects, and
that it offers something mature projects cannot: a chance to take advantage of new architectures
and design philosophies, and to learn from, not just reproduce, history.

We have found that Apache committers are loyal to Apache, and more likely to take part in
Corinthia.incubator than Corinthia.GitHub.


## Initial Goals
The initial and most important goal is to enlarge the community consisting of developers,
testers, and people who know the standards in depth.

Technically there are four goals:
- Clean up the core, to make it easy to add filters (format converters)
- Complete the ODF filter
- Add spreadsheet/drawing formats to the ODF and OOXML filters
- Produce an editor based on JavaScript & HTML which can be embedded in mobile apps or
used in a Web browser

Our initial goals might not be big visions, but we prefer something reachable and will set
bigger goals as we grow.

## Current Status
### Meritocracy
Some of the initial committers are already part of Apache, and those who are not are getting
used to working "the Apache way".

### Community
Our community could be larger. Committers from AOO and others have shown interest in the
project, but we have preferred to stay a stable but very active group until we are part of
Incubator.

Apache/Incubator provides a lot of tools (e.g., mailing lists) and community practices (the
Apache Way) that enable community engagement and growth.

### Core Developers
Peter Kelly, peter@uxproductivity.com<br>
Jan Iversen, ASF member++<br>
Svante Schubert, ASF committer

Remark: Louis and Dorte do not develop, but help with non-coding tasks.

### Alignment
Corinthia has commonalities with AOO, but is not competing. AOO is a desktop product and integrated
suite and Corinthia is a lightweight editor and a developer product (library).

Corinthia has a document API like POI, but the focus is different.
Corinthia targets a conversion library and an editor. POI is a handling API.

Sharing test documents with projects like AOO and POI seems to make a lot of sense.

## Known Risks
The biggest risk Corinthia faces is failing to attract a larger community (not only developers
but also testers and documenters). A number of actions have already been taken to minimize
the risk:
- Contact with student projects (in particular Capstone)
- Talks at ApacheConEU

The project uses existing technologies, so there are no real technological risks.

There is of course a risk that nobody wants to use the project, but the fun of building the
community and the project makes this risk bearable.

### Orphaned products
None

### Inexperience with Open Source
All initial committers have worked with open source for several years.

### Homogenous Developers
The initial committers are geographically distributed across the world.
Half of the initial developers are experienced Apache committers and all have experience in
working in distributed development communities.

The original source has already been partly refactored by other developers to make sure knowledge
is spread among multiple people.

### Reliance on Salaried Developers
No committers are being paid to participate.

However, it should be mentioned that Peter Kelly and Louis Suárez-Potts have a company that
has added a commercial editor for iPhone on top of the library. The issue has been discussed.

### Relationships with Other Apache Products
Corinthia has, or will have, a relation to at least the following projects:
- **AOO**: core developers have been told on dev@ that, for targeting mobile platforms, a rewrite
of AOO would be better than building on top of the existing sources. It is our hope to have a
long and beneficial interaction with AOO.
- **httpd**: we would like to make a module that presents ODF/OOXML documents as pure HTML
on the fly.
- **POI**: the Corinthia library is similar to POI, but simpler, more generic, and written in
C. We hope to be able to share know-how as well as test cases.

Corinthia is based on document standards which are used by numerous high-profile projects.
We would like to cooperate with the projects to exchange knowledge.

## Documentation
The current documentation can be found at [github](https://github.com/uxproductivity/Corinthia/wiki)<br>
The project is aware that this is a work in progress, and the task is receiving special
attention.

## Initial Source
The initial source was closed source, developed solely by Peter Kelly. The source has moved
to GitHub as open source, and all files have been changed to ASLv2 to signal the grant.

Remark: the source was part of the UX Write product. The editor of UX Write is currently not
part of the grant.

## Source and Intellectual Property Submission Plan
Source code will be moved from the GitHub uxproductivity organisation space to the Apache space
as defined by incubator.<br>
Peter Kelly will grant the repo to ASF.

## External Dependencies
The current source includes 2 third party libraries (see
[DocFormats/3rdparty](https://github.com/uxproductivity/DocFormats/tree/master/DocFormats/3rdparty))
to which minor modifications have been made.
- **minizip**, a layer on top of zlib
- **w3c-tidy-html5**, an HTML5 manipulation library

Remark: currently the changes made to the original sources are undocumented. The plan is to
keep the unmodified (1-1) upstream sources in the repo (or have users download them), and then
apply a patch.

Furthermore, Corinthia depends on
- libxml2
- zlib
- SDL

## Cryptography
Corinthia does not use cryptography, nor are there plans to use it.<br>
In the medium term we would like to digitally sign the convenience binaries.

## Required Resources
### Mailing lists
We would like to have:

  dev@corinthia.incubator.apache.org for general discussions<br>
  private@corinthia.incubator.apache.org for private discussions

### Subversion Directory
We prefer not to use SVN.

### Git Repository
Our current git repository is [github](https://github.com/uxproductivity/Corinthia).

We would like the repo to be moved to the Apache organisation on GitHub under the name
"corinthia" or "corinthia.incubator".

### Issue Tracking
We are currently using GitHub Issues, which work fine, but we would like to change to Jira.

### Other Resources
- Wiki: We are currently using the GitHub wiki. We would like to move to an Apache-supported wiki
(preferably MediaWiki) before our documentation gets too complicated to move.

- Buildbot: We would like to be able to build/test on OS X, Windows and Ubuntu.

- Web: We would like, if possible, to have the home page corinthia.incubator.apache.org (just
raw HTML please).

- Blog: We would like to have a blog, preferably WordPress.


## Initial Committers
Our initial list of committers is not as long as we would have liked it to be, but we have not
pushed for a larger community before becoming part of ASF.

Dorte Fjalland, dorte@casacondor.com (ICLA mailed)<br>
Jan Iversen, jani@apache.org (ASF member++)<br>
Louis Suárez-Potts, louis@apache.org (AOO PMC)<br>
Peter Kelly, peter@uxproductivity.com (ICLA mailed)<br>
Svante Schubert, svante.schubert@gmail.com (AOO committer)

## Affiliations
None

## Sponsors
### Champion
Jan Iversen

### Nominated Mentors
Jan Iversen.

Since Jan is involved in the project, it would be beneficial to have at least a second mentor
with an "outside" view, who can help focus on the administrative logistics.

### Sponsoring Entity
Incubator IPMC.
=========================

Sorry about that, please blame me, not the project.

rgds
jan i.


On 24 November 2014 at 20:18, jan i <jani@apache.org> wrote:

> Hi all,
>
> On behalf of the project team I  would like to propose Corinthia as an 
> Apache Incubator project. The complete proposal can be found at:
> https://github.com/uxproductivity/Corinthia/wiki/Incubator-proposal
>
> We will move the proposal to incubator wiki, if it gets accepted and 
> all project members have r/w access.
>
> I would like to point out, that we really would like a second mentor, 
> who are familiar with all the "paperwork" and can help give a second 
> opinion from the "outside" since I am also involved in the project.
>
> The project team hope for a good discussion and hope to get feedback.
>
> Remark I have added all initial committers to this mail, since not all 
> are subscribed to general@.
>
> Please find the complete text below:
>
> On behalf of the Corinthia team
> jan I.
>
> ====== COPY =======
> ## Proposal
> The goal of Corinthia is to provide a responsive design editor (converter?
> Or complete program? I suggest just using, “program”, a term used next
> paragraph) as well as a toolkit that enacts a defined conversion 
> between different office document formats. Responsive design fits the 
> layout as needed, tablet or desktop. The editor is a lightweight 
> editor - an extension and not a replacement for the desktop editor.
>
> Many office document programs claim to read/write to the ISO open 
> standards for office documents, OpenDocument Format (ODF) and Open 
> Office XML (OOXML), but do not document which parts are left unimplemented.
> Furthermore, the standards have a large number of "implementation defined"
> parts, making real-world congruence chancy. The Corinthia toolkit 
> wants to put this unacknowledged aspect into the open and provide 
> "compliance sheets" for document formats, as known from industry computer protocols.
>
> Corinthia aims at generating a large set of test documents, which can 
> be used to verify the "compliance sheets". The code can work as test 
> case for other applications (or entities tendering for OOXML/ODF based 
> systems) as well.
>
> The base of Corinthia and its toolkit is the library DocFormats, which 
> converts between different office document file formats. Currently it 
> supports .docx (part of the OOXML specification), HTML, and LaTeX 
> (export-only).
>
> The design  of DocFormats is based on on the idea of bidirectional 
> transformation (BDT), in which a specific document (the original file 
> in its source format) is converted into an abstract document (in the 
> destination format). A modified version of the abstract document can 
> then be used to update the specific document in a non-destructive 
> manner, keeping intact all parts of the file which are not supported 
> in the abstract format by modifying the original file rather than replacing it.
>
> Descriptions of BDT can be found in:
>
> Aaron Bohannon, J. Nathan Foster, Benjamin C. Pierce et. al. Boomerang:
> Resourceful Lenses for String Data. Technical Report MS-CIS-07-15 
> Department of Computer and Information Science University of Pennsylvania.
> November 2007. [boomerang.pdf](
> http://www.cis.upenn.edu/~bcpierce/papers/boomerang.pdf)
>
> Benjamin Pierce. Foundations for Bidirectional Programming. ICMT2009 - 
> International Conference on Model Transformation. June 2009.
> [icmt-2009-slides.pdf](
> http://www.cis.upenn.edu/~bcpierce/papers/icmt-2009-slides.pdf)
>
> The short term goal of the project is to have an easy-to-integrate 
> library that any application can use to embed support for a range of 
> different file formats, and use the parsing, serialisation, and 
> conversion facilities for various purposes. These include editors, 
> batch conversion tools, web publishing systems, document analysis 
> tools, and content management systems. By abstracting over different 
> file formats and using HTML as a common intermediate format, one can 
> just code an application to that end, and let DocFormats take care of
> conversion to other formats.
>
> The medium term goal of the project is to have a series of end-user 
> applications (separate from the library itself), including an editor 
> and file conversion tool. These will serve as examples of how the 
> libraries can be used.
>
> And ultimately to have a touch based UI for office documents.
>
> It is also a goal to cooperate with other open source projects, in 
> terms of getting input from them as well as providing APIs for their use.
> Corinthia is meant to be easy to understand and work with, making it 
> more approachable for a range of projects.
>
> ## Background
> DocFormats has been shipping as part of UX Write on the iOS app store 
> since February 2013. From this perspective, it is a stable, mature 
> library that works for the most commonly-used features of .docx 
> formats. As an open source project, it is completely new, and from 
> this perspective is very much in its early stages. We are currently 
> exploring the best way to leverage the existing work that has been 
> done to make it easier to integrate in other projects, as well as support more file formats.
>
> ## Rationale
> Apache's mission to produce software for the public good, fits with 
> Corinthia's idea of providing an editor and thoroughly documented 
> conversion of office documents, thereby hopefully show that 
> implementations can and should be documented especially where the 
> standards offer options, which will help to ensure interoperability.
>
> We strongly believe the project has potential to grow by cooperating 
> with other projects and offers something mature projects cannot offer, 
> a chance to take advantage of new architectures and design 
> philosophies, as well as also to learn from, and not just reproduce, history.
>
> We have found that Apache committers are loyal to Apache, and more 
> likely to take part in Corinthia.incubator than Corinthia.GitHub.
>
>
> ## Initial Goals
> The initial and most important goal is to enlarge the community 
> consisting of developers, testers, and people who know the standards in depth.
>
> Technically there are four goals:
> - Cleanup core, to make it easy to add filters (format converters)
> - Complete the ODF filter
> - Add spreadsheeet/drawing formats to the ODF and OOXML filters
> - Produce an editor based on JavaScript & HTML which can be embedded 
> in mobile apps or used in a Web browser
>
> Our initial goals might not be big visions, but we prefer something 
> reachable, and then make bigger goals as we grow.
>
> ## Current Status
> ### Meritocracy
> Some of the initial committers are already part of Apache, and those 
> who are not are getting used to working "the Apache way".
>
> ### Community
> Our community could be larger, and committers from AOO and others have 
> shown interest in the project, but we have preferred to stay a stable, 
> but very active group until we are part of Incubator.
>
> Apache/Incubator provides a lot of tools (e.g., mailing lists) and 
> community practices—The Apache Way--that enable community engagement 
> and growth.
>
> ### Core Developers
> Peter Kelly, peter@uxproductivity.com<br> Jan Iversen, ASF 
> member++<br> Svante Schubert, ASF committer
>
> Remark: Louis and Dorte do not develop, but help with non-coding tasks.
> (Note: Louis is a committer with AOO. Does that matter? Also, I do 
> know my ODF, but not as much as Svante, of course.)
>
> ### Alignment
> Corinthia has commonalities with AOO, but is not competing. AOO is a 
> desktop product and Corinthia is a lightweight editor and a developer 
> product (library).
>
> Corinthia has a document API like POI, but the focus is different.
> Corinthia targets a conversion library and an editor. POI is a handling API.
>
> Sharing test documents with projects like AOO and POI seems to make a 
> lot of sense.
>
> ## Known Risks
> The biggest risk Corinthia faces is failing to attract a larger 
> community (not only developers but also testers and documenters). A 
> number of actions has already been taken to minimize the risk:
> - Contact to student projects (in particular Capstone)
> - Talks at ApacheConEU
>
> The project uses existing technologies, so there are no real 
> technological risks.
>
> There is of course a risk that nobody wants to use the project, but 
> the fun building the community and project make this risk bearable.
>
> ### Orphaned products
> None
>
> ### Inexperience with Open Source
> All initial committers have worked several years with open source.
>
> ### Homogenous Developers
> The initial committers are geographically distributed across the world.
> Half of the initial developers are experienced Apache committers and 
> all have experience in working in distributed development communities.
>
> The original source has already been partly refactored by other 
> developers to make sure knowledge is spread among multiple people.
>
> ### Reliance on Salaried Developers
> No committers are being paid to participate.
>
> However, it should be mentioned the Peter Kelly and Louis Suarez-Potts 
> have a company that has added a commercial editor for iPhone on top of 
> the library. The issue has been discussed.
>
> ### Relationships with Other Apache Products Corinthia has/will have a 
> relation to at least the following projects:
> - **AOO**, core developers have been told on dev@ that for targeting 
> mobile platforms, a rewrite of AOO would be better than building on 
> top of the existing sources. It is our hope to have long and 
> beneficial interaction with AOO.
> - **Httpd**, we would like to make a module that on the fly presents 
> odf/ooxml documents as pure html.
> - **POI**, Corinthia library is similar to POI, but simpler, more 
> generic, and written in C. We hope to be able to share know-how as 
> well as test cases.
>
> Corinthia is based on document standards which are used by numerous 
> high-profile projects. We would like to cooperate with the projects to 
> exchange knowledge.
>
> ## Documentation
> The current documentation can be found at [github]( 
> https://github.com/uxproductivity/Corinthia/wiki)<br>
> The project is aware that this is work in progress and there is 
> special attention on this task.
>
> ## Initial Source
> The initial source was closed-source developed solely by Peter Kelly. 
> The source has moved to GitHub as Open Source and all files changed to 
> ASLv2 to signal the grant.
>
> Remark, the source was part of the UX Write product. The editor of UX 
> Write is currently not part of the grant.
>
> ## Source and Intellectual Property Submission Plan Source code will 
> be moved from the GitHub uxproductivity organisation space to the 
> Apache space as defined by incubator.<br> Peter Kelly will grant the 
> repo to ASF.
>
> ## External Dependencies
> The current source includes 2 third party libraries (see 
> [DocFormats/3rdparty](https://github.com/uxproductivity/DocFormats/tree/master/DocFormats/3rdparty))
> to which minor modifications have been made.
> - **minizip**, a layer on top of zlib
> - **w3c-tidy-html5**, a html5 manipulation lib.
>
> Remark, currently the changes made to the original sources are 
> undocumented. The plan is to have the 1-1 sources in the repo (or have 
> users download it), and then apply a patch.
>
> Furthermore, Corinthia depends on
> - libxml2
> - zlib
> - SDL
>
> ## Cryptography
> Corinthia does not use cryptography nor are there plans to use it.<br> 
> In the medium term we would like to use digital signing of the 
> convenience binaries.
>
> ## Required Resources
> ### Mailing lists
> We would like to have:
>
>   dev@corinthia.incubator.apache.org for general discussions<br>
>   private@corinthia.incubator.apache.org for private discussions
>
> ### Subversion Directory
> We prefer not to use SVN.
>
> ### Git Repository
> Our current git repository is [github]( 
> https://github.com/uxproductivity/Corinthia).
>
> We would like the repo to be moved to the github apache organisation 
> under the name "corinthia" or "corinthia.incubator"
>
> ### Issue Tracking
> We are currently using github issues, which works fine, but would like 
> to change to Jira.
>
> ### Other Resources
> - Wiki: We are currently using github wiki, we would like to move to 
> an Apache supported wiki (preferable mediawiki), before our 
> documentations gets too complicated to move.
>
> - Buildbot: We would like to be able to build/test on OSX, Windows and 
> Ubuntu.
>
> - Web: We would like, if possible, to have the home page 
> corinthia.incubator.apache.org (just raw html please).
>
> - Blog: We would like, to have a blog, preferable wordpress.
>
>
> ## Initial Committers
> Our initial list of committers is not as long as we would have liked 
> it to be, but have not pushed for a larger community before becoming part of ASF.
>
> Dorte Fjalland, dorte@casacondor.com (ICLA mailed)<br> Jan Iversen, 
> jani@apache.org (ASF member++)<br> Louis Suárez-Potts, 
> louis@apache.org (AOO PMC)<br> Peter Kelly, peter@uxproductivity.com 
> (ICLA mailed)<br> Svante Schubert, svante.schubert@gmail.com (AOO 
> committer)
>
> ## Affiliations
> None
>
> ## Sponsors
> ### Champion
> Jan Iversen
>
> ### Nominated Mentors
> Jan Iversen.
>
> Since jan is involved in the project, it would be beneficial to have 
> at least a second mentor with a "outside" view, who can help focus on 
> the administrative logistics.
>
> ### Sponsoring Entity
> Incubator IPMC.
>
>
> ======
>