incubator-cvs mailing list archives

From Apache Wiki <>
Subject [Incubator Wiki] Update of "TikaProposal" by JukkaZitting
Date Sat, 03 Mar 2007 10:46:04 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change notification.

The following page has been changed by JukkaZitting:

The comment on the change is:
Updated draft (the important background and rationale sections still to go...)

  == Meritocracy ==
- ''TODO''
+ All the initial committers are familiar with the meritocracy principles of Apache, and have
already worked on the various source codebases. We will apply the same meritocracy rules
to other potential contributors.
  == Community ==
- ''TODO''
+ There is not yet a clear Tika community. Instead we have a number of people and related
projects with an understanding that a shared toolkit project would best serve everyone's interests.
The primary goal of the incubating project is to build a self-sustaining community around
this shared vision.
  == Core Developers ==
- ''TODO''
+ The initial set of developers comes from various backgrounds, with different but compatible
needs for the proposed project.
  == Alignment ==
- ''TODO''
+ As a generic toolkit, Tika will likely be widely used by various open source and commercial
projects both together with and independent of other Apache tools like Lucene Java or Jakarta
POI. Other Apache projects like Nutch and Jackrabbit are potential candidates for using Tika
as an embedded component.
  = Known Risks =
  == Orphaned products ==
- ''TODO: There has been on-and-off interest in something like this for quite a while already.
How can we make sure that the current increase in interest doesn't fade away?''
+ There are a number of projects at various stages of maturity that implement a subset of
the proposed features in Tika. For many potential users the existing tools are already enough,
which reduces the demand for a more generic toolkit. This can also be seen in the slow progress
of this proposal over the past year.
+ However, once the project gets started we can quickly reach the feature level of existing
tools based on seed code from sources mentioned below. After that, we believe we can
quickly grow the developer and user communities based on the benefits of a generic toolkit
over custom alternatives.
  == Inexperience with Open Source ==
- ''TODO: Many of the interested participants have open source background.''
+ All the initial developers have worked on open source before and many are committers and
PMC members within other Apache projects.
  == Homogenous Developers ==
- ''TODO: There is no central company behind the proposal.''
+ The initial developers come from a variety of backgrounds and with a variety of needs for
the proposed toolkit.
  == Reliance on Salaried Developers ==
- ''TODO: Some of us are salaried for this, other's are not.''
+ Some of the developers are paid to work on this or related projects, but the proposed project
is not the primary task for anyone.
  == Relationships with Other Apache Products ==
@@ -87, +89 @@

  == An Excessive Fascination with the Apache Brand ==
- ''TODO''
+ All of us are familiar with Apache and we have participated in Apache projects as contributors,
committers, and PMC members. We feel that the Apache Software Foundation is a natural home
for a project like this.
  = Documentation =
@@ -106, +108 @@

  Tika will start with a combination of seed code from the efforts listed below:
+  * The [ Apache Nutch] project, that contains another parser
framework and various content analysis tools
+  * The [ Lius project], an indexing framework for Apache
   * The [ Tika project at Google Code], where some initial
draft code has been developed for this proposed project
-  * The [ Lius project], an indexing framework for Apache
-  * The [ Apache Nutch] project, that contains another parser
framework and various content analysis tools
  No existing codebase is selected as "the" starting point of Tika to avoid inheriting the
world view and design limitations of any single project.
