lucene-dev mailing list archives

From Grant Ingersoll <>
Subject Re: [jira] Lius into apache incubator
Date Thu, 01 Mar 2007 17:34:41 GMT
Is the Droids lab at all related to that parsing project in Nutch?
There seem to be several related efforts here that could
probably make for a nice new project under Lucene, IMO.  They all
seem to have to do with getting and preparing text for processing by
some type of consumer of text.

I sometimes wonder if the Analysis stuff in Lucene proper would  
benefit from moving out of core too, but I'm not sure what it would  
look like just yet and it is nice having it "optimized" for Lucene  
versus having to support other types of analysis phases.

Just my two cents,

On Mar 1, 2007, at 11:42 AM, Jukka Zitting wrote:

> Hi,
> On 3/1/07, Rida Benjelloun <> wrote:
>> Lius could be used as a starting point for the Tika project, if Tika
>> committers are interested in it. We can also, as Mark said, decouple
>> Lius's parser logic from its indexing logic.
> I'm very interested in doing that. Another very useful codebase, among
> others, would be the existing parser framework in the Nutch project.
>> Taking the project into the Apache Incubator could also be
>> interesting, to get more people involved in it.
> Exactly. I'd like to avoid starting just yet another codebase, and
> focus more on bringing the best parts (both code and ideas) of the
> existing projects together. The community-building focus of the
> Incubator would likely help with that. Another aspect that would
> benefit from Incubator scrutiny is the legal implications of
> pulling together multiple document parser libraries under various
> licenses.
> Would there be interest within the Lucene PMC in sponsoring a proposal
> along such lines? I can volunteer to put together the proposal and act
> as the champion and mentor of the project.
> BR,
> Jukka Zitting

Grant Ingersoll
Center for Natural Language Processing
