lucene-dev mailing list archives

From "Rida Benjelloun" <rida.benjell...@doculibre.com>
Subject Re: [jira] Lius into apache incubator
Date Thu, 01 Mar 2007 15:29:40 GMT
Hi,
You could actually use Lius as a text extraction API: I have implemented,
for each Indexer, a method that returns the String content of the
document.
Lius could be used as a starting point for the Tika project, if the Tika
committers are interested in it. We can also, as Mark said, decouple Lius's
parser logic from its indexing logic.
Taking the project into the Apache Incubator could also be interesting, to
get more people involved in it.

My goal is to join our efforts to build a framework for text extraction.
Here is an example of text extraction with Lius:

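LiusConfig lc =
    LiusConfigBuilder.getSingletonInstance().getLiusConfig(liusConfigPathString);

// Pick the indexer for this document, then read its extracted text.
Indexer indexer = IndexerFactory.getIndexer(documentToIndex, lc);
String text = indexer.getContent();  // call on the instance, not the class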
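
The decoupling of parser logic from indexing logic mentioned above could
look roughly like the following Java sketch. All type names here
(TextExtractor, TextSink, PlainTextExtractor, InMemorySink) are hypothetical
and are not part of Lius or Tika. The only point is that the extraction side
returns a String and the indexing side consumes it without knowing anything
about the document format.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types exist in Lius or Tika.
public class DecouplingSketch {

    /** Parsing side: turns raw document bytes into plain text. */
    interface TextExtractor {
        String extractText(byte[] rawDocument);
    }

    /** Indexing side: consumes already-extracted text, format-agnostic. */
    interface TextSink {
        void accept(String documentId, String text);
    }

    /** Toy extractor that treats the input as UTF-8 plain text. */
    static class PlainTextExtractor implements TextExtractor {
        @Override
        public String extractText(byte[] rawDocument) {
            return new String(rawDocument, StandardCharsets.UTF_8).trim();
        }
    }

    /** Toy sink standing in for a real indexer; it only collects entries. */
    static class InMemorySink implements TextSink {
        final List<String> entries = new ArrayList<>();

        @Override
        public void accept(String documentId, String text) {
            entries.add(documentId + ": " + text);
        }
    }

    public static void main(String[] args) {
        TextExtractor extractor = new PlainTextExtractor();
        InMemorySink sink = new InMemorySink();

        byte[] raw = "  Hello from a decoupled extractor  "
                .getBytes(StandardCharsets.UTF_8);

        // Extraction and indexing only meet through a plain String.
        sink.accept("doc-1", extractor.extractText(raw));
        System.out.println(sink.entries);
    }
}
```

With this split, the extractor could be reused by applications other than
Lucene, while an indexer only needs to implement the sink side.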


On 3/1/07, Jukka Zitting <jukka.zitting@gmail.com> wrote:
>
>
> Hi,
>
> I am interested in a Lius/Tika project that could be used not only with
> Lucene. As mentioned by Mark, there are a number of related efforts, which
> leads me to believe that an application-independent content
> analysis/parsing tool would be very helpful for many users.
>
> I'd like to propose taking the project to the Apache Incubator to better
> attract interest also from outside Lucene.
>
> BR,
>
> Jukka Zitting
>
