lucene-dev mailing list archives

From Robert Muir
Subject Re: Document aware analyzers was Re: deprecating Versions
Date Wed, 01 Dec 2010 13:07:51 GMT
On Wed, Dec 1, 2010 at 8:01 AM, Grant Ingersoll wrote:

> While we are at it, how about we make the Analysis process document
> aware instead of Field aware?  The PerFieldAnalyzerWrapper, while doing
> exactly what it says it does, is just silly.  If you had an analysis
> process that was aware, if it chooses to be, of the document as a whole
> then you open up a whole lot more opportunity for doing interesting
> analysis while losing nothing towards the individual treatment of
> fields.  The TeeSink stuff is an attempt at this, but it is not
> sufficient.

I'm not sure I like this: traditionally we let the user application
deal with "document parsing" (how do you take your content and define
it as documents/fields).

If we want to change Lucene to start dealing with this "document
parsing" aspect, that's pretty scary in itself, but in my opinion the
very last choice of where we would want to add something like that is
analysis! So personally I really like analysis being separate from
document parsing: our analysis API is already way too complicated.

Maybe if you give a concrete example then I would have a better
understanding of the problem you think this might solve.
