lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <>
Subject Re: Analyzer
Date Tue, 25 Nov 2008 16:08:29 GMT
I'm assuming that you want a different analyzer to handle extracting
the relevant information to put into a "text" field of the Lucene document.
I know of no way you can attach different analyzers to a single field.
You can certainly attach different analyzers to *different* fields...

The first thing I'd try would be to write your own analyzer that keeps
track of what kind of file it's currently analyzing and knows how to
"do the right thing" to extract the next token for the text field.

A cruder way would be to detect the type of document yourself,
extract the text into a string (or some such) then feed that into
your document.


On Tue, Nov 25, 2008 at 10:40 AM, Allahbaksh Mohammedali Asadullah <> wrote:

> HI All,
> I am indexing a set file type (html, js,jsp,xml etc). All the file type
> have a common field called as text. This field contains all the file data.
> Can I have different analyzer for depending upon file type.
> Note: I am indexing all file type with same indexer.
> Regards,
> Allahbaksh
> Allahbaksh Mohammedali Asadullah
> **************** CAUTION - Disclaimer *****************
> solely
> for the use of the addressee(s). If you are not the intended recipient,
> please
> notify the sender by e-mail and delete the original message. Further, you
> are not
> to copy, disclose, or distribute this e-mail or its contents to any other
> person and
> any such actions are unlawful. This e-mail may contain viruses. Infosys has
> taken
> every reasonable precaution to minimize this risk, but is not liable for
> any damage
> you may sustain as a result of any virus in this e-mail. You should carry
> out your
> own virus checks before opening the e-mail or attachment. Infosys reserves
> the
> right to monitor and review the content of all messages sent to or from
> this e-mail
> address. Messages sent to or from this e-mail address may be stored on the
> Infosys e-mail system.
> ***INFOSYS******** End of Disclaimer ********INFOSYS***

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message