lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: Multiple Analyzer on Single field
Date Tue, 07 Apr 2009 13:19:53 GMT
Hmmmm. There's nothing in Lucene that I know of that will do what you
want, you'll have to do one of two things:

In general, you'll have to break up your token stream yourself, either
through pre-processing or building your own analyzers. There's
nothing already built that I know of that will break up, for instance,
KeyWordAnalyzer into three tokens.

Part of the confusion is the use of the phrase "keyword" as in
"and class should be a Key word". If I'm reading this right, you'll
want "class" to be in a separate field since it's special (in your
context). Again, to accomplish this you either need to pre-process
the input stream, extract "class", and put it in a separate field or
create your own analyzer that extracts only "class" from the
input stream. Then you'd feed the entire contents into *both* fields (say
"content" and "key"). The analyzer attached to the "content" field
(see PerFieldAnalyzerWrapper) would take care of breaking up
things like KeyWordAnalyzer, and the analyzer attached to the
"key" field would throw away everything except "class"..

Hope this helps
Erick

On Tue, Apr 7, 2009 at 8:57 AM, Allahbaksh Mohammedali Asadullah <
Allahbaksh_Asadullah@infosys.com> wrote:

> Hi All,
> Sorry for the confused email.
>
> Suppose I have a field text with content below
>
> KeyWordAnalyzer is a class. this keyword is used in java.
>
> Here the KeyWordAnalyzer into Key Word Analyzer and class should be a Key
> word. So if some one search. Apart from this I want Key Word Analzer to
> tokenized properly so that search become better.
> Regards,
> Allahbaksh
>
>
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerickson@gmail.com]
> Sent: Monday, April 06, 2009 9:31 PM
> To: java-user@lucene.apache.org
> Subject: Re: Multiple Analyzer on Single field
>
> This really doesn't make sense. KeywordAnalyzer will NOT
> tokenize the input stream. StandardAnalyzer WILL tokenize
> the input stream. I can't imagine what it means to do both at
> the same time.
>
> Perhaps you could give us some examples of what your desired
> inputs and outputs are we could steer you in the right direction.
>
> I suspect you're thinking more in terms of TokenFilters and/or
> Tokenizers...
>
> Best
> Erick
>
> On Mon, Apr 6, 2009 at 10:52 AM, Allahbaksh Mohammedali Asadullah <
> Allahbaksh_Asadullah@infosys.com> wrote:
>
> > Hi,
> > I want to add multiple Analyzer on single field. I want properties of
> > KeywordAnalyzer, SimpleAnalyzer, StandardAnalyzer, WhiteSpaceAnalyzer. Is
> > there any easy way to have all analyzer bundled on single field.
> > Regards,
> > Allahbaksh
> >
> >
> >
> >
> >
> >
> >
> > **************** CAUTION - Disclaimer *****************
> > This e-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended
> > solely
> > for the use of the addressee(s). If you are not the intended recipient,
> > please
> > notify the sender by e-mail and delete the original message. Further, you
> > are not
> > to copy, disclose, or distribute this e-mail or its contents to any other
> > person and
> > any such actions are unlawful. This e-mail may contain viruses. Infosys
> has
> > taken
> > every reasonable precaution to minimize this risk, but is not liable for
> > any damage
> > you may sustain as a result of any virus in this e-mail. You should carry
> > out your
> > own virus checks before opening the e-mail or attachment. Infosys
> reserves
> > the
> > right to monitor and review the content of all messages sent to or from
> > this e-mail
> > address. Messages sent to or from this e-mail address may be stored on
> the
> > Infosys e-mail system.
> > ***INFOSYS******** End of Disclaimer ********INFOSYS***
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message