lucene-java-user mailing list archives

From "Herbert Wu" <...@welpine.com>
Subject RE: Special character & ; : % index/search question
Date Mon, 24 Jul 2006 02:24:09 GMT


WhitespaceAnalyzer looks brutal. Is it possible to keep StandardAnalyzer
and at the same time tell it to keep a given list of characters during
indexing?
-Herbert
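
To make the request concrete, here is a minimal sketch in plain Java of the behaviour being asked for: words are split and lowercased, most punctuation is dropped, but a configurable list of symbols is kept as single-character tokens. It does not use Lucene and is not how StandardAnalyzer works internally; the class name KeepCharsSketch, the KEEP set, and the tokens method are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: a stand-in for "normal tokenizing plus a keep list".
class KeepCharsSketch {

    // Hypothetical keep list, taken from the symbols named in this thread.
    static final Set<Character> KEEP =
            new HashSet<>(Arrays.asList('&', ';', ':', '%'));

    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                // Letters and digits accumulate into a lowercased word.
                word.append(Character.toLowerCase(c));
                continue;
            }
            // Any other character ends the current word.
            if (word.length() > 0) {
                out.add(word.toString());
                word.setLength(0);
            }
            // Symbols on the keep list become their own tokens;
            // whitespace and all other punctuation are dropped.
            if (KEEP.contains(c)) {
                out.add(String.valueOf(c));
            }
        }
        if (word.length() > 0) {
            out.add(word.toString());
        }
        return out;
    }
}
```

Note that this sketch also emits a kept symbol when it is attached to a word (the ":" in "Loss:"), which may or may not be what is wanted for the title field.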

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Sunday, July 23, 2006 10:56 AM
To: java-user@lucene.apache.org
Subject: Re: Special character & ; : % index/search question

The WhitespaceAnalyzer splits the input stream on whitespace only, so it will
give you these characters as tokens. Be careful to use it for both indexing
AND searching. Also, make sure it is the analyzer selected in Luke if you
submit queries that way (it's a drop-down on the search page, upper right as
I remember).
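
The difference can be pictured with a small plain-Java sketch. It does not call Lucene; the two methods only imitate the general behaviour (splitting on whitespace versus keeping lowercased runs of letters and digits), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: imitates two tokenizing strategies without Lucene.
class TokenizerSketch {

    // Split on whitespace and nothing else, so standalone symbols survive.
    static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Keep only lowercased runs of letters and digits, so symbols vanish.
    static List<String> alphanumericTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String title = "Profit & Loss: 50 % ; draft";
        System.out.println(whitespaceTokens(title));
        System.out.println(alphanumericTokens(title));
    }
}
```

With the sample title, the whitespace version keeps "&", "%" and ";" as tokens but also leaves "Loss:" with its colon attached and its capital letter intact, while the alphanumeric version yields "loss". A term indexed one way will not match a query analyzed the other way, which is why the same analyzer has to be used on both sides.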

On 7/22/06, Herbert Wu <hwu@welpine.com> wrote:
>
> Hi, all,
>
> My document's title field contains standalone (not inside a word) special
> characters such as &, :, %, and ;. With the Luke 0.6 tool, I found that
> these characters are not indexed in the title field or anywhere else, and
> hence are not searchable. Is there any way to index these special
> characters for search? My environment is:
>
> Lucene: version 2.0.0
>
> Index parser: org.apache.lucene.analysis.standard.StandardAnalyzer
>
> JDK: Java1.5
>
> OS: XP sp2
>
> Debugger: luke0.6
>
>
>
> Any help is greatly appreciated!
>
>
>
> -Herbert
