lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "RAYMOND Romain" <romain.raym...@c-s.fr>
Subject Re: Lucene with Number+Text
Date Tue, 26 Mar 2002 10:26:46 GMT

Yeah, but I just use WhiteSpaceAnalyser to parse my query ...
but the "text" is indexed using an analyser (standard ... I think)
respecting ponctuation (, . ???)
it is good for my app at this time

Joe Hajek a écrit :
> 
> the problem with whitespaceanalyzer is that if you have for example a sentence in the
text say "lucene is indexing." a query for "indexing" will produce no hits because "." is
not a token delimiter. you will have to search for "indexing*".
> 
> for me the solution was to write my own tokenizer/analyzer pair
> 
> --snip
> 
> and
> --snip
> 
> public final class myTokenizer extends CharTokenizer {
> /** Construct a new LowerCaseTokenizer. */
> public myTokenizer(Reader in) {
> super(in);
> }
> 
> /** Collects only characters which satisfy
> * {@link Character#isLetter(char)}.*/
> protected char normalize(char c) {
> return Character.toLowerCase(c);
> }
> 
> /** Collects only characters which do not satisfy
> * {@link Character#isWhitespace(char)}.*/
> protected boolean isTokenChar(char c) {
> return Character.isLetterOrDigit(c);
> }
> }
> public final class myAnalyzer extends Analyzer {
> public final TokenStream tokenStream(String fieldName, Reader reader) {
> return new myTokenizer(reader);
> }
> }
> --snip
> 
> regards joe
> 
> "RAYMOND Romain" <romain.raymond@c-s.fr> writes on
> Tue, 26 Mar 2002 08:53:51 +0100 (MET):
> 
> > hello,
> >
> > The solution we adopted is to use WhiteSpaceAnalyser.
> > If you print the result of a query after parsing it (with parse
> > method)
> > the tokenizers used delete the numbers from the query.
> > But WhiteSpaceAnalyser only tokenizes based on ... spaces, so we can
> > search on numbers values ....
> >
> > --
> > To unsubscribe, e-mail:
> > <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail:
> > <mailto:lucene-user-help@jakarta.apache.org>
> >
> 
> --
> To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>

--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message