lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erick Erickson <erickerick...@gmail.com>
Subject Re: parsing Java log file with Lucene 3.0.3
Date Tue, 04 Jan 2011 12:49:28 GMT
Lucene In Action has an example of creating a synonymanalyzer that
you can adapt. The general idea is to subclass from Analyzer and
implement the required functions, perhaps wrapping a Tokenizer
in a bunch of Filters.

You might be able to crib some ideas from
solr.analysis.WordDelimiterFilter
Best
Erick



On Tue, Jan 4, 2011 at 6:23 AM, Benzion G <benzionk@yahoo.com> wrote:

>
> Problem with SimpleAnalyzer! It ignores digits.
>
> For text "customer 123 found" it will take only "customer" and "found", but
> will ignore "123". StandardAnalyzer handles OK the digits but has the dots
> problem, I mentioned before.
>
> Is there an understandable guide how to write my own Analyzer - a hybrid of
> StandardAnalyzer and SimpleAnalyzer?
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/parsing-Java-log-file-with-Lucene-3-0-3-tp2173046p2190856.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message