lucene-java-user mailing list archives

From "Allen Atamer" <aata...@casebank.com>
Subject RE: implementing a TokenFilter for aliases
Date Fri, 05 Dec 2003 18:40:17 GMT
173 is the value of the ID field from our database (we use it as a primary
key). Lucene only stores that field; it does not index it.

I put the print statements before the actual filtering. The goal of the
AliasFilter is to replace "spitline" with "splitline". The debug line is in
the Tokenizer, and the filters run afterwards, so I am not sure what is
happening inside Lucene after that point.

I can't put the util line into the analyzer after the AliasFilter has run,
because it would call tokenStream() recursively and cause a stack overflow.
I will try to find out what is happening after the AliasFilter has run.
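
One way to see the terms after the filter without calling back into tokenStream() is to wrap the already-built stream in a pass-through debug stage. The sketch below is only an illustration of that idea in plain Java. It does not use Lucene's classes: TermSource, tokenizer(), aliasFilter(), debug() and analyze() are all hypothetical stand-ins, and the whitespace tokenizer is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT the Lucene API: models a tokenizer -> filter chain.
class AliasChainSketch {

    // A stream of terms; next() returns null when the input is exhausted.
    interface TermSource {
        String next();
    }

    // Stage 1 (assumed behaviour): lower-case and split on whitespace.
    static TermSource tokenizer(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        Iterator<String> it = terms.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    // Stage 2: replace a term with its alias when the map contains one.
    static TermSource aliasFilter(TermSource input, Map<String, String> aliases) {
        return () -> {
            String term = input.next();
            return term == null ? null : aliases.getOrDefault(term, term);
        };
    }

    // Debug stage: prints each term as it passes and hands it on unchanged.
    // It wraps a stream that already exists, so it never re-enters analyze().
    static TermSource debug(String label, TermSource input) {
        return () -> {
            String term = input.next();
            if (term != null) {
                System.out.println(label + ": [" + term + "]");
            }
            return term;
        };
    }

    // Builds the chain once, with a debug stage before and after the filter.
    static List<String> analyze(String text, Map<String, String> aliases) {
        TermSource stream = debug("after alias",
                aliasFilter(debug("after tokenizer", tokenizer(text)), aliases));
        List<String> out = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> aliases = new HashMap<>();
        aliases.put("spitline", "splitline");
        System.out.println(analyze("missing hardware for bypass spitline", aliases));
    }
}
```

Because each stage pulls from the one below it, the debug lines for a term appear in chain order: the last term shows up as "spitline" after the tokenizer stage and as "splitline" after the alias stage. In a real analyzer the equivalent would presumably be to wrap the stream the analyzer returns, rather than calling the debug utility from inside tokenStream() itself.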

Allen


> -----Original Message-----
> From: Erik Hatcher [mailto:erik@ehatchersolutions.com]
> Sent: December 5, 2003 12:23 PM
> To: Lucene Users List
> Subject: Re: implementing a TokenFilter for aliases
> 
> On Friday, December 5, 2003, at 11:59  AM, Allen Atamer wrote:
> > Below are the results of a debug run on the piece of text that I want
> > aliased. The token "spitline" must be recognized as "splitline", i.e.
> > when I do a search for "splitline", this record will come up.
> >
> > 1: [173] , start:1, end:2
> > 1: [missing] , start:1, end:6
> > 2: [hardware] , start:9, end:7
> > 3: [for] , start:18, end:2
> > 4: [bypass] , start:22, end:5
> > 5: [spitline] , start:29, end:37
> >
> > I also added extra debug info after the token text: the startOffset
> > and the endOffset. Lucene has the first token "173" only stored; it is
> > not indexed. The remaining terms are tokenized, indexed and stored.
> > Does this make a difference?
> 
> I don't understand what you mean by "173" - is that output from a
> different string being analyzed?
> 
> Well, it's obvious from this output that you cannot find "spitline"
> when "splitline" is used in a search.  Your analyzer isn't working as
> you expect, I'm guessing.
> 
> 	Erik
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org



