lucene-general mailing list archives

From Steve Rowe <sar...@gmail.com>
Subject Re: Which token filter can combine 2 terms into 1?
Date Fri, 21 Dec 2012 07:34:53 GMT
Hi David,

Not very many people read this mailing list - I suggest you switch to the java-user list -
see <http://lucene.apache.org/core/discussion.html>.

ShingleFilter and CommonGramsFilter combine terms, though the conditions under which they do
so don't appear to be the same as what you want.

Why are only the second two terms combined?
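
If you do end up writing your own filter, the core of it is one token of lookahead: hold the
current term, peek at the next one, and decide whether to emit them joined. Below is a minimal
sketch of that logic in plain Java over a list of strings. It is not Lucene code, and the
class name PairMergeSketch and the merge rule (join when the next term starts with the
current one) are assumptions made only for illustration, since the real rule isn't stated. In
Lucene the same idea would live in a TokenFilter subclass whose incrementToken() reads terms
through CharTermAttribute, most likely using captureState()/restoreState() to buffer the
lookahead token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Plain-Java model of the buffering a custom filter would need; not Lucene API.
final class PairMergeSketch {

    // Joins a token with its successor whenever shouldMerge(current, next) is true.
    // A merged pair is consumed as a unit, so the successor is never emitted alone.
    static List<String> merge(List<String> tokens, BiPredicate<String, String> shouldMerge) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            String current = tokens.get(i);
            boolean hasNext = i + 1 < tokens.size();
            if (hasNext && shouldMerge.test(current, tokens.get(i + 1))) {
                out.add(current + tokens.get(i + 1)); // emit the combined term
                i += 2;                               // skip the token just consumed
            } else {
                out.add(current);                     // pass the term through unchanged
                i += 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical rule (an assumption): merge when the next term starts with the current one.
        BiPredicate<String, String> rule = (cur, next) -> next.startsWith(cur);
        System.out.println(merge(List.of("t1", "t2", "t2a", "t3"), rule));
        // prints [t1, t2t2a, t3]
    }
}
```

Keeping the rule as a separate predicate means the buffering stays the same whatever the
actual condition for combining terms turns out to be.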

Steve

On Dec 21, 2012, at 2:27 AM, Xi Shen <davidshen84@gmail.com> wrote:

> Hi,
> 
> I am looking for a token filter that can combine 2 terms into 1. E.g.
> 
> the input has been tokenized by white space:
> 
> t1 t2 t2a t3
> 
> I want a filter that outputs:
> 
> t1 t2t2a t3
> 
> I know it is a very special case, and I am thinking about developing a filter
> of my own. But I cannot figure out which API I should use to look at terms
> in a TokenStream.
> 
> 
> -- 
> Regards,
> David Shen
> 
> http://about.me/davidshen
> https://twitter.com/#!/davidshen84