lucene-java-user mailing list archives

From Alex Parvulescu <alex.parvule...@gmail.com>
Subject Question about the CompoundWordTokenFilterBase
Date Wed, 18 Sep 2013 14:27:23 GMT
Hi,

While trying to play with the CompoundWordTokenFilterBase I noticed that
the behavior is to include the original token together with the new
sub-tokens.

I assume this is expected (I haven't found any relevant docs on this), but I
was wondering whether it's a hard requirement, or whether I could propose a
small change to skip the original token (controlled by a flag).
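
To make the idea concrete, here is a rough sketch in plain Java. It is not
Lucene code and does not use the Lucene API: the class CompoundSketch, the
method decompose and the keepOriginal flag are all hypothetical names used
only for illustration, and the brute-force substring scan is a stand-in for
whatever decompounding a real filter does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CompoundSketch {

    // Hypothetical helper, NOT part of Lucene: returns the dictionary words
    // found inside `token`, optionally preceded by the original token.
    static List<String> decompose(String token, Set<String> dictionary, boolean keepOriginal) {
        List<String> out = new ArrayList<>();
        if (keepOriginal) {
            // Current behavior described above: the original token is emitted too.
            out.add(token);
        }
        // Brute-force scan over every proper substring of the token.
        for (int start = 0; start < token.length(); start++) {
            for (int end = start + 1; end <= token.length(); end++) {
                String part = token.substring(start, end);
                if (part.length() < token.length() && dictionary.contains(part)) {
                    out.add(part);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> dict = new HashSet<>(Arrays.asList("foot", "ball"));
        System.out.println(decompose("football", dict, true));   // original + sub-tokens
        System.out.println(decompose("football", dict, false));  // sub-tokens only
    }
}
```

With the flag set to true the sketch mirrors the behavior I am seeing today;
with it set to false only the sub-tokens come out, which is the change I
would like to propose.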

If there's interest I can put this in a JIRA issue and we can continue the
discussion there.

The patch is not too complicated, but I haven't run any of the tests yet :)

thanks,
alex
