lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <>
Subject [jira] [Resolved] (SOLR-293) Add "minPartLength" to WordDelimiterFilter
Date Sun, 17 Feb 2013 17:07:12 GMT


Erick Erickson resolved SOLR-293.

    Resolution: Won't Fix

Cleaning up old JIRAs, re-open if necessary.
> Add "minPartLength" to WordDelimiterFilter
> ------------------------------------------
>                 Key: SOLR-293
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Mike Klaas
>            Assignee: Mike Klaas
>            Priority: Minor
> WDF is handy but over-tokenizes when faced with short word parts:
> A9
> R2D2
> mp3
> This creates one- or two-character tokens, which are extremely slow to query because their doc freq is so high (this contributes to a significant portion of our slowest queries).
> This patch adds a "minPartLength" option that disables generation of parts below a certain length.  It is recommended to use it with catenateAll, so as not to lose tokens.
> I'll add factory options and tests if we decide to include this (and are happy with the parameter name).
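
The behaviour described in the quoted issue can be illustrated with a minimal sketch. This is not the Solr patch or the real WordDelimiterFilter: the functions `split_parts` and `word_delimiter` and their parameters are hypothetical names chosen for illustration, and the splitting rule is simplified to letter/digit boundaries only (the real filter has more rules and options). It only shows why a minimum part length probably needs to be paired with catenation.

```python
import re


def split_parts(token):
    """Split a token into runs of letters and runs of digits.

    Simplifying assumption: only letter/digit boundaries and
    non-alphanumeric characters delimit parts.
    """
    return re.findall(r"[A-Za-z]+|[0-9]+", token)


def word_delimiter(token, min_part_length=1, catenate_all=False):
    """Return the tokens emitted for one input token.

    Parts shorter than min_part_length are dropped. If catenate_all is
    set and the token was actually split, the concatenation of all
    parts is emitted as well, so the original term stays searchable.
    """
    parts = split_parts(token)
    out = [p for p in parts if len(p) >= min_part_length]
    if catenate_all and len(parts) > 1:
        joined = "".join(parts)
        if joined not in out:
            out.append(joined)
    return out


if __name__ == "__main__":
    for tok in ["A9", "R2D2", "mp3"]:
        print(tok,
              word_delimiter(tok),
              word_delimiter(tok, min_part_length=3),
              word_delimiter(tok, min_part_length=3, catenate_all=True))
```

In this sketch, "mp3" with a minimum part length of 3 and no catenation yields no tokens at all, which is the token loss the issue warns about; enabling catenation keeps the single token "mp3" and avoids the short, high-frequency parts.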
