lucene-dev mailing list archives

From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-3961) LimitTokenCountFilterFactory config parsing is totally broken
Date Wed, 17 Oct 2012 19:10:04 GMT

    [ https://issues.apache.org/jira/browse/SOLR-3961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13478256#comment-13478256 ]

Hoss Man commented on SOLR-3961:
--------------------------------

From the list...

{noformat}
I read the following in the example solrconfig:

 <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a
         LimitTokenCountFilterFactory in your fieldType definition. E.g.
     <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
    -->

I tried that as follows:

...
<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LimitTokenCountFilterFactory"
maxTokenCount="100000"/>
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="German"
/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
      </analyzer>
...

The LimitTokenCountFilterFactory configured like that crashes the startup
of the corresponding core with the following exception (without the Factory
the core startup works):


17.10.2012 17:44:19 org.apache.solr.common.SolrException log
SCHWERWIEGEND: null:org.apache.solr.common.SolrException: Plugin init
failure for [schema.xml] fieldType "textgen": Plugin init failure for
[schema.xml] analyzer/filter: null
        at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:177)
        at
...
{noformat}

And as Jack noted...

{noformat}
Anybody want to guess what's wrong with this code:

String maxTokenCountArg = args.get("maxTokenCount");
if (maxTokenCountArg == null) {
 throw new IllegalArgumentException("maxTokenCount is mandatory.");
}
maxTokenCount = Integer.parseInt(args.get(maxTokenCountArg));

Hmmm... try this "workaround":

    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="foo" foo="10000"/>
{noformat}
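
For anyone who wants to reproduce the lookup mistake outside of Solr, here is a minimal standalone sketch. The class and method names (MaxTokenCountParsing, parseBroken, parseFixed) are made up for illustration, and parseFixed is only an assumption about what the code presumably meant to do, not the committed patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone class; it does not exist in Solr or Lucene.
class MaxTokenCountParsing {

    // Mirrors the snippet quoted above: the *value* of maxTokenCount
    // is reused as a key for a second lookup in the args map.
    static int parseBroken(Map<String, String> args) {
        String maxTokenCountArg = args.get("maxTokenCount");
        if (maxTokenCountArg == null) {
            throw new IllegalArgumentException("maxTokenCount is mandatory.");
        }
        // With maxTokenCount="10000" this looks up the key "10000",
        // gets null, and Integer.parseInt(null) throws NumberFormatException.
        return Integer.parseInt(args.get(maxTokenCountArg));
    }

    // Assumed intent: parse the value that was just fetched.
    static int parseFixed(Map<String, String> args) {
        String maxTokenCountArg = args.get("maxTokenCount");
        if (maxTokenCountArg == null) {
            throw new IllegalArgumentException("maxTokenCount is mandatory.");
        }
        return Integer.parseInt(maxTokenCountArg);
    }

    public static void main(String[] unused) {
        Map<String, String> normal = new HashMap<>();
        normal.put("maxTokenCount", "10000");
        System.out.println("fixed:  " + parseFixed(normal));
        try {
            parseBroken(normal);
        } catch (NumberFormatException e) {
            System.out.println("broken: NumberFormatException");
        }

        // The "workaround" config: maxTokenCount="foo" foo="10000"
        Map<String, String> workaround = new HashMap<>();
        workaround.put("maxTokenCount", "foo");
        workaround.put("foo", "10000");
        System.out.println("broken with workaround: " + parseBroken(workaround));
    }
}
```

With a normal config the second lookup returns null, and parsing null as an integer fails, which is consistent with the bare "null" at the end of the plugin init failure message in the log above.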

                
> LimitTokenCountFilterFactory config parsing is totally broken
> -------------------------------------------------------------
>
>                 Key: SOLR-3961
>                 URL: https://issues.apache.org/jira/browse/SOLR-3961
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>             Fix For: 4.0.1, 4.1
>
>
> As noted on the mailing list, LimitTokenCountFilterFactory throws a NumberFormatException
> because it tries to use the value of its config param as a key to look up another param that
> it parses as an integer ... totally ridiculous.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


