lucene-solr-dev mailing list archives

From "Luke Forehand (JIRA)"
Subject [jira] Created: (SOLR-1883) Highlighting failure caused by InvalidTokenOffsetsException
Date Tue, 13 Apr 2010 17:37:19 GMT
Highlighting failure caused by InvalidTokenOffsetsException

                 Key: SOLR-1883
             Project: Solr
          Issue Type: Bug
          Components: highlighter
    Affects Versions: 1.4
         Environment: {code:title=java}
Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)
{code}
{code:title=solr lib manifest}
Manifest-Version: 1.0
Ant-Version: Apache Ant 1.7.0
Created-By: 14.1-b02-90 (Apple Inc.)
Extension-Name: org.apache.solr
Specification-Title: Apache Solr Search Server
Specification-Version: 1.4.0
Specification-Vendor: The Apache Software Foundation
Implementation-Title: org.apache.solr
Implementation-Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:
Implementation-Vendor: The Apache Software Foundation
X-Compile-Source-JDK: 1.5
X-Compile-Target-JDK: 1.5
{code}
Linux myhost 2.6.18-164.el5 #1 SMP Thu Sep 3 03:28:30 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Luke Forehand

This issue appears to be the same as an earlier issue that was bulk-closed for Solr 1.4,
and I see that someone has also reported this bug against Lucene 2.9.1.
We are experiencing this issue as well.

I have pasted the relevant part of our schema.xml and the Solr exception below. I have also
attached the document that fails when it is queried with highlighting enabled. The invalid token
seems to be 'system', which is the very last token in the document field (see the attached file).

<?xml version="1.0" encoding="UTF-8"?>
<schema name="xxx" version="1.1">
	<types>
		<fieldType name="scrubbedText" class="solr.TextField" positionIncrementGap="100">
			<analyzer>
				<tokenizer class="solr.StandardTokenizerFactory" />
				<charFilter class="solr.HTMLStripCharFilterFactory" />
				<filter class="solr.StandardFilterFactory" />
				<filter class="solr.LowerCaseFilterFactory" />
				<filter class="solr.StopFilterFactory" />
			</analyzer>
		</fieldType>
	</types>
	<fields>
		<field name="id" type="string" stored="true" indexed="true" />
		<field name="textScrubbed" type="scrubbedText" stored="true" indexed="true" />
	</fields>
</schema>

{code:title=solr.log exception}
Apr 13, 2010 3:08:35 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException:
Token system exceeds length of provided text sized 17063
        at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(
        at org.apache.solr.handler.component.HighlightComponent.process(
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(
        at org.apache.solr.core.SolrCore.execute(
        at org.apache.solr.servlet.SolrDispatchFilter.execute(
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(
        at org.apache.catalina.core.StandardWrapperValve.invoke(
        at org.apache.catalina.core.StandardContextValve.invoke(
        at org.apache.catalina.core.StandardHostValve.invoke(
        at org.apache.catalina.valves.ErrorReportValve.invoke(
        at org.apache.catalina.core.StandardEngineValve.invoke(
        at org.apache.catalina.connector.CoyoteAdapter.service(
        at org.apache.coyote.http11.Http11AprProcessor.process(
        at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(
Caused by: Token system exceeds
length of provided text sized 17063
        at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(
        ... 18 more
{code}

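For readers unfamiliar with this error, the message in the log has the shape of a bounds check on token offsets. The following is a minimal sketch, assuming the highlighter compares each token's start and end offsets against the length of the stored field text. The class `OffsetCheckSketch`, the `Token` type, and the `checkOffsets` method are hypothetical names used only for illustration; this is not the Lucene or Solr API, and it does not attempt to explain why the offsets are wrong in our case.

```java
public class OffsetCheckSketch {

    /** Hypothetical stand-in for an analyzed token with character offsets into the source text. */
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;

        Token(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }
    }

    /**
     * Returns an error message if either offset points past the end of the text,
     * or null if the token fits inside the text.
     */
    static String checkOffsets(Token token, String text) {
        if (token.startOffset > text.length() || token.endOffset > text.length()) {
            return "Token " + token.term
                    + " exceeds length of provided text sized " + text.length();
        }
        return null;
    }

    public static void main(String[] args) {
        String text = "operating system"; // 16 characters

        // Offsets that match the text: "system" occupies [10, 16).
        Token valid = new Token("system", 10, 16);

        // Hypothetical offsets that run past the end of the text.
        Token invalid = new Token("system", 12, 18);

        System.out.println(checkOffsets(valid, text));
        System.out.println(checkOffsets(invalid, text));
    }
}
```

In this sketch the first token passes the check and the second one produces a message of the same form as the one in the log above, which is consistent with the failing token being the last one in the field.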