lucene-solr-user mailing list archives

From Christopher Gross <cogr...@gmail.com>
Subject Solr Exceptions -- "immense terms"
Date Mon, 15 Sep 2014 14:06:01 GMT
Solr 4.9.0
Java 1.7.0_49

I'm indexing an internal wiki site.  On an older version of Solr (4.1) the
content indexed without any trouble, but now, on 4.9.0, I'm getting errors:

SCHEMA:
<field name="content" type="string" indexed="false" stored="true"
required="true"/>

LOGS:
Caused by: java.lang.IllegalArgumentException: Document contains at least
one immense term in field="content" (whose UTF8 encoding is longer than the
max length 32766), all of which were skipped.  Please correct the analyzer
to not produce such terms.  The prefix of the first immense term is: '[60,
33, 45, 45, 32, 98, 111, 100, 121, 67, 111, 110, 116, 101, 110, 116, 32,
45, 45, 62, 10, 9, 9, 9, 60, 100, 105, 118, 32, 115]...', original message:
bytes can be at most 32766 in length; got 183250
....
Caused by:
org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException: bytes
can be at most 32766 in length; got 183250
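
For readers puzzling over the byte list in the exception: it is the start of
the rejected value, printed as decimal UTF-8 byte values.  A minimal Python
sketch for decoding it, and for checking a value against the 32766-byte limit
quoted in the message, might look like the following.  The helper names
(decode_prefix, exceeds_term_limit) are made up for illustration and are not
part of Solr or Lucene.

```python
# Byte values copied from "The prefix of the first immense term" in the log.
PREFIX = [60, 33, 45, 45, 32, 98, 111, 100, 121, 67, 111, 110, 116, 101, 110,
          116, 32, 45, 45, 62, 10, 9, 9, 9, 60, 100, 105, 118, 32, 115]

# Limit quoted in the exception message ("max length 32766").
MAX_TERM_BYTES = 32766


def decode_prefix(byte_values):
    """Turn the decimal byte list from the log into readable text."""
    return bytes(byte_values).decode("utf-8", errors="replace")


def exceeds_term_limit(value, limit=MAX_TERM_BYTES):
    """Return True if the UTF-8 encoding of value is longer than limit bytes."""
    return len(value.encode("utf-8")) > limit


if __name__ == "__main__":
    # Shows which markup the oversized value starts with.
    print(repr(decode_prefix(PREFIX)))
    # A value as long as the one in the log (183250 bytes) is over the limit.
    print(exceeds_term_limit("x" * 183250))
```

Decoding the prefix shows that the rejected term begins with the wiki page's
raw HTML, which suggests the whole page body reached the index as a single
term rather than as separate tokens.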

I was originally indexing this field, but I switched that off
(indexed="false", as you can see above) and it is still failing.  Is there a
different field type or a different analyzer I should use?  I imagine there
is a way to index very large documents in Solr.  Any recommendations would
be helpful.  Thanks!

-- Chris
