lucene-dev mailing list archives

From "Michael Braun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-10375) Stored text retrieved via StoredFieldVisitor on doc in the document cache over-estimates needed byte[]
Date Thu, 10 Aug 2017 21:30:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-10375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122368#comment-16122368
] 

Michael Braun commented on SOLR-10375:
--------------------------------------

This also happens on an insert: size * 3 overflows to a negative value, which then passes
the MAX_UTF8_SIZE_FOR_ARRAY_GROW_STRATEGY check:

{code}
Caused by: java.lang.ArrayIndexOutOfBoundsException: 51
        at org.apache.solr.common.util.ByteUtils.UTF16toUTF8(ByteUtils.java:94)
        at org.apache.solr.common.util.JavaBinCodec.writeStr(JavaBinCodec.java:805)
        at org.apache.solr.common.util.JavaBinCodec.writePrimitive(JavaBinCodec.java:897)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:323)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
        at org.apache.solr.common.util.JavaBinCodec.writeSolrInputDocument(JavaBinCodec.java:589)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:350)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
        at org.apache.solr.common.util.JavaBinCodec.writeMapEntry(JavaBinCodec.java:729)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:378)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
        at org.apache.solr.common.util.JavaBinCodec.writeIterator(JavaBinCodec.java:670)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:362)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
        at org.apache.solr.common.util.JavaBinCodec.writeNamedList(JavaBinCodec.java:218)
        at org.apache.solr.common.util.JavaBinCodec.writeKnownType(JavaBinCodec.java:325)
        at org.apache.solr.common.util.JavaBinCodec.writeVal(JavaBinCodec.java:223)
        at org.apache.solr.common.util.JavaBinCodec.marshal(JavaBinCodec.java:146)
        at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.marshal(JavaBinUpdateRequestCodec.java:83)
        at org.apache.solr.client.solrj.impl.BinaryRequestWriter.getContentStream(BinaryRequestWriter.java:67)
        at org.apache.solr.client.solrj.request.RequestWriter$LazyContentStream.getDelegate(RequestWriter.java:94)
        at org.apache.solr.client.solrj.request.RequestWriter$LazyContentStream.getName(RequestWriter.java:104)
        at org.apache.solr.client.solrj.impl.HttpSolrClient.createMethod(HttpSolrClient.java:389)
{code}
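
A minimal sketch of the overflow described above, not the actual JavaBinCodec code. The class name, the method names, the 800,000,000-char length, and the threshold value are all assumptions made for illustration; only the "size * 3" arithmetic comes from the comment. It shows how an int product can wrap to a negative number and slip under a size threshold, while the same product computed as a long does not.

```java
class Utf8SizeOverflowSketch {
    // Worst-case UTF-8 bytes per Java char, matching the "size * 3" in the comment above.
    static final int MAX_UTF8_BYTES_PER_CHAR = 3;
    // Hypothetical stand-in for MAX_UTF8_SIZE_FOR_ARRAY_GROW_STRATEGY; the real value is not quoted in this message.
    static final int MAX_SIZE_FOR_GROW_STRATEGY = 65536;

    // int arithmetic: the product can wrap to a negative number and pass the threshold check.
    static boolean fitsWithIntMath(int charCount) {
        int maxSize = charCount * MAX_UTF8_BYTES_PER_CHAR;
        return maxSize <= MAX_SIZE_FOR_GROW_STRATEGY;
    }

    // long arithmetic: the product cannot wrap for any int charCount.
    static boolean fitsWithLongMath(int charCount) {
        long maxSize = (long) charCount * MAX_UTF8_BYTES_PER_CHAR;
        return maxSize <= MAX_SIZE_FOR_GROW_STRATEGY;
    }

    public static void main(String[] args) {
        int charCount = 800_000_000; // hypothetical length of a very large string
        System.out.println("int product:  " + (charCount * MAX_UTF8_BYTES_PER_CHAR));
        System.out.println("long product: " + ((long) charCount * MAX_UTF8_BYTES_PER_CHAR));
        System.out.println("fits (int math):  " + fitsWithIntMath(charCount));
        System.out.println("fits (long math): " + fitsWithLongMath(charCount));
    }
}
```

With int math the oversized string is wrongly treated as small enough for the preallocated-buffer path, which would be consistent with the ArrayIndexOutOfBoundsException in the trace. Widening to long before multiplying is one possible guard; Math.multiplyExact, which throws on overflow, is another.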

> Stored text retrieved via StoredFieldVisitor on doc in the document cache over-estimates needed byte[]
> ------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-10375
>                 URL: https://issues.apache.org/jira/browse/SOLR-10375
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>         Environment: Java 1.8.121, Linux x64
>            Reporter: Michael Braun
>            Priority: Minor
>
> When using SolrIndexSearcher.doc(int n, StoredFieldVisitor visitor) (as can happen with
> the UnifiedHighlighter in particular), if the document cache has the document, Solr calls
> visitFromCached and hits an out of memory error because of line 752 of SolrIndexSearcher:
> visitor.stringField(info, f.stringValue().getBytes(StandardCharsets.UTF_8));
> {code}
>  at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:48)
>   at java.lang.StringCoding.encode(Ljava/nio/charset/Charset;[CII)[B (StringCoding.java:350)
>   at java.lang.String.getBytes(Ljava/nio/charset/Charset;)[B (String.java:941)
>   at org.apache.solr.search.SolrIndexSearcher.visitFromCached(Lorg/apache/lucene/document/Document;Lorg/apache/lucene/index/StoredFieldVisitor;)V (SolrIndexSearcher.java:685)
>   at org.apache.solr.search.SolrIndexSearcher.doc(ILorg/apache/lucene/index/StoredFieldVisitor;)V (SolrIndexSearcher.java:652)
> {code}
> This is due to the current String.getBytes(Charset) implementation, which sizes the
> underlying byte array as charArrayLength * maxBytesPerCharacter, where maxBytesPerCharacter
> is 3 for UTF-8. 3 * 716MB exceeds Integer.MAX_VALUE, and the JVM cannot currently allocate
> a single array that large, so an OutOfMemoryError is thrown.
> The problem is not present when the document cache is disabled.
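
To illustrate the over-estimate described in the quoted issue, here is a hedged sketch that compares the worst-case size (3 bytes per char) with the exact number of UTF-8 bytes a string needs. The class and method names are hypothetical, and reading "716MB" as roughly 716,000,000 chars is an assumption. This is not how Solr or the JDK computes the size; it only shows that an exact count can be obtained without allocating the worst-case array.

```java
class Utf8LengthSketch {
    // Worst-case estimate: every Java char needs up to 3 bytes in UTF-8.
    // Computed as a long so the product cannot wrap.
    static long worstCaseUtf8Bytes(int charCount) {
        return (long) charCount * 3;
    }

    // Exact number of bytes the string would occupy in UTF-8, counted without encoding it.
    static long exactUtf8Length(CharSequence s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte range
            } else if (Character.isSurrogate(c)) {
                if (Character.isHighSurrogate(c) && i + 1 < s.length()
                        && Character.isLowSurrogate(s.charAt(i + 1))) {
                    bytes += 4;                   // valid surrogate pair = one supplementary code point
                    i++;                          // skip the low surrogate
                } else {
                    bytes += 1;                   // unpaired surrogate: assumed to encode as '?' like String.getBytes
                }
            } else {
                bytes += 3;                       // rest of the Basic Multilingual Plane
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        int charCount = 716_000_000; // assumed reading of "716MB" from the issue text
        System.out.println("worst case exceeds Integer.MAX_VALUE: "
                + (worstCaseUtf8Bytes(charCount) > Integer.MAX_VALUE));
        System.out.println("exact bytes for \"hello\": " + exactUtf8Length("hello"));
    }
}
```

For mostly-ASCII text the exact count is close to one byte per char, so a 716,000,000-char value would fit in a single array even though the worst-case estimate does not.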



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

