lucene-solr-user mailing list archives

From James X <hello.nigerian.spamm...@gmail.com>
Subject Fwd: java.lang.RuntimeException: after flush: fdx size mismatch
Date Fri, 29 May 2009 03:08:27 GMT
An update: With assertions enabled, I still get fdx size mismatch exceptions
in the same manner as before.

Without assertions, the fdx size mismatch exception was followed by a chain
of LockObtainFailedExceptions. With assertions, there is also a chain of
NullPointerExceptions following the LockObtainFailedExceptions. This is the
only change in behaviour - stack trace for the NPEs below:

SEVERE: java.lang.NullPointerException
        at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:143)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)

---------- Forwarded message ----------
From: James X <hello.nigerian.spammers@gmail.com>
Date: Thu, May 28, 2009 at 7:53 PM
Subject: Fwd: java.lang.RuntimeException: after flush: fdx size mismatch
To: solr-user@lucene.apache.org


My apologies for the delay in running this patched Lucene build - I was
temporarily pulled onto another piece of work.

Here is a sample 'fdx size mismatch' exception using the patch Mike
supplied:

SEVERE: java.lang.RuntimeException: after flush: fdx size mismatch: 1 docs
vs 0 length in bytes of _1i.fdx exists=false didInit=false inc=0 dSO=1
fieldsWriter.doClose=true fieldsWriter.indexFilePointer=12
fieldsWriter.fieldsFilePointer=2395
        at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:96)
        at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
        at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
        at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1638)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1602)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1578)
        at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:153)


Will now run with assertions enabled and see how that affects the behaviour!
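
For anyone repeating this test: Java assertions are off by default and are switched on with the JVM flag -ea (or, for Lucene's packages only, -ea:org.apache.lucene...). Where that flag goes depends on the servlet container, which this thread does not specify. The sketch below is a standard idiom for confirming from inside the JVM that assertions really are on; the class name AssertionCheck is made up for illustration.

```java
public final class AssertionCheck {
    private AssertionCheck() {}

    /** Returns true only when this class is running with assertions enabled. */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert only executes when assertions are
        // on; with assertions off the whole statement is skipped and
        // 'enabled' stays false.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Running it once with and once without -ea shows whether the flag is actually reaching the JVM.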

Thanks,
James

---------- Forwarded message ----------
From: James X <hello.nigerian.spammers@gmail.com>
Date: Thu, May 21, 2009 at 2:24 PM
Subject: Re: java.lang.RuntimeException: after flush: fdx size mismatch
To: solr-user@lucene.apache.org


Hi Mike, Documents are web pages, about 20 fields, mostly strings, a couple
of integers, booleans and one html field (for document body content).

I do have a multi-threaded client pushing docs to Solr, so yes, I suppose
that would mean I have several active Solr worker threads.

The only exceptions I have are the RuntimeException flush errors, followed
by a handful (normally 10-20) of LockObtainFailedExceptions, which I
presumed were being caused by the faulty threads dying and failing to
release locks.

Oh wait, I am getting WstxUnexpectedCharException exceptions every now and
then:
SEVERE: com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character
((CTRL-CHAR, code 8))
 at [row,col {unknown-source}]: [1,26070]
        at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:675)
        at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4668)
        at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
        at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
        at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649)
        at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
        at org.apache.solr.handler.XmlUpdateRequestHandler.readDoc(XmlUpdateRequestHandler.java:327)

I presumed these were caused by character encoding issues, but haven't
looked into them at all yet.
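
A note on the Woodstox error above: code 8 is the backspace control character, which XML 1.0 does not allow in a document at all, so the parser rejects it regardless of the encoding used. One possible client-side workaround, sketched here under assumptions, is to drop such characters from field values before building the update XML. The class and method names (XmlSanitizer, stripInvalidXmlChars) are hypothetical and not part of Solr or Woodstox; the allowed ranges follow the Char production of the XML 1.0 specification.

```java
public final class XmlSanitizer {
    private XmlSanitizer() {}

    /** True if the code point is permitted by the XML 1.0 Char production. */
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns a copy of the input with every XML-illegal code point removed. */
    public static String stripInvalidXmlChars(String in) {
        StringBuilder out = new StringBuilder(in.length());
        for (int i = 0; i < in.length(); ) {
            // Work on code points so that surrogate pairs are kept together;
            // an unpaired surrogate falls outside the valid ranges and is dropped.
            int cp = in.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // "\b" is the backspace character (code 8) from the error message.
        String dirty = "body\btext";
        System.out.println(stripInvalidXmlChars(dirty));
    }
}
```

Silently dropping characters changes the stored content, so replacing them with a space may be preferable for the html body field. Either way this only addresses the parse error and is not known to be related to the fdx size mismatch.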

Thanks again for your help! I'll make some time this afternoon to build some
patched Lucene jars and get the results.


On Thu, May 21, 2009 at 5:06 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Another question: are there any other exceptions in your logs?  Eg
> problems adding certain documents, or anything?
>
> Mike
>
> On Wed, May 20, 2009 at 11:18 AM, James X
> <hello.nigerian.spammers@gmail.com> wrote:
> > Hi Mike, thanks for the quick response:
> >
> > $ java -version
> > java version "1.6.0_11"
> > Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
> > Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)
> >
> > I hadn't noticed the 268m trigger for LUCENE-1521 - I'm definitely not
> > hitting that yet!
> >
> > The exception always reports 0 length, but the number of docs varies,
> > heavily weighted towards one or two docs. Of the last 130 or so exceptions:
> >     89 1 docs vs 0 length
> >     20 2 docs vs 0 length
> >      9 3 docs vs 0 length
> >      1 4 docs vs 0 length
> >      3 5 docs vs 0 length
> >      2 6 docs vs 0 length
> >      1 7 docs vs 0 length
> >      1 9 docs vs 0 length
> >      1 10 docs vs 0 length
> >
> > The only unusual thing I can think of that we're doing with Solr is
> > aggressively CREATE-ing and UNLOAD-ing cores. I've not been able to spot a
> > pattern between core admin operations and these exceptions, however...
> >
> > James
> >
> > On Wed, May 20, 2009 at 2:37 AM, Michael McCandless <
> > lucene@mikemccandless.com> wrote:
> >
> >> Hmm... somehow Lucene is flushing a new segment on closing the
> >> IndexWriter, and thinks 1 doc had been added to the stored fields
> >> file, yet the fdx file is the wrong size (0 bytes).  This check (&
> >> exception) are designed to prevent corruption from entering the index,
> >> so it's at least good to see CheckIndex passes after this.
> >>
> >> I don't think you're hitting LUCENE-1521: that issue only happens if a
> >> single segment has more than ~268 million docs.
> >>
> >> Which exact JRE version are you using?
> >>
> >> When you hit this exception, is it always "1 docs vs 0 length in bytes"?
> >>
> >> Mike
> >>
> >> On Wed, May 20, 2009 at 3:19 AM, James X
> >> <hello.nigerian.spammers@gmail.com> wrote:
> >> > Hello all, I'm running Solr 1.3 in a multi-core environment. There are up to
> >> > 2000 active cores in each Solr webapp instance at any given time.
> >> >
> >> > I've noticed occasional errors such as:
> >> > SEVERE: java.lang.RuntimeException: after flush: fdx size mismatch: 1 docs
> >> > vs 0 length in bytes of _h.fdx
> >> >        at org.apache.lucene.index.StoredFieldsWriter.closeDocStore(StoredFieldsWriter.java:94)
> >> >        at org.apache.lucene.index.DocFieldConsumers.closeDocStore(DocFieldConsumers.java:83)
> >> >        at org.apache.lucene.index.DocFieldProcessor.closeDocStore(DocFieldProcessor.java:47)
> >> >        at org.apache.lucene.index.DocumentsWriter.closeDocStore(DocumentsWriter.java:367)
> >> >        at org.apache.lucene.index.DocumentsWriter.flush(DocumentsWriter.java:567)
> >> >        at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3540)
> >> >        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3450)
> >> >        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:1638)
> >> >        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1602)
> >> >        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1578)
> >> >        at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:153)
> >> >
> >> > during commit / optimise operations.
> >> >
> >> > These errors then cause cascading errors during updates on the offending
> >> > cores:
> >> > SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed
> >> > out: SingleInstanceLock: write.lock
> >> >        at org.apache.lucene.store.Lock.obtain(Lock.java:85)
> >> >        at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1070)
> >> >        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:924)
> >> >        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:116)
> >> >        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:122)
> >> >
> >> > This looks like http://issues.apache.org/jira/browse/LUCENE-1521, but when I
> >> > upgraded Lucene to 2.4.1 under Solr 1.3, the issue still remains.
> >> >
> >> > CheckIndex doesn't find any problems with the index, and problems disappear
> >> > after an (inconvenient, for me) restart of Solr.
> >> >
> >> > Firstly, as the symptoms are so close to those in 1521, can I check that my
> >> > Lucene upgrade method should work:
> >> > - unzip the Solr 1.3 war
> >> > - remove the Lucene 2.4dev jars
> >> > (lucene-core, lucene-spellchecker, lucene-snowball, lucene-queries,
> >> > lucene-memory, lucene-highlighter, lucene-analyzers)
> >> > - move in the Lucene 2.4.1 jars
> >> > - rezip the directory structures as solr.war.
> >> >
> >> > I think this has worked, as solr/default/admin/registry.jsp shows:
> >> >  <lucene-spec-version>2.4.1</lucene-spec-version>
> >> >  <lucene-impl-version>2.4.1 750176 - 2009-03-04
> >> > 21:56:52</lucene-impl-version>
> >> >
> >> > Secondly, if this Lucene fix isn't the right solution to this problem, can
> >> > anyone suggest an alternative approach? The only problem I've had up to now
> >> > has to do with the number of allowed file handles, which was fixed by
> >> > changing limits.conf (RHEL machine).
> >> >
> >> > Many thanks!
> >> > James
> >> >
> >>
> >
>
