Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
Message-Id: <46392609-A138-4546-8DBC-7839801E5280@mikemccandless.com>
From: Michael McCandless <lucene@mikemccandless.com>
To: java-user@lucene.apache.org
In-Reply-To: <126142c0809260844x3551cfaaj21863356c5a98ba3@mail.gmail.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v929.2)
Subject: Re: Caused by: java.io.IOException: read past EOF on Slave
Date: Fri, 26 Sep 2008 12:41:44 -0400
References: <19682684.post@talk.nabble.com>
 <BF8A2B92-035C-483A-B955-447BDB0C1C89@mikemccandless.com>
 <126142c0809260844x3551cfaaj21863356c5a98ba3@mail.gmail.com>


This one looks spooky!

Is it easily repeated?  If you could print out which 2 terms you had  
tried to delete, and then zip up the index just before deleting those  
docs (after closing the writer) and send to me, I can try to  
understand what's wrong with the index.  It looks as if the *.tis file  
for one of the segments is truncated.

If you capture the series of add/update/delete documents, can you get  
a standalone Java test to show this?

Does this test create an entirely new index?

We did change the index format in 2.4 to use "true" UTF8 encoding for  
all text content; not sure that this applies here (to  
BufferedIndexReader it's all bytes) but it may.

BufferedIndexReader in general can do random IO, especially when  
reading the term dict file (*.tis), when you

Mike

Marcelo Ochoa wrote:

> Michael:
>  I just start testing 2.4rc2 running inside OJVM.
> I found a similar stack trace during indexing:
> IW 3 [Root Thread]:   flush: segment=_3 docStoreSegment=_3
> docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
> numDocs=2 numBufDelTerms=2
> IW 3 [Root Thread]:   index before flush _1:C2->_1 _2:C2->_2
> IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
> IW 3 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=264
> docs/MB=7,943.758 new/old=0.237%
> IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted
> docIDs and 0 deleted queries on 3 segments.
> IW 3 [Root Thread]: hit exception flushing deletes
> Exception in thread "Root Thread" java.io.IOException: read past EOF
>        at  
> org 
> .apache 
> .lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
>        at  
> org 
> .apache 
> .lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>        at  
> org 
> .apache 
> .lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>        at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>        at  
> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>        at  
> org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>        at  
> org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>        at  
> org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>        at  
> org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
>        at  
> org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
>        at  
> org 
> .apache 
> .lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
>        at  
> org 
> .apache 
> .lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
>        at  
> org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
>        at  
> org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
>        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>        at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>        at  
> org 
> .apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java: 
> 1308)
>
>  I'll reinstall with a full debug info to see all line numbers in
> Lucene java code.
>  Is there a list of semantic changes at BufferedIndeInput code?
>  I mean it do sequential or random writes for example.
>  But anyway, I just compiled with latest code and ran my test suites,
> I'll investigate the problem a bit more.
>  Best regards, Marcelo.
>
> On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
>>
>> Can you describe the sequence of steps that your replication  
>> process goes
>> through?
>>
>> Also, which filesystem is the index being accessed through?
>>
>> Mike
>>
>> rahul_k123 wrote:
>>
>>>
>>> First of all, thanks to all the people who helped me in getting  
>>> the lucene
>>> replication setup working and right now its live in our  
>>> production :-)
>>>
>>> Everything working fine, except that i am seeing some exceptions on
>>> slaves.
>>>
>>> The following is the one which is occuring more often on slaves
>>>
>>> at
>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java: 
>>> 441)
>>>      at
>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>      at
>>>
>>> java.util.concurrent.ThreadPoolExecutor 
>>> $Worker.runTask(ThreadPoolExecutor.java:885)
>>>      at
>>>
>>> java.util.concurrent.ThreadPoolExecutor 
>>> $Worker.run(ThreadPoolExecutor.java:907)
>>>      at java.lang.Thread.run(Thread.java:619)
>>> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index
>>> [data_dir/index]: [read past EOF]
>>>      at
>>>
>>> com 
>>> .lucene 
>>> .LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
>>>      ... 12 more
>>> Caused by: java.io.IOException: read past EOF
>>>      at
>>>
>>> org 
>>> .apache 
>>> .lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
>>>      at
>>>
>>> org 
>>> .apache 
>>> .lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java: 
>>> 38)
>>>      at org.apache.lucene.store.IndexInput.readInt(IndexInput.java: 
>>> 66)
>>>      at  
>>> org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
>>>      at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java: 
>>> 147)
>>>      at
>>> org.apache.lucene.index.SegmentReader.document(SegmentReader.java: 
>>> 659)
>>>      at
>>>
>>> org 
>>> .apache 
>>> .lucene.index.MultiSegmentReader.document(MultiSegmentReader.java: 
>>> 257)
>>>      at
>>> org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>
>>> and the second one is
>>>
>>> at java.util.concurrent.Executors 
>>> $RunnableAdapter.call(Executors.java:441)
>>>      at
>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>      at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>      at
>>>
>>> java.util.concurrent.ThreadPoolExecutor 
>>> $Worker.runTask(ThreadPoolExecutor.java:885)
>>>      at
>>>
>>> java.util.concurrent.ThreadPoolExecutor 
>>> $Worker.run(ThreadPoolExecutor.java:907)
>>>      at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.lang.IllegalArgumentException: attempt to access a  
>>> deleted
>>> document
>>>      at
>>> org.apache.lucene.index.SegmentReader.document(SegmentReader.java: 
>>> 657)
>>>      at
>>>
>>> org 
>>> .apache 
>>> .lucene.index.MultiSegmentReader.document(MultiSegmentReader.java: 
>>> 257)
>>>      at
>>> org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>> This is on master index .
>>>
>>>
>>>
>>> Any help is appreciated
>>>
>>> Thanks.
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
>>> Sent from the Lucene - Java Users mailing list archive at  
>>> Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
>
> -- 
> Marcelo F. Ochoa
> http://marceloochoa.blogspot.com/
> http://marcelo.ochoa.googlepages.com/home
> ______________
> Do you Know DBPrism? Look @ DB Prism's Web Site
> http://www.dbprism.com.ar/index.html
> More info?
> Chapter 17 of the book "Programming the Oracle Database using Java &
> Web Services"
> http://www.amazon.com/gp/product/1555583296/
> Chapter 21 of the book "Professional XML Databases" - Wrox Press
> http://www.amazon.com/gp/product/1861003587/
> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
> http://www.oreilly.com/catalog/oracleopen/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org