From: Michael McCandless
To: java-user@lucene.apache.org
Subject: Re: Caused by: java.io.IOException: read past EOF on Slave
Date: Fri, 26 Sep 2008 12:41:44 -0400

This one looks spooky!

Is it easily repeated? If you could print out which 2 terms you had
tried to delete, and then zip up the index just before deleting those
docs (after closing the writer) and send it to me, I can try to
understand what's wrong with the index. It looks as if the *.tis file
for one of the segments is truncated.

If you capture the series of add/update/delete documents, can you get
a standalone Java test to show this?

Does this test create an entirely new index? We did change the index
format in 2.4 to use "true" UTF8 encoding for all text content; not
sure that this applies here (to BufferedIndexInput it's all bytes) but
it may.

BufferedIndexInput in general can do random IO, especially when
reading the term dict file (*.tis), when you

Mike

Marcelo Ochoa wrote:

> Michael:
> I just started testing 2.4rc2 running inside OJVM.
> I found a similar stack trace during indexing:
> IW 3 [Root Thread]: flush: segment=_3 docStoreSegment=_3 docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false numDocs=2 numBufDelTerms=2
> IW 3 [Root Thread]: index before flush _1:C2->_1 _2:C2->_2
> IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
> IW 3 [Root Thread]: DW: oldRAMSize=111616 newFlushedSize=264 docs/MB=7,943.758 new/old=0.237%
> IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted docIDs and 0 deleted queries on 3 segments.
> IW 3 [Root Thread]: hit exception flushing deletes
> Exception in thread "Root Thread" java.io.IOException: read past EOF
>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
>     at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>     at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>     at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>     at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>     at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>     at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>     at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
>     at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
>     at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
>     at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
>     at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
>     at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
>     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>     at org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308)
>
> I'll reinstall with full debug info to see all line numbers in the
> Lucene Java code.
> Is there a list of semantic changes in the BufferedIndexInput code?
> I mean, does it do sequential or random writes, for example?
> But anyway, I just compiled with the latest code and ran my test
> suites; I'll investigate the problem a bit more.
> Best regards, Marcelo.
>
> On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless wrote:
>>
>> Can you describe the sequence of steps that your replication
>> process goes through?
>>
>> Also, which filesystem is the index being accessed through?
>>
>> Mike
>>
>> rahul_k123 wrote:
>>
>>> First of all, thanks to all the people who helped me in getting
>>> the Lucene replication setup working, and right now it's live in
>>> our production :-)
>>>
>>> Everything is working fine, except that I am seeing some
>>> exceptions on slaves.
>>>
>>> The following is the one which is occurring more often on slaves
>>>
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index [data_dir/index]: [read past EOF]
>>>     at com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
>>>     ... 12 more
>>> Caused by: java.io.IOException: read past EOF
>>>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
>>>     at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>>>     at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
>>>     at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
>>>     at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
>>>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
>>>     at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>
>>> and the second one is
>>>
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>     at java.lang.Thread.run(Thread.java:619)
>>> Caused by: java.lang.IllegalArgumentException: attempt to access a deleted document
>>>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
>>>     at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>
>>> This is on the master index.
>>>
>>> Any help is appreciated.
>>>
>>> Thanks.
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>
> --
> Marcelo F. Ochoa
> http://marceloochoa.blogspot.com/
> http://marcelo.ochoa.googlepages.com/home
> ______________
> Do you Know DBPrism? Look @ DB Prism's Web Site
> http://www.dbprism.com.ar/index.html
> More info?
> Chapter 17 of the book "Programming the Oracle Database using Java &
> Web Services"
> http://www.amazon.com/gp/product/1555583296/
> Chapter 21 of the book "Professional XML Databases" - Wrox Press
> http://www.amazon.com/gp/product/1861003587/
> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
> http://www.oreilly.com/catalog/oracleopen/
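
Editorial sketch, not part of the original thread: one rough way to
test the "truncated file" theory on a replicated setup is to compare
file lengths between a master snapshot and the slave copy. The class
and method names below (IndexCopyCheck, findMismatches) are
hypothetical, the code uses only plain java.nio.file calls rather than
any Lucene API, and it assumes both directories are quiescent while it
runs. A length match does not prove the contents are identical, and
this check says nothing about a reader that opened the index while a
copy was still in progress.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not a Lucene class): reports files that exist in the
// master directory but are missing or differ in length in the slave copy.
class IndexCopyCheck {

    static List<String> findMismatches(Path masterDir, Path slaveDir) throws IOException {
        List<String> problems = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(masterDir)) {
            for (Path masterFile : files) {
                if (!Files.isRegularFile(masterFile)) {
                    continue; // skip subdirectories and special files
                }
                String name = masterFile.getFileName().toString();
                Path slaveFile = slaveDir.resolve(name);
                if (!Files.exists(slaveFile)) {
                    problems.add(name + ": missing on slave");
                } else {
                    long masterSize = Files.size(masterFile);
                    long slaveSize = Files.size(slaveFile);
                    if (masterSize != slaveSize) {
                        // A shorter slave file (e.g. a *.tis or *.fdt file) would be
                        // consistent with a "read past EOF" when it is read.
                        problems.add(name + ": master=" + masterSize
                                + " bytes, slave=" + slaveSize + " bytes");
                    }
                }
            }
        }
        Collections.sort(problems); // stable, readable output order
        return problems;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: IndexCopyCheck <masterDir> <slaveDir>");
            return;
        }
        for (String problem : findMismatches(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(problem);
        }
    }
}
```

If this reports nothing on a quiescent pair of directories, the copy
itself is less likely to be the culprit, and the timing of when the
slave reopens its reader relative to the copy becomes the more
interesting question.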