lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcelo Ochoa" <marcelo.oc...@gmail.com>
Subject Re: Caused by: java.io.IOException: read past EOF on Slave
Date Fri, 26 Sep 2008 18:53:44 GMT
Mike:
  Actually there is more issues at first glance with OJVMDirectory integration.
  Note this, I am creating an index with two simple documents:
INFO: Performing: SELECT /*+ DYNAMIC_SAMPLING(0) RULE NOCACHE(T1) */
T1.rowid,F1,extractValue(F2,'/emp/name/text()')
"name",extractValue(F2,'/emp/@id') "id" FROM LUCENE.T1 for update nowait
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAA>
indexed,tokenized<F1:001> indexed,tokenized<name:ravi>
indexed,tokenized<id:01>>
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAB>
indexed,tokenized<F1:003> indexed,tokenized<name:murthy>
indexed,tokenized<id:03>>
IW 10 [Root Thread]:   flush: segment=_0 docStoreSegment=_0
docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
numDocs=2
numBufDelTerms=0
IW 10 [Root Thread]:   index before flush
IW 10 [Root Thread]: DW: flush postings as segment _0 numDocs=2
IW 10 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=166
docs/MB=12,633.446 new/old=0.149%
IFD [Root Thread]: now checkpoint "segments_1" [1 segments ; isCommit = false]
IW 10 [Root Thread]: LMP: findMerges: 1 segments
IW 10 [Root Thread]: LMP:   level -1.0 to 2.2741578: 1 segments
IW 10 [Root Thread]: CMS: now merge
IW 10 [Root Thread]: CMS:   index: _0:C2->_0
IW 10 [Root Thread]: CMS:   no more merges pending; now return
IW 10 [Root Thread]: now flush at close
IW 10 [Root Thread]:   flush: segment=null docStoreSegment=_0
docStoreOffset=2 flushDocs=false flushDeletes=true flushDocStores=true
numDocs=0 numBufDelTerms=0
IW 10 [Root Thread]:   index before flush _0:C2->_0
IW 10 [Root Thread]:   flush shared docStore segment _0
IW 10 [Root Thread]: DW: closeDocStore: 2 files to flush to segment _0 numDocs=2
IW 10 [Root Thread]: CMS: now merge
IW 10 [Root Thread]: CMS:   index: _0:C2->_0
IW 10 [Root Thread]: CMS:   no more merges pending; now return
IW 10 [Root Thread]: now call final commit()
IW 10 [Root Thread]: startCommit(): start sizeInBytes=0
IW 10 [Root Thread]: startCommit index=_0:C2->_0 changeCount=2
IW 10 [Root Thread]: now sync _0.fnm
IW 10 [Root Thread]: now sync _0.frq
IW 10 [Root Thread]: now sync _0.prx
IW 10 [Root Thread]: now sync _0.tis
IW 10 [Root Thread]: now sync _0.tii
IW 10 [Root Thread]: now sync _0.nrm
IW 10 [Root Thread]: now sync _0.fdx
IW 10 [Root Thread]: now sync _0.fdt
IW 10 [Root Thread]: done all syncs
IW 10 [Root Thread]: commit: pendingCommit != null
IFD [Root Thread]: now checkpoint "segments_2" [1 segments ; isCommit = true]
IFD [Root Thread]: deleteCommits: now decRef commit "segments_1"
IFD [Root Thread]: delete "segments_1"
IW 10 [Root Thread]: commit: done
IW 10 [Root Thread]: at close: _0:C2->_0
Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.LuceneDomainIndex
ODCIIndexCreate
FINER: RETURN 0

Index created.

  And when I am trying to read the index I got:
INFO: Analyzer: org.apache.lucene.analysis.WhitespaceAnalyzer@f2164127
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
INFO: qryStr: DESC(name:ravi)
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
INFO: storing cachingFilter: -1378376940 and searcher: 781713581
qryStr: DESC(name:ravi)
Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex getSort
INFO: using sort: <score>,<doc>
Exception in thread "Root Thread" java.lang.IndexOutOfBoundsException:
Index: 6, Size: 4
        at java.util.ArrayList.RangeCheck(ArrayList.java)
        at java.util.ArrayList.get(ArrayList.java)
        at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java)
        at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java)
        at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
        at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
        at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
        at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java)
        at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java)
        at org.apache.lucene.search.Similarity.idf(Similarity.java)
        at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java)
        at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java)
        at org.apache.lucene.search.Query.weight(Query.java)
        at org.apache.lucene.search.Hits.<init>(Hits.java:85)
        at org.apache.lucene.search.Searcher.search(Searcher.java)
        at org.apache.lucene.indexer.LuceneDomainIndex.ODCIStart(LuceneDomainIndex.java)

  Which definetly means that something is not well saved at OJVM
directory BLOB storage :(
  This are my files:
SQL> select file_size,name from it1$t;

 FILE_SIZE NAME
---------- ------------------------------
        10 parameters
         1 updateCount
        28 segments_1
        20 segments.gen
         8 _0.frq
         8 _0.prx
       103 _0.tis
        35 _0.tii
        12 _0.nrm
        22 _0.fnm
        48 _0.fdt
        20 _0.fdx
        62 segments_2
  I'll add some debugging information at my classes which save/load
buffers to see how many calls and which arguments are used.
  Marcelo.

On Fri, Sep 26, 2008 at 1:41 PM, Michael McCandless
<lucene@mikemccandless.com> wrote:
>
> This one looks spooky!
>
> Is it easily repeated?  If you could print out which 2 terms you had tried
> to delete, and then zip up the index just before deleting those docs (after
> closing the writer) and send to me, I can try to understand what's wrong
> with the index.  It looks as if the *.tis file for one of the segments is
> truncated.
>
> If you capture the series of add/update/delete documents, can you get a
> standalone Java test to show this?
>
> Does this test create an entirely new index?
>
> We did change the index format in 2.4 to use "true" UTF8 encoding for all
> text content; not sure that this applies here (to BufferedIndexReader it's
> all bytes) but it may.
>
> BufferedIndexReader in general can do random IO, especially when reading the
> term dict file (*.tis), when you
>
> Mike
>
> Marcelo Ochoa wrote:
>
>> Michael:
>>  I just start testing 2.4rc2 running inside OJVM.
>> I found a similar stack trace during indexing:
>> IW 3 [Root Thread]:   flush: segment=_3 docStoreSegment=_3
>> docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
>> numDocs=2 numBufDelTerms=2
>> IW 3 [Root Thread]:   index before flush _1:C2->_1 _2:C2->_2
>> IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
>> IW 3 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=264
>> docs/MB=7,943.758 new/old=0.237%
>> IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted
>> docIDs and 0 deleted queries on 3 segments.
>> IW 3 [Root Thread]: hit exception flushing deletes
>> Exception in thread "Root Thread" java.io.IOException: read past EOF
>>       at
>> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
>>       at
>> org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>>       at
>> org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>>       at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>>       at
>> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>>       at
>> org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>>       at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>>       at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>>       at
>> org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
>>       at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
>>       at
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
>>       at
>> org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
>>       at
>> org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
>>       at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
>>       at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>>       at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>>       at
>> org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308)
>>
>>  I'll reinstall with a full debug info to see all line numbers in
>> Lucene java code.
>>  Is there a list of semantic changes at BufferedIndeInput code?
>>  I mean it do sequential or random writes for example.
>>  But anyway, I just compiled with latest code and ran my test suites,
>> I'll investigate the problem a bit more.
>>  Best regards, Marcelo.
>>
>> On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless
>> <lucene@mikemccandless.com> wrote:
>>>
>>> Can you describe the sequence of steps that your replication process goes
>>> through?
>>>
>>> Also, which filesystem is the index being accessed through?
>>>
>>> Mike
>>>
>>> rahul_k123 wrote:
>>>
>>>>
>>>> First of all, thanks to all the people who helped me in getting the
>>>> lucene
>>>> replication setup working and right now its live in our production :-)
>>>>
>>>> Everything working fine, except that i am seeing some exceptions on
>>>> slaves.
>>>>
>>>> The following is the one which is occuring more often on slaves
>>>>
>>>> at
>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>>     at
>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>     at
>>>>
>>>>
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>     at
>>>>
>>>>
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index
>>>> [data_dir/index]: [read past EOF]
>>>>     at
>>>>
>>>>
>>>> com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
>>>>     ... 12 more
>>>> Caused by: java.io.IOException: read past EOF
>>>>     at
>>>>
>>>>
>>>> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
>>>>     at
>>>>
>>>>
>>>> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>>>>     at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
>>>>     at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
>>>>     at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
>>>>     at
>>>> org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
>>>>     at
>>>>
>>>>
>>>> org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>>     at
>>>> org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>>
>>>> and the second one is
>>>>
>>>> at
>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>>     at
>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>     at
>>>>
>>>>
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>     at
>>>>
>>>>
>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>     at java.lang.Thread.run(Thread.java:619)
>>>> Caused by: java.lang.IllegalArgumentException: attempt to access a
>>>> deleted
>>>> document
>>>>     at
>>>> org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
>>>>     at
>>>>
>>>>
>>>> org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>>     at
>>>> org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>> This is on master index .
>>>>
>>>>
>>>>
>>>> Any help is appreciated
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> View this message in context:
>>>>
>>>> http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>>
>> --
>> Marcelo F. Ochoa
>> http://marceloochoa.blogspot.com/
>> http://marcelo.ochoa.googlepages.com/home
>> ______________
>> Do you Know DBPrism? Look @ DB Prism's Web Site
>> http://www.dbprism.com.ar/index.html
>> More info?
>> Chapter 17 of the book "Programming the Oracle Database using Java &
>> Web Services"
>> http://www.amazon.com/gp/product/1555583296/
>> Chapter 21 of the book "Professional XML Databases" - Wrox Press
>> http://www.amazon.com/gp/product/1861003587/
>> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
>> http://www.oreilly.com/catalog/oracleopen/
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



-- 
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Want to integrate Lucene and Oracle?
http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
Is Oracle 11g REST ready?
http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message