lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcelo Ochoa" <marcelo.oc...@gmail.com>
Subject Re: Caused by: java.io.IOException: read past EOF on Slave
Date Tue, 30 Sep 2008 12:06:33 GMT
Michael:
  I have OJVMDirectory working with 2.4rc2 code base.
  I have refactored Output and Input streams classes according to
latest implementation of Buffered base classes and works OK.
  All of my tests suites runs OK also the bug in the 11g JITs compiler
is not reproducible.
  I'll test this new implementation of Input/Output stream classes if
they are compatible with 2.3 code base to release a new binary version
of Oracle Lucene integration with 2.3 distribution and once 2.4
release is in production I'll release another binary version too.
  Best regards, Marcelo.

On Mon, Sep 29, 2008 at 6:58 AM, Michael McCandless
<lucene@mikemccandless.com> wrote:
>
> Marcelo,
>
> Do you have any sense whether this is an issue with your integration (eg
> your Directory implementation that stores data in BLOB columns) vs something
> with Lucene 2.4?
>
> It seems odd to me that there would be a bug in your Directory
> implementation that 2.3 didn't tickle but 2.4 did.
>
> Can you dump the index, as stored in BLOB columns, out into the filesystem,
> and run CheckIndex on it?  (Or maybe run CheckIndex within Oracle).
>
> Mike
>
> Marcelo Ochoa wrote:
>
>> Mike:
>>  Actually there is more issues at first glance with OJVMDirectory
>> integration.
>>  Note this, I am creating an index with two simple documents:
>> INFO: Performing: SELECT /*+ DYNAMIC_SAMPLING(0) RULE NOCACHE(T1) */
>> T1.rowid,F1,extractValue(F2,'/emp/name/text()')
>> "name",extractValue(F2,'/emp/@id') "id" FROM LUCENE.T1 for update nowait
>> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
>> FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAA>
>> indexed,tokenized<F1:001> indexed,tokenized<name:ravi>
>> indexed,tokenized<id:01>>
>> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
>> FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAB>
>> indexed,tokenized<F1:003> indexed,tokenized<name:murthy>
>> indexed,tokenized<id:03>>
>> IW 10 [Root Thread]:   flush: segment=_0 docStoreSegment=_0
>> docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
>> numDocs=2
>> numBufDelTerms=0
>> IW 10 [Root Thread]:   index before flush
>> IW 10 [Root Thread]: DW: flush postings as segment _0 numDocs=2
>> IW 10 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=166
>> docs/MB=12,633.446 new/old=0.149%
>> IFD [Root Thread]: now checkpoint "segments_1" [1 segments ; isCommit =
>> false]
>> IW 10 [Root Thread]: LMP: findMerges: 1 segments
>> IW 10 [Root Thread]: LMP:   level -1.0 to 2.2741578: 1 segments
>> IW 10 [Root Thread]: CMS: now merge
>> IW 10 [Root Thread]: CMS:   index: _0:C2->_0
>> IW 10 [Root Thread]: CMS:   no more merges pending; now return
>> IW 10 [Root Thread]: now flush at close
>> IW 10 [Root Thread]:   flush: segment=null docStoreSegment=_0
>> docStoreOffset=2 flushDocs=false flushDeletes=true flushDocStores=true
>> numDocs=0 numBufDelTerms=0
>> IW 10 [Root Thread]:   index before flush _0:C2->_0
>> IW 10 [Root Thread]:   flush shared docStore segment _0
>> IW 10 [Root Thread]: DW: closeDocStore: 2 files to flush to segment _0
>> numDocs=2
>> IW 10 [Root Thread]: CMS: now merge
>> IW 10 [Root Thread]: CMS:   index: _0:C2->_0
>> IW 10 [Root Thread]: CMS:   no more merges pending; now return
>> IW 10 [Root Thread]: now call final commit()
>> IW 10 [Root Thread]: startCommit(): start sizeInBytes=0
>> IW 10 [Root Thread]: startCommit index=_0:C2->_0 changeCount=2
>> IW 10 [Root Thread]: now sync _0.fnm
>> IW 10 [Root Thread]: now sync _0.frq
>> IW 10 [Root Thread]: now sync _0.prx
>> IW 10 [Root Thread]: now sync _0.tis
>> IW 10 [Root Thread]: now sync _0.tii
>> IW 10 [Root Thread]: now sync _0.nrm
>> IW 10 [Root Thread]: now sync _0.fdx
>> IW 10 [Root Thread]: now sync _0.fdt
>> IW 10 [Root Thread]: done all syncs
>> IW 10 [Root Thread]: commit: pendingCommit != null
>> IFD [Root Thread]: now checkpoint "segments_2" [1 segments ; isCommit =
>> true]
>> IFD [Root Thread]: deleteCommits: now decRef commit "segments_1"
>> IFD [Root Thread]: delete "segments_1"
>> IW 10 [Root Thread]: commit: done
>> IW 10 [Root Thread]: at close: _0:C2->_0
>> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.LuceneDomainIndex
>> ODCIIndexCreate
>> FINER: RETURN 0
>>
>> Index created.
>>
>>  And when I am trying to read the index I got:
>> INFO: Analyzer: org.apache.lucene.analysis.WhitespaceAnalyzer@f2164127
>> Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex
>> ODCIStart
>> INFO: qryStr: DESC(name:ravi)
>> Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex
>> ODCIStart
>> INFO: storing cachingFilter: -1378376940 and searcher: 781713581
>> qryStr: DESC(name:ravi)
>> Sep 26, 2008 3:44:48 PM org.apache.lucene.indexer.LuceneDomainIndex
>> getSort
>> INFO: using sort: <score>,<doc>
>> Exception in thread "Root Thread" java.lang.IndexOutOfBoundsException:
>> Index: 6, Size: 4
>>       at java.util.ArrayList.RangeCheck(ArrayList.java)
>>       at java.util.ArrayList.get(ArrayList.java)
>>       at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java)
>>       at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java)
>>       at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>>       at
>> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>>       at
>> org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>>       at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>>       at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>>       at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java)
>>       at
>> org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java)
>>       at org.apache.lucene.search.Similarity.idf(Similarity.java)
>>       at
>> org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java)
>>       at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java)
>>       at org.apache.lucene.search.Query.weight(Query.java)
>>       at org.apache.lucene.search.Hits.<init>(Hits.java:85)
>>       at org.apache.lucene.search.Searcher.search(Searcher.java)
>>       at
>> org.apache.lucene.indexer.LuceneDomainIndex.ODCIStart(LuceneDomainIndex.java)
>>
>>  Which definetly means that something is not well saved at OJVM
>> directory BLOB storage :(
>>  This are my files:
>> SQL> select file_size,name from it1$t;
>>
>> FILE_SIZE NAME
>> ---------- ------------------------------
>>       10 parameters
>>        1 updateCount
>>       28 segments_1
>>       20 segments.gen
>>        8 _0.frq
>>        8 _0.prx
>>      103 _0.tis
>>       35 _0.tii
>>       12 _0.nrm
>>       22 _0.fnm
>>       48 _0.fdt
>>       20 _0.fdx
>>       62 segments_2
>>  I'll add some debugging information at my classes which save/load
>> buffers to see how many calls and which arguments are used.
>>  Marcelo.
>>
>> On Fri, Sep 26, 2008 at 1:41 PM, Michael McCandless
>> <lucene@mikemccandless.com> wrote:
>>>
>>> This one looks spooky!
>>>
>>> Is it easily repeated?  If you could print out which 2 terms you had
>>> tried
>>> to delete, and then zip up the index just before deleting those docs
>>> (after
>>> closing the writer) and send to me, I can try to understand what's wrong
>>> with the index.  It looks as if the *.tis file for one of the segments is
>>> truncated.
>>>
>>> If you capture the series of add/update/delete documents, can you get a
>>> standalone Java test to show this?
>>>
>>> Does this test create an entirely new index?
>>>
>>> We did change the index format in 2.4 to use "true" UTF8 encoding for all
>>> text content; not sure that this applies here (to BufferedIndexReader
>>> it's
>>> all bytes) but it may.
>>>
>>> BufferedIndexReader in general can do random IO, especially when reading
>>> the
>>> term dict file (*.tis), when you
>>>
>>> Mike
>>>
>>> Marcelo Ochoa wrote:
>>>
>>>> Michael:
>>>> I just start testing 2.4rc2 running inside OJVM.
>>>> I found a similar stack trace during indexing:
>>>> IW 3 [Root Thread]:   flush: segment=_3 docStoreSegment=_3
>>>> docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
>>>> numDocs=2 numBufDelTerms=2
>>>> IW 3 [Root Thread]:   index before flush _1:C2->_1 _2:C2->_2
>>>> IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
>>>> IW 3 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=264
>>>> docs/MB=7,943.758 new/old=0.237%
>>>> IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted
>>>> docIDs and 0 deleted queries on 3 segments.
>>>> IW 3 [Root Thread]: hit exception flushing deletes
>>>> Exception in thread "Root Thread" java.io.IOException: read past EOF
>>>>     at
>>>>
>>>> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
>>>>     at
>>>>
>>>> org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>>>>     at
>>>>
>>>> org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
>>>>     at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>>>>     at
>>>> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>>>>     at
>>>> org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>>>>     at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>>>>     at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>>>>     at
>>>> org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
>>>>     at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
>>>>     at
>>>>
>>>> org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
>>>>     at
>>>>
>>>> org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
>>>>     at
>>>> org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
>>>>     at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
>>>>     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>>>>     at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
>>>>     at
>>>>
>>>> org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308)
>>>>
>>>> I'll reinstall with a full debug info to see all line numbers in
>>>> Lucene java code.
>>>> Is there a list of semantic changes at BufferedIndeInput code?
>>>> I mean it do sequential or random writes for example.
>>>> But anyway, I just compiled with latest code and ran my test suites,
>>>> I'll investigate the problem a bit more.
>>>> Best regards, Marcelo.
>>>>
>>>> On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless
>>>> <lucene@mikemccandless.com> wrote:
>>>>>
>>>>> Can you describe the sequence of steps that your replication process
>>>>> goes
>>>>> through?
>>>>>
>>>>> Also, which filesystem is the index being accessed through?
>>>>>
>>>>> Mike
>>>>>
>>>>> rahul_k123 wrote:
>>>>>
>>>>>>
>>>>>> First of all, thanks to all the people who helped me in getting the
>>>>>> lucene
>>>>>> replication setup working and right now its live in our production
:-)
>>>>>>
>>>>>> Everything working fine, except that i am seeing some exceptions
on
>>>>>> slaves.
>>>>>>
>>>>>> The following is the one which is occuring more often on slaves
>>>>>>
>>>>>> at
>>>>>>
>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>>>>   at
>>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>>   at java.lang.Thread.run(Thread.java:619)
>>>>>> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index
>>>>>> [data_dir/index]: [read past EOF]
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
>>>>>>   ... 12 more
>>>>>> Caused by: java.io.IOException: read past EOF
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>>>>>>   at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
>>>>>>   at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
>>>>>>   at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
>>>>>>   at
>>>>>> org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>>>>   at
>>>>>> org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>>>>
>>>>>> and the second one is
>>>>>>
>>>>>> at
>>>>>>
>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>>>>>>   at
>>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>>>>>>   at java.lang.Thread.run(Thread.java:619)
>>>>>> Caused by: java.lang.IllegalArgumentException: attempt to access
a
>>>>>> deleted
>>>>>> document
>>>>>>   at
>>>>>> org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
>>>>>>   at
>>>>>>
>>>>>>
>>>>>>
>>>>>> org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
>>>>>>   at
>>>>>> org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
>>>>>> This is on master index .
>>>>>>
>>>>>>
>>>>>>
>>>>>> Any help is appreciated
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>>
>>>>>>
>>>>>> http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
>>>>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Marcelo F. Ochoa
>>>> http://marceloochoa.blogspot.com/
>>>> http://marcelo.ochoa.googlepages.com/home
>>>> ______________
>>>> Do you Know DBPrism? Look @ DB Prism's Web Site
>>>> http://www.dbprism.com.ar/index.html
>>>> More info?
>>>> Chapter 17 of the book "Programming the Oracle Database using Java &
>>>> Web Services"
>>>> http://www.amazon.com/gp/product/1555583296/
>>>> Chapter 21 of the book "Professional XML Databases" - Wrox Press
>>>> http://www.amazon.com/gp/product/1861003587/
>>>> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
>>>> http://www.oreilly.com/catalog/oracleopen/
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>
>>
>>
>> --
>> Marcelo F. Ochoa
>> http://marceloochoa.blogspot.com/
>> http://marcelo.ochoa.googlepages.com/home
>> ______________
>> Want to integrate Lucene and Oracle?
>>
>> http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
>> Is Oracle 11g REST ready?
>> http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>



-- 
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Want to integrate Lucene and Oracle?
http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
Is Oracle 11g REST ready?
http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message