lucene-java-user mailing list archives

From Jimi Hullegård <jimi.hulleg...@mogul.com>
Subject RE: [SPAM] - Re: Caused by: java.io.IOException: read past EOF on Slave - Found word(s) list error in the Text body
Date Mon, 29 Sep 2008 10:52:02 GMT
Is there a specific reason that you format your text this way? I mean, with indentation instead
of line breaks?  It makes it very hard to read, if you ask me.

Just my 2 cents. :)

/Jimi

mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30 stockholm sweden |
+46 8 506 66 172 | +46 765 27 19 55 | jimi.hullegard@mogul.com | www.mogul.com


> -----Original Message-----
> From: Marcelo Ochoa [mailto:marcelo.ochoa@gmail.com]
> Sent: den 26 september 2008 20:54
> To: java-user@lucene.apache.org
> Subject: [SPAM] - Re: Caused by: java.io.IOException: read
> past EOF on Slave - Found word(s) list error in the Text body
>
> Mike:
>   Actually, there are more issues at first glance with
> OJVMDirectory integration.
>   Note this, I am creating an index with two simple documents:
> INFO: Performing: SELECT /*+ DYNAMIC_SAMPLING(0) RULE NOCACHE(T1) */
> T1.rowid,F1,extractValue(F2,'/emp/name/text()')
> "name",extractValue(F2,'/emp/@id') "id" FROM LUCENE.T1 for
> update nowait
> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
> FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAA>
> indexed,tokenized<F1:001> indexed,tokenized<name:ravi>
> indexed,tokenized<id:01>>
> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.TableIndexer index
> FINE: Document<stored/uncompressed,indexed<rowid:AAARLCAAEAAAm2QAAB>
> indexed,tokenized<F1:003> indexed,tokenized<name:murthy>
> indexed,tokenized<id:03>>
> IW 10 [Root Thread]:   flush: segment=_0 docStoreSegment=_0
> docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
> numDocs=2
> numBufDelTerms=0
> IW 10 [Root Thread]:   index before flush
> IW 10 [Root Thread]: DW: flush postings as segment _0 numDocs=2
> IW 10 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=166
> docs/MB=12,633.446 new/old=0.149%
> IFD [Root Thread]: now checkpoint "segments_1" [1 segments ;
> isCommit = false]
> IW 10 [Root Thread]: LMP: findMerges: 1 segments
> IW 10 [Root Thread]: LMP:   level -1.0 to 2.2741578: 1 segments
> IW 10 [Root Thread]: CMS: now merge
> IW 10 [Root Thread]: CMS:   index: _0:C2->_0
> IW 10 [Root Thread]: CMS:   no more merges pending; now return
> IW 10 [Root Thread]: now flush at close
> IW 10 [Root Thread]:   flush: segment=null docStoreSegment=_0
> docStoreOffset=2 flushDocs=false flushDeletes=true flushDocStores=true
> numDocs=0 numBufDelTerms=0
> IW 10 [Root Thread]:   index before flush _0:C2->_0
> IW 10 [Root Thread]:   flush shared docStore segment _0
> IW 10 [Root Thread]: DW: closeDocStore: 2 files to flush to
> segment _0 numDocs=2
> IW 10 [Root Thread]: CMS: now merge
> IW 10 [Root Thread]: CMS:   index: _0:C2->_0
> IW 10 [Root Thread]: CMS:   no more merges pending; now return
> IW 10 [Root Thread]: now call final commit()
> IW 10 [Root Thread]: startCommit(): start sizeInBytes=0
> IW 10 [Root Thread]: startCommit index=_0:C2->_0 changeCount=2
> IW 10 [Root Thread]: now sync _0.fnm
> IW 10 [Root Thread]: now sync _0.frq
> IW 10 [Root Thread]: now sync _0.prx
> IW 10 [Root Thread]: now sync _0.tis
> IW 10 [Root Thread]: now sync _0.tii
> IW 10 [Root Thread]: now sync _0.nrm
> IW 10 [Root Thread]: now sync _0.fdx
> IW 10 [Root Thread]: now sync _0.fdt
> IW 10 [Root Thread]: done all syncs
> IW 10 [Root Thread]: commit: pendingCommit != null
> IFD [Root Thread]: now checkpoint "segments_2" [1 segments ;
> isCommit = true]
> IFD [Root Thread]: deleteCommits: now decRef commit "segments_1"
> IFD [Root Thread]: delete "segments_1"
> IW 10 [Root Thread]: commit: done
> IW 10 [Root Thread]: at close: _0:C2->_0
> Sep 26, 2008 3:44:16 PM org.apache.lucene.indexer.LuceneDomainIndex
> ODCIIndexCreate
> FINER: RETURN 0
>
> Index created.
>
>   And when I tried to read the index I got:
> INFO: Analyzer: org.apache.lucene.analysis.WhitespaceAnalyzer@f2164127
> Sep 26, 2008 3:44:48 PM
> org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
> INFO: qryStr: DESC(name:ravi)
> Sep 26, 2008 3:44:48 PM
> org.apache.lucene.indexer.LuceneDomainIndex ODCIStart
> INFO: storing cachingFilter: -1378376940 and searcher: 781713581
> qryStr: DESC(name:ravi)
> Sep 26, 2008 3:44:48 PM
> org.apache.lucene.indexer.LuceneDomainIndex getSort
> INFO: using sort: <score>,<doc>
> Exception in thread "Root Thread" java.lang.IndexOutOfBoundsException: Index: 6, Size: 4
>         at java.util.ArrayList.RangeCheck(ArrayList.java)
>         at java.util.ArrayList.get(ArrayList.java)
>         at org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java)
>         at org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java)
>         at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
>         at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
>         at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
>         at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java)
>         at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java)
>         at org.apache.lucene.search.Similarity.idf(Similarity.java)
>         at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java)
>         at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java)
>         at org.apache.lucene.search.Query.weight(Query.java)
>         at org.apache.lucene.search.Hits.<init>(Hits.java:85)
>         at org.apache.lucene.search.Searcher.search(Searcher.java)
>         at org.apache.lucene.indexer.LuceneDomainIndex.ODCIStart(LuceneDomainIndex.java)
>
>   Which definitely means that something is not saved correctly in the OJVM
> directory BLOB storage :(
>   These are my files:
> SQL> select file_size,name from it1$t;
>
>  FILE_SIZE NAME
> ---------- ------------------------------
>         10 parameters
>          1 updateCount
>         28 segments_1
>         20 segments.gen
>          8 _0.frq
>          8 _0.prx
>        103 _0.tis
>         35 _0.tii
>         12 _0.nrm
>         22 _0.fnm
>         48 _0.fdt
>         20 _0.fdx
>         62 segments_2
>   I'll add some debugging information to my classes which save/load
> buffers to see how many calls are made and which arguments are used.
>   Marcelo.
>
> On Fri, Sep 26, 2008 at 1:41 PM, Michael McCandless
> <lucene@mikemccandless.com> wrote:
> >
> > This one looks spooky!
> >
> > Is it easily repeated?  If you could print out which 2 terms you had tried
> > to delete, and then zip up the index just before deleting those docs (after
> > closing the writer) and send to me, I can try to understand what's wrong
> > with the index.  It looks as if the *.tis file for one of the segments is
> > truncated.
> >
> > If you capture the series of add/update/delete documents, can you get a
> > standalone Java test to show this?
> >
> > Does this test create an entirely new index?
> >
> > We did change the index format in 2.4 to use "true" UTF8 encoding for all
> > text content; not sure that this applies here (to BufferedIndexReader it's
> > all bytes) but it may.
> >
> > BufferedIndexReader in general can do random IO, especially when reading the
> > term dict file (*.tis), when you
> >
> > Mike
> >
> > Marcelo Ochoa wrote:
> >
> >> Michael:
> >>  I just started testing 2.4rc2 running inside OJVM.
> >> I found a similar stack trace during indexing:
> >> IW 3 [Root Thread]:   flush: segment=_3 docStoreSegment=_3
> >> docStoreOffset=0 flushDocs=true flushDeletes=true flushDocStores=false
> >> numDocs=2 numBufDelTerms=2
> >> IW 3 [Root Thread]:   index before flush _1:C2->_1 _2:C2->_2
> >> IW 3 [Root Thread]: DW: flush postings as segment _3 numDocs=2
> >> IW 3 [Root Thread]: DW:   oldRAMSize=111616 newFlushedSize=264
> >> docs/MB=7,943.758 new/old=0.237%
> >> IW 3 [Root Thread]: DW: apply 2 buffered deleted terms and 0 deleted
> >> docIDs and 0 deleted queries on 3 segments.
> >> IW 3 [Root Thread]: hit exception flushing deletes
> >> Exception in thread "Root Thread" java.io.IOException: read past EOF
> >>       at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java)
> >>       at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
> >>       at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java)
> >>       at org.apache.lucene.index.TermBuffer.read(TermBuffer.java)
> >>       at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java)
> >>       at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java)
> >>       at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
> >>       at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java)
> >>       at org.apache.lucene.index.SegmentTermDocs.seek(SegmentTermDocs.java)
> >>       at org.apache.lucene.index.IndexReader.termDocs(IndexReader.java)
> >>       at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java)
> >>       at org.apache.lucene.index.DocumentsWriter.applyDeletes(DocumentsWriter.java:918)
> >>       at org.apache.lucene.index.IndexWriter.applyDeletes(IndexWriter.java)
> >>       at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java)
> >>       at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
> >>       at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java)
> >>       at org.apache.lucene.indexer.LuceneDomainIndex.sync(LuceneDomainIndex.java:1308)
> >>
> >>  I'll reinstall with full debug info to see all line numbers in
> >> Lucene java code.
> >>  Is there a list of semantic changes in the BufferedIndexInput code?
> >>  I mean, does it do sequential or random writes, for example?
> >>  But anyway, I just compiled with the latest code and ran my test suites;
> >> I'll investigate the problem a bit more.
> >>  Best regards, Marcelo.
> >>
> >> On Fri, Sep 26, 2008 at 7:32 AM, Michael McCandless
> >> <lucene@mikemccandless.com> wrote:
> >>>
> >>> Can you describe the sequence of steps that your replication process goes
> >>> through?
> >>>
> >>> Also, which filesystem is the index being accessed through?
> >>>
> >>> Mike
> >>>
> >>> rahul_k123 wrote:
> >>>
> >>>>
> >>>> First of all, thanks to all the people who helped me in getting the
> >>>> Lucene replication setup working; right now it's live in our production :-)
> >>>>
> >>>> Everything is working fine, except that I am seeing some exceptions on
> >>>> the slaves.
> >>>>
> >>>> The following is the one which is occurring more often on the slaves:
> >>>>
> >>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> >>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> >>>>     at java.lang.Thread.run(Thread.java:619)
> >>>> Caused by: com.IndexingException: [SYSTEM_ERROR] Cannot access index
> >>>> [data_dir/index]: [read past EOF]
> >>>>     at com.lucene.LuceneSearchService.getSearchResults(LuceneSearchService.java:964)
> >>>>     ... 12 more
> >>>> Caused by: java.io.IOException: read past EOF
> >>>>     at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:146)
> >>>>     at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
> >>>>     at org.apache.lucene.store.IndexInput.readInt(IndexInput.java:66)
> >>>>     at org.apache.lucene.store.IndexInput.readLong(IndexInput.java:89)
> >>>>     at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:147)
> >>>>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:659)
> >>>>     at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
> >>>>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
> >>>>
> >>>> and the second one is
> >>>>
> >>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
> >>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
> >>>>     at java.lang.Thread.run(Thread.java:619)
> >>>> Caused by: java.lang.IllegalArgumentException: attempt to access a
> >>>> deleted document
> >>>>     at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:657)
> >>>>     at org.apache.lucene.index.MultiSegmentReader.document(MultiSegmentReader.java:257)
> >>>>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:525)
> >>>>
> >>>> This is on the master index.
> >>>>
> >>>>
> >>>>
> >>>> Any help is appreciated
> >>>>
> >>>> Thanks.
> >>>>
> >>>> --
> >>>> View this message in context:
> >>>> http://www.nabble.com/Caused-by%3A-java.io.IOException%3A-read-past-EOF-on-Slave-tp19682684p19682684.html
> >>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>>
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Marcelo F. Ochoa
> >> http://marceloochoa.blogspot.com/
> >> http://marcelo.ochoa.googlepages.com/home
> >> ______________
> >> Do you Know DBPrism? Look @ DB Prism's Web Site
> >> http://www.dbprism.com.ar/index.html
> >> More info?
> >> Chapter 17 of the book "Programming the Oracle Database using Java &
> >> Web Services"
> >> http://www.amazon.com/gp/product/1555583296/
> >> Chapter 21 of the book "Professional XML Databases" - Wrox Press
> >> http://www.amazon.com/gp/product/1861003587/
> >> Chapter 8 of the book "Oracle & Open Source" - O'Reilly
> >> http://www.oreilly.com/catalog/oracleopen/
> >>
> >>
> >
> >
> >
>
>
>
> --
> Marcelo F. Ochoa
> http://marceloochoa.blogspot.com/
> http://marcelo.ochoa.googlepages.com/home
> ______________
> Want to integrate Lucene and Oracle?
> http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
> Is Oracle 11g REST ready?
> http://marceloochoa.blogspot.com/2008/02/is-oracle-11g-rest-ready.html
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

