lucene-java-user mailing list archives

From Bill Janssen <jans...@parc.com>
Subject Re: lucene-core-2.2.0.jar broken? CorruptIndexException?
Date Thu, 29 Nov 2007 10:34:46 GMT
> Can you try running with the trunk version of Lucene (2.3-dev) and see
> if the error still occurs?  EG you can download this AM's build here:
> 
>   http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/288/artifact/artifacts

Still there.  Here's the dump with last night's build:

/Library/Java/Home/bin/java '-Dcom.parc.uplib.indexing.debugMode=true' '-Dcom.parc.uplib.indexing.indexProperties=contents:title:categories$,*:date@:apparent-mime-type*:authors$\sand\s:comment:abstract:email-message-id*:email-guid*:email-subject:email-from-name:email-from-address*:email-attachment-to*:email-thread-index*:email-references$,*:email-in-reply-to$,*:keywords$,*:album:performer:composer:music-genre*:audio-length:accompaniment:paragraph-ids$,*:sha-hash*'
-classpath "/local/uplib/share/UpLib-1.7/code/lucene-core-2.3-2007-11-29_02-49-31.jar:/local/uplib/share/UpLib-1.7/code/LuceneIndexing.jar"
-Dorg.apache.lucene.writeLockTimeout=20000 com.parc.uplib.indexing.LuceneIndexing "/local/janssen-uplib/index"
update /local/janssen-uplib/docs 01179-00-0750-547 01178-90-9186-558 01178-81-4212-772 01178-81-3305-217
01178-73-1029-141 01178-72-8365-803
updating
doc_root_dir is /local/janssen-uplib/docs
IFD [main]: setInfoStream deletionPolicy=org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy@462851
IW 0 [main]: setInfoStream: dir=org.apache.lucene.store.FSDirectory@/local/janssen-uplib/index
autoCommit=true mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@c56c60 mergeScheduler=org.apache.lucene.index.ConcurrentMergeScheduler@4e280c
ramBufferSizeMB=16.0 maxBuffereDocs=-1 maxBuffereDeleteTerms=-1 maxFieldLength=10000 index=_21:c19686
_22:c92
IW 0 [main]: setMaxFieldLength 2147483647
Working on document /local/janssen-uplib/docs/01179-00-0750-547
  Adding header 'abstract' IT to 01179-00-0750-547
  Adding header 'apparent-mime-type' I to 01179-00-0750-547
  Adding header 'authors' IT to 01179-00-0750-547
  Adding header 'categories' I (ebooks) to 01179-00-0750-547
  Adding header 'categories' I (economics) to 01179-00-0750-547
  Adding header 'categories' I (paper) to 01179-00-0750-547
  Adding header 'citation' I to 01179-00-0750-547
  Adding header 'date' I (20070128) to 01179-00-0750-547
  Adding header 'sha-hash' I to 01179-00-0750-547
  Adding header 'title' IT (Heterogeneity in Price Stickiness and the Real Effects of Monetary
Shocks) to 01179-00-0750-547
  Created empty doc Document<stored/uncompressed,indexed<id:01179-00-0750-547> stored/uncompressed,indexed<uplibdate:20070512>
stored/uncompressed,indexed<uplibtype:whole>>
  Using charset utf8 for contents.txt
  Using language en for contents.txt
    page 0 (2181):  Heterogeneity in Price Stickin
    page 1 (2927):  1 Introduction There is ample 
    page 2 (3135):  In the presence of strategic c
    page 3 (3128):  Motivated by those questions, 
    page 4 (3214):  ploring the tractability of th
    page 5 (2491):  model with Taylor staggered wa
    page 6 (1548):  real rigidities (Ball and Rome
    page 7 (3098):  2.2 Calibrating the sectoral d
    page 8 (1913):  distribution of price stickine
    page 9 (1952):  reported in Table 1. Hencefort
    page 10 (1635):  Figure 2 presents analogous re
    page 11 (1743):  In the absence of strategic co
    page 12 (2806):  Corollary 1 For an arbitrary h
    page 13 (2380):  2.4.2 Growth rate shocks In th
    page 14 (2962):  price changes. With heterogene
    page 15 (3265):  ties and heterogeneity in the 
    page 16 (1962):  complementarities. The results
    page 17 (751):  to the response of the heterog
    page 18 (489):  economies are embedded into th
    page 19 (3295):  2.6 Fitting IRFs with an ident
    page 20 (2066):  Table 3a: Best-Fitting Duratio
    page 21 (2444):  This is an important step beca
    page 22 (1976):  where ? is the discount factor
    page 23 (1183):  Et "? Ct+1 Ct ?? It Pt Pt+1 #
    page 24 (2188):  can be rewritten as: Pk,t =  
    page 25 (1370):  pt = Z 1 0 f (k) pk,tdk, (10) 
    page 26 (3269):  Heterogeneity in price stickin
    page 27 (3117):  Irrespective of the net effect
    page 28 (2084):  set of parameters involve high
    page 29 (575):  0 5 10 15 20 25 30 35 40 0 x 1
    page 30 (2185):  output and falling prices in a
    page 31 (2358):  price changes that minimizes t
    page 32 (2689):  These results are fully consis
    page 33 (3600):  different sources of real rigi
    page 34 (3168):  work in a model with heterogen
    page 35 (2557):  single equation estimation of 
    page 36 (1326):  Taking the limit as  ? 0 in e
    page 37 (1796):  The output gap is constant at 
    page 38 (1066):  The corresponding path for the
    page 39 (1347):  4) Proof of Corollaries 1 and 
    page 40 (2421):  Therefore, for ?  0, the expe
    page 41 (1343):  p (t) = Z 1 0 f (k) ? ?? ?? R 
    page 42 (2117):  As ? ? 0, this clearly converg
    page 43 (1375):  model around the zero inflatio
    page 44 (1497):  pt = Z 1 0 f (k) pk,tdk, yt = 
    page 45 (1128):  Table A.3: Best-Fitting Durati
    page 46 (898):  Multiplying by f (k) ?k and in
    page 47 (1072):  Now, from (23): ?kxk,t = pk,t 
    page 48 (268):  Finally, let t ? pt ? pt?1 de
    page 49 (1694):  References [1] Altissimo, F., 
    page 50 (1874):  [14] Bils, M., P. Klenow and O
    page 51 (2091):  [27] Carlton, D. (1986), The 
    page 52 (1846):  [39] Dixon, H. and E. Kara (20
    page 53 (1530):  [51] Ohanian, L., A. Stockman 
  Using charset utf8 for contents.txt
  Using language en for contents.txt
Added 01179-00-0750-547 (55 versions)
Working on document /local/janssen-uplib/docs/01178-90-9186-558
  Adding header 'abstract' IT to 01178-90-9186-558
  Adding header 'apparent-mime-type' I to 01178-90-9186-558
  Adding header 'authors' IT to 01178-90-9186-558
  Adding header 'authors' IT to 01178-90-9186-558
  Adding header 'authors' IT to 01178-90-9186-558
  Adding header 'authors' IT to 01178-90-9186-558
  Adding header 'authors' IT to 01178-90-9186-558
  Adding header 'categories' I (paper) to 01178-90-9186-558
  Adding header 'categories' I (ebooks) to 01178-90-9186-558
  Adding header 'date' I (20050500) to 01178-90-9186-558
  Adding header 'sha-hash' I to 01178-90-9186-558
  Adding header 'title' IT (Visual-Syntactic Text Formatting: A New Method to Enhance Online)
to 01178-90-9186-558
  Created empty doc Document<stored/uncompressed,indexed<id:01178-90-9186-558> stored/uncompressed,indexed<uplibdate:20070511>
stored/uncompressed,indexed<uplibtype:whole>>
  Using charset utf8 for contents.txt
  Using language en for contents.txt
    page 0 (3332):  Visual-Syntactic Text Formatti
    page 1 (1461):  into this: To make these chang
    page 2 (4620):  complex than a simple, concate
    page 3 (4723):  Among some poorer readers, met
    page 4 (3827):  method does not extract or dir
    page 5 (3388):  (ACT scores) with gains in com
    page 6 (3693):  digital text actually improve 
    page 7 (3400):  exam were administered. For re
    page 8 (3413):  Intermediate and long-term ret
    page 9 (4219):  Student preference and survey 
    page 10 (4417):  Discussion In print media, sim
    page 11 (4358):  time; however, the opposite tr
    page 12 (3950):  More time spent actually readi
    page 13 (4184):  In education, the VSTF method 
    page 14 (3270):  References Armbruster, B.B. (2
    page 15 (3176):  April 1920). Neuroimaging, la
    page 16 (3350):  Klare, G.R., Nichols, W.H., & 
    page 17 (4098):  and narrative skills connect w
    page 18 (3070):  He has conducted laboratory-ba
  Using charset utf8 for contents.txt
  Using language en for contents.txt
Added 01178-90-9186-558 (20 versions)
Working on document /local/janssen-uplib/docs/01178-81-4212-772
  Adding header 'abstract' IT to 01178-81-4212-772
  Adding header 'apparent-mime-type' I to 01178-81-4212-772
  Adding header 'categories' I (newspaper) to 01178-81-4212-772
  Adding header 'categories' I (article) to 01178-81-4212-772
  Adding header 'categories' I (fun) to 01178-81-4212-772
  Adding header 'categories' I (historical) to 01178-81-4212-772
  Adding header 'date' I (19340429) to 01178-81-4212-772
  Adding header 'sha-hash' I to 01178-81-4212-772
  Adding header 'title' IT (Gigantic Robots, Controlled by Wireless, to Fight Our Battles)
to 01178-81-4212-772
  Created empty doc Document<stored/uncompressed,indexed<id:01178-81-4212-772> stored/uncompressed,indexed<uplibdate:20070510>
stored/uncompressed,indexed<uplibtype:whole>>
  Using charset utf8 for contents.txt
  Using language en for contents.txt
    page 0 (261):  iganlic obok ntroIIed b ireles
  Using charset utf8 for contents.txt
  Using language en for contents.txt
Added 01178-81-4212-772 (2 versions)
Working on document /local/janssen-uplib/docs/01178-81-3305-217
  Adding header 'apparent-mime-type' I to 01178-81-3305-217
  Adding header 'authors' IT to 01178-81-3305-217
  Adding header 'categories' I (cartoon) to 01178-81-3305-217
  Adding header 'keywords' I (incentive) to 01178-81-3305-217
  Adding header 'sha-hash' I to 01178-81-3305-217
  Created empty doc Document<stored/uncompressed,indexed<id:01178-81-3305-217> stored/uncompressed,indexed<uplibdate:20070510>
stored/uncompressed,indexed<uplibtype:whole>>
Added 01178-81-3305-217 (1 versions)
Working on document /local/janssen-uplib/docs/01178-73-1029-141
  Adding header 'apparent-mime-type' I to 01178-73-1029-141
  Adding header 'authors' IT to 01178-73-1029-141
  Adding header 'categories' I (article) to 01178-73-1029-141
  Adding header 'date' I (20070514) to 01178-73-1029-141
  Adding header 'sha-hash' I to 01178-73-1029-141
  Adding header 'title' IT (Critical Mass:  Everyone listens to Walter Mossberg) to 01178-73-1029-141
  Created empty doc Document<stored/uncompressed,indexed<id:01178-73-1029-141> stored/uncompressed,indexed<uplibdate:20070509>
stored/uncompressed,indexed<uplibtype:whole>>
  Using charset utf8 for contents.txt
  Using language en for contents.txt
    page 0 (328):  Go Back Print this page Skip t
    page 1 (1024):  Mossberg assesses technology p
    page 2 (4281):  that Mossberg has since descri
    page 3 (2610):  Titus said, It will come with
    page 4 (4412):  Wed love that, Mermelstein 
    page 5 (4035):  half years, and then transferr
    page 6 (4952):  partly to the clout of the new
    page 7 (5049):  Mossberg will often be the fir
    page 8 (4330):  Of the blogs that review produ
    page 9 (31):   Del.icio.us  Reddit Object 
  Using charset utf8 for contents.txt
  Using language en for contents.txt
Added 01178-73-1029-141 (11 versions)
Working on document /local/janssen-uplib/docs/01178-72-8365-803
  Adding header 'apparent-mime-type' I to 01178-72-8365-803
  Adding header 'authors' IT to 01178-72-8365-803
  Adding header 'categories' I (ebooks) to 01178-72-8365-803
  Adding header 'categories' I (article) to 01178-72-8365-803
  Adding header 'date' I (20061104) to 01178-72-8365-803
  Adding header 'sha-hash' I to 01178-72-8365-803
  Adding header 'title' IT (Selling Ebooks on the Web via MHTML) to 01178-72-8365-803
  Created empty doc Document<stored/uncompressed,indexed<id:01178-72-8365-803> stored/uncompressed,indexed<uplibdate:20070509>
stored/uncompressed,indexed<uplibtype:whole>>
  Using charset utf8 for contents.txt
  Using language en for contents.txt
    page 0 (3499):  To: ebook-community@yahoogroup
    page 1 (2904):  the IETF world, to be the resp
  Using charset utf8 for contents.txt
  Using language en for contents.txt
Added 01178-72-8365-803 (3 versions)
Optimizing...
IW 0 [main]: optimize: index now _21:c19686 _22:c92
IW 0 [main]:   flush: segment=_23 docStoreSegment=_23 docStoreOffset=0 flushDocs=true flushDeletes=true
flushDocStores=true numDocs=92 numBufDelTerms=6
IW 0 [main]:   index before flush _21:c19686 _22:c92

closeDocStore: 2 files to flush to segment _23

flush postings as segment _23 numDocs=92
  oldRAMSize=528092 newFlushedSize=263152 docs/MB=366.59 new/old=49.831%
IW 0 [main]: flush 6 buffered deleted terms on 3 segments.
flushed 92 deleted documents
IW 0 [main]: checkpoint: wrote segments file "segments_48"
IFD [main]: now checkpoint "segments_48" [3 segments ; isCommit = true]
IFD [main]: deleteCommits: now remove commit "segments_47"
IFD [main]: delete "segments_47"
IW 0 [main]: checkpoint: wrote segments file "segments_49"
IFD [main]: now checkpoint "segments_49" [3 segments ; isCommit = true]
IFD [main]: deleteCommits: now remove commit "segments_48"
IFD [main]: delete "_23.fnm"
IFD [main]: delete "_23.frq"
IFD [main]: delete "_23.prx"
IFD [main]: delete "_23.tis"
IFD [main]: delete "_23.tii"
IFD [main]: delete "_23.nrm"
IFD [main]: delete "_23.fdx"
IFD [main]: delete "_23.fdt"
IFD [main]: delete "segments_48"
IW 0 [main]: LMP: findMerges: 3 segments
IW 0 [main]: LMP:   level 6.744677 to 7.494677: 1 segments
IW 0 [main]: LMP:   level -1.0 to 5.513348: 2 segments
IW 0 [main]: CMS: now merge
IW 0 [main]: CMS:   index: _21:c19686 _22:c92 _23:c92
IW 0 [main]: CMS:   no more merges pending; now return
IW 0 [main]: add merge to pendingMerges: _21:c19686 _22:c92 _23:c92 [optimize] [total 1 pending]
IW 0 [main]: CMS: now merge
IW 0 [main]: CMS:   index: _21:c19686 _22:c92 _23:c92
IW 0 [main]: CMS:   consider merge _21:c19686 _22:c92 _23:c92 into _24 [optimize]
IW 0 [main]: CMS:     launch new thread [Thread-0]
IW 0 [Thread-0]: CMS:   merge thread: start
IW 0 [main]: CMS:   no more merges pending; now return
IW 0 [Thread-0]: now merge
  merge=_21:c19686 _22:c92 _23:c92 into _24 [optimize]
  index=_21:c19686 _22:c92 _23:c92
IW 0 [Thread-0]: merging _21:c19686 _22:c92 _23:c92 into _24 [optimize]
IW 0 [Thread-0]: merge: total 19686 docs
IW 0 [Thread-0]: hit exception during merge; now refresh deleter on segment _24
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.fdt"
IFD [Thread-0]: delete "_24.fdt"
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.fdx"
IFD [Thread-0]: delete "_24.fdx"
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.fnm"
IFD [Thread-0]: delete "_24.fnm"
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.frq"
IFD [Thread-0]: delete "_24.frq"
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.prx"
IFD [Thread-0]: delete "_24.prx"
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.tii"
IFD [Thread-0]: delete "_24.tii"
IFD [Thread-0]: refresh [prefix=_24]: removing newly created unreferenced file "_24.tis"
IFD [Thread-0]: delete "_24.tis"
IW 0 [Thread-0]: hit exception during merge
Exception in thread "Thread-0" org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException:
Array index out of range: 20672
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:274)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 20672
	at org.apache.lucene.util.BitVector.get(BitVector.java:72)
	at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:118)
	at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:95)
	at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:467)
	at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:430)
	at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:402)
	at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:366)
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:123)
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3002)
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2751)
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
java.io.IOException: background merge hit exception: _21:c19686 _22:c92 _23:c92 into _24 [optimize]
	at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1705)
	at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:1654)
	at com.parc.uplib.indexing.LuceneIndexing.update(LuceneIndexing.java:419)
	at com.parc.uplib.indexing.LuceneIndexing.main(LuceneIndexing.java:664)
Caused by: java.lang.ArrayIndexOutOfBoundsException: Array index out of range: 20672
	at org.apache.lucene.util.BitVector.get(BitVector.java:72)
	at org.apache.lucene.index.SegmentTermDocs.next(SegmentTermDocs.java:118)
	at org.apache.lucene.index.SegmentTermPositions.next(SegmentTermPositions.java:95)
	at org.apache.lucene.index.SegmentMerger.appendPostings(SegmentMerger.java:467)
	at org.apache.lucene.index.SegmentMerger.mergeTermInfo(SegmentMerger.java:430)
	at org.apache.lucene.index.SegmentMerger.mergeTermInfos(SegmentMerger.java:402)
	at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:366)
	at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:123)
	at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3002)
	at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2751)
	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
janssen-home : /u 75 % 
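
One way to read the trace above: BitVector.get() is asked about bit 20672, which is larger than the doc count of any segment in the merge (_21:c19686, _22:c92, _23:c92). The sketch below is a hypothetical, stand-alone bit set and not Lucene's actual BitVector source. The class and method names are made up for illustration, and the idea that the vector holds one "deleted" flag per document is an assumption. It only shows how a range-checked get() produces this kind of exception when a posting refers to a document number beyond the vector's size.

```java
// Hypothetical sketch; NOT Lucene's BitVector. Names are illustrative only.
final class DeletedDocsSketch {

    /** A tiny fixed-size bit set, assumed to hold one flag per document number. */
    static final class SimpleBitVector {
        private final byte[] bits;
        private final int size;

        SimpleBitVector(int size) {
            this.size = size;
            // One byte covers eight document numbers.
            this.bits = new byte[(size >> 3) + 1];
        }

        void set(int bit) {
            checkRange(bit);
            bits[bit >> 3] |= (byte) (1 << (bit & 7));
        }

        boolean get(int bit) {
            checkRange(bit);
            return (bits[bit >> 3] & (1 << (bit & 7))) != 0;
        }

        // Reject any document number outside [0, size).
        private void checkRange(int bit) {
            if (bit < 0 || bit >= size) {
                throw new ArrayIndexOutOfBoundsException(bit);
            }
        }
    }

    public static void main(String[] args) {
        // Sized like the largest segment in the log (19686 docs).
        SimpleBitVector deleted = new SimpleBitVector(19686);
        deleted.set(42);
        System.out.println("doc 42 deleted? " + deleted.get(42));
        System.out.println("doc 43 deleted? " + deleted.get(43));
        try {
            // A document number beyond the segment size, as in the trace.
            deleted.get(20672);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("lookup failed: " + e.getMessage());
        }
    }
}
```

If that reading is right, the interesting question is where a document number of 20672 comes from when the largest segment only reports 19686 documents. The sketch does not answer that; it only illustrates the failure mode.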
