Date: Tue, 20 Nov 2012 21:10:59 +0000 (UTC)
From: "Simon Willnauer (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Message-ID: <1323334343.8948.1353445859306.JavaMail.jiratomcat@arcas>
In-Reply-To: <829569411.80739.1352299992530.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (LUCENE-4547) DocValues field broken on large indexes

    [ https://issues.apache.org/jira/browse/LUCENE-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501466#comment-13501466 ]

Simon Willnauer commented on LUCENE-4547:
-----------------------------------------

bq. This is sounding too complicated.

A single boolean is too complicated? All I ask for is a way to prevent loading into RAM if not necessary. We had this in 4.0 and I think we should make this work in 4.1 too. Remember, this is a different use case than postings. I really don't think I ask for much here.

> DocValues field broken on large indexes
> ---------------------------------------
>
>                 Key: LUCENE-4547
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4547
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Blocker
>             Fix For: 4.1
>
>         Attachments: test.patch
>
>
> I tried to write a test to sanity check LUCENE-4536 (first running against svn revision 1406416, before the change).
> But I found docvalues is already broken here for large indexes that have a PackedLongDocValues field:
> {code}
> final int numDocs = 500000000;
> for (int i = 0; i < numDocs; ++i) {
>   if (i == 0) {
>     field.setLongValue(0L); // force > 32bit deltas
>   } else {
>     field.setLongValue(1L << 33);
>   }
>   w.addDocument(doc);
> }
> w.forceMerge(1);
> w.close();
> dir.close(); // checkindex
> {code}
> {noformat}
> [junit4:junit4]   2> WARNING: Uncaught exception in thread: Thread[Lucene Merge Thread #0,6,TGRP-Test2GBDocValues]
> [junit4:junit4]   2> org.apache.lucene.index.MergePolicy$MergeException: java.lang.ArrayIndexOutOfBoundsException: -65536
> [junit4:junit4]   2> 	at __randomizedtesting.SeedInfo.seed([5DC54DB14FA5979]:0)
> [junit4:junit4]   2> 	at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:535)
> [junit4:junit4]   2> 	at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:508)
> [junit4:junit4]   2> Caused by: java.lang.ArrayIndexOutOfBoundsException: -65536
> [junit4:junit4]   2> 	at org.apache.lucene.util.ByteBlockPool.deref(ByteBlockPool.java:305)
> [junit4:junit4]   2> 	at org.apache.lucene.codecs.lucene40.values.FixedStraightBytesImpl$FixedBytesWriterBase.set(FixedStraightBytesImpl.java:115)
> [junit4:junit4]   2> 	at org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.writePackedInts(PackedIntValues.java:109)
> [junit4:junit4]   2> 	at org.apache.lucene.codecs.lucene40.values.PackedIntValues$PackedIntsWriter.finish(PackedIntValues.java:80)
> [junit4:junit4]   2> 	at org.apache.lucene.codecs.DocValuesConsumer.merge(DocValuesConsumer.java:130)
> [junit4:junit4]   2> 	at org.apache.lucene.codecs.PerDocConsumer.merge(PerDocConsumer.java:65)
> {noformat}
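
The index in the exception, -65536, is exactly what a byte offset of 2^31 produces when it is held in an int and shifted right by 15 bits. The sketch below only illustrates that arithmetic; it is not the Lucene code path. It assumes fixed 8-byte values and a block shift of 15 (taken here to match ByteBlockPool.BYTE_BLOCK_SHIFT), and the class and method names are hypothetical.

```java
public class BlockIndexOverflowSketch {
    // Assumption: mirrors a 32 KB block size (shift of 15 bits).
    static final int BLOCK_SHIFT = 15;
    // Assumption: every document stores one fixed 8-byte value.
    static final int BYTES_PER_VALUE = 8;

    // Computes the block index with an int offset, which wraps silently
    // once docId * BYTES_PER_VALUE exceeds Integer.MAX_VALUE.
    public static int blockIndexWithIntOffset(int docId) {
        int offset = docId * BYTES_PER_VALUE;
        return offset >> BLOCK_SHIFT;
    }

    // Same computation with the offset widened to long before multiplying.
    public static long blockIndexWithLongOffset(int docId) {
        long offset = (long) docId * BYTES_PER_VALUE;
        return offset >> BLOCK_SHIFT;
    }

    public static void main(String[] args) {
        // First document whose offset reaches 2^31; well below the
        // 500,000,000 documents used in the reported test.
        int docId = 1 << 28;
        System.out.println(blockIndexWithIntOffset(docId));
        System.out.println(blockIndexWithLongOffset(docId));
    }
}
```

Under these assumptions the int version wraps to Integer.MIN_VALUE at document 268,435,456 and yields a negative block index, while the long version yields a valid positive one. Whether the actual writer stores 8 bytes per value at this point is not confirmed by the stack trace.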