Message-ID: <2429455.1192565690751.JavaMail.jira@brutus>
Date: Tue, 16 Oct 2007 13:14:50 -0700 (PDT)
From: "Michael McCandless (JIRA)"
To: java-dev@lucene.apache.org
Subject: [jira] Updated: (LUCENE-1012) Problems with maxMergeDocs parameter
In-Reply-To: <4611634.1191218990862.JavaMail.jira@brutus>

     [ https://issues.apache.org/jira/browse/LUCENE-1012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-1012:
---------------------------------------

    Assignee: Michael McCandless

> Problems with maxMergeDocs parameter
> ------------------------------------
>
>                 Key: LUCENE-1012
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1012
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>            Reporter: Michael Busch
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.3
>
>
> I found two possible problems regarding IndexWriter's maxMergeDocs value. I'm using the following code to test maxMergeDocs:
> {code:java}
>   public void testMaxMergeDocs() throws IOException {
>     final int maxMergeDocs = 50;
>     final int numSegments = 40;
>
>     MockRAMDirectory dir = new MockRAMDirectory();
>     IndexWriter writer = new IndexWriter(dir, new WhitespaceAnalyzer(), true);
>     writer.setMergePolicy(new LogDocMergePolicy());
>     writer.setMaxMergeDocs(maxMergeDocs);
>     Document doc = new Document();
>     doc.add(new Field("field", "aaa", Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.WITH_POSITIONS_OFFSETS));
>     for (int i = 0; i < numSegments * maxMergeDocs; i++) {
>       writer.addDocument(doc);
>       //writer.flush();  // uncomment to avoid the DocumentsWriter bug
>     }
>     writer.close();
>
>     new SegmentInfos.FindSegmentsFile(dir) {
>       protected Object doBody(String segmentFileName) throws CorruptIndexException, IOException {
>         SegmentInfos infos = new SegmentInfos();
>         infos.read(directory, segmentFileName);
>         for (int i = 0; i < infos.size(); i++) {
>           assertTrue(infos.info(i).docCount <= maxMergeDocs);
>         }
>         return null;
>       }
>     }.run();
>   }
> {code}
>
> - It seems that DocumentsWriter does not obey the maxMergeDocs parameter. If I don't flush manually, then the index only contains one segment at the end and the test fails.
> - If I flush manually after each addDocument() call, then the index contains more segments. But still, there are segments that contain more docs than maxMergeDocs, e.g. 55 vs. 50. The javadoc in IndexWriter says:
> {code:java}
>   /**
>    * Returns the largest number of documents allowed in a
>    * single segment.
>    *
>    * @see #setMaxMergeDocs
>    */
>   public int getMaxMergeDocs() {
>     return getLogDocMergePolicy().getMaxMergeDocs();
>   }
> {code}
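For illustration only, the invariant that the test asserts (no segment holds more than maxMergeDocs documents) can be sketched with a toy merge selector. This is a minimal sketch under stated assumptions: the class MergeCapSketch and the method mergeWithCap are hypothetical names, the code does not use any Lucene API, and it is not how LogDocMergePolicy or DocumentsWriter actually work. It only shows one way a cap on the merged result could be enforced, by checking the combined document count before each merge.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; not part of Lucene.
public class MergeCapSketch {

    /**
     * Greedily merges adjacent segments from left to right, closing the
     * current merged segment whenever adding the next one would push its
     * document count above maxMergeDocs. A single input segment that is
     * already above the cap is passed through unchanged.
     */
    static List<Integer> mergeWithCap(List<Integer> docCounts, int maxMergeDocs) {
        List<Integer> merged = new ArrayList<>();
        int current = 0;       // doc count of the segment being built
        boolean open = false;  // true while a segment is being built
        for (int count : docCounts) {
            if (open && current + count > maxMergeDocs) {
                merged.add(current);  // close the segment before it exceeds the cap
                current = 0;
                open = false;
            }
            current += count;
            open = true;
        }
        if (open) {
            merged.add(current);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Eleven flushed segments of 5 docs each: 55 docs in total.
        List<Integer> flushed = new ArrayList<>();
        for (int i = 0; i < 11; i++) {
            flushed.add(5);
        }
        System.out.println(mergeWithCap(flushed, 50)); // prints [50, 5]
    }
}
```

With eleven 5-document segments and a cap of 50, the sketch yields one 50-document segment and one 5-document segment instead of a single 55-document segment. Whether maxMergeDocs is meant to bound the merged result this way, or only the size of the segments eligible for merging, is exactly what the javadoc quoted above leaves open.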