Subject: RE: IndexWriter.addIndexes(Directory[] dirs)
Date: Mon, 12 Dec 2005 14:58:10 -0800
From: "Kevin Oliver"
To: java-dev@lucene.apache.org

I see it stripped my attachment off. Here's the code:

import junit.framework.TestCase;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.*;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class AddIndexesTest extends TestCase {

  public AddIndexesTest(String name) {
    super(name);
  }

  public void testAddIndexes() throws Exception {
    {
      // Build the first index with a single document and check it is searchable.
      Directory dir1 = FSDirectory.getDirectory("/dev/searchdata/addIndexesTest1", true);
      IndexWriter writer1 = new IndexWriter(dir1, new StandardAnalyzer(), true);

      Document doc1 = new Document();
      doc1.add(Field.UnIndexed("ID", "id1"));
      doc1.add(Field.UnStored("f", "some words"));
      writer1.addDocument(doc1);

      writer1.close();
      dir1.close();

      IndexSearcher searcher = new IndexSearcher("/dev/searchdata/addIndexesTest1");
      Hits hits = searcher.search(new TermQuery(new Term("f", "words")));
      assertEquals(1, hits.length());
      searcher.close();
    }

    {
      // Build the second index the same way.
      Directory dir2 = FSDirectory.getDirectory("/dev/searchdata/addIndexesTest2", true);
      IndexWriter writer2 = new IndexWriter(dir2, new StandardAnalyzer(), true);

      Document doc1 = new Document();
      doc1.add(Field.UnIndexed("ID", "id2"));
      doc1.add(Field.UnStored("f", "some other words"));
      writer2.addDocument(doc1);

      writer2.close();
      dir2.close();

      IndexSearcher searcher = new IndexSearcher("/dev/searchdata/addIndexesTest2");
      Hits hits = searcher.search(new TermQuery(new Term("f", "words")));
      assertEquals(1, hits.length());
      searcher.close();
    }

    // Reopen the first index (create == false) and add the second index to it.
    Directory dir = FSDirectory.getDirectory("/dev/searchdata/addIndexesTest1", false);
    IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), false);
    writer.addIndexes(new Directory[] { FSDirectory.getDirectory("/dev/searchdata/addIndexesTest2", false) });
    writer.close();
    dir.close();

    // Both documents should now match; this is the assertion that fails with the patched IndexWriter.
    IndexSearcher searcher = new IndexSearcher("/dev/searchdata/addIndexesTest1");
    Hits hits = searcher.search(new TermQuery(new Term("f", "words")));
    assertEquals(2, hits.length());
    searcher.close();
  }

}


-----Original Message-----
From: Kevin Oliver
Sent: Monday, December 12, 2005 2:53 PM
To: java-dev@lucene.apache.org
Subject: RE: IndexWriter.addIndexes(Directory[] dirs)

Volodymyr,

I tried this patch out, and unfortunately it doesn't appear to work for me. Is there something I missed?

I'll try attaching my JUnit test case, which passes when the code is unpatched but fails on the final assertion expecting 2 hits (on line 63) when I use the patched IndexWriter.java.

Thanks,
Kevin


-----Original Message-----
From: Volodymyr Bychkoviak [mailto:vbychkoviak@i-hypergrid.com]
Sent: Monday, December 12, 2005 5:51 AM
To: java-dev@lucene.apache.org
Subject: IndexWriter.addIndexes(Directory[] dirs)

The addIndexes(Directory[] dirs) method of IndexWriter optimizes the index
before and after the operation.

Some notes about this:
1) Adding sub-indexes to a large index can take a long time because of the
double optimization.
2) This breaks the IndexWriter.maxMergeDocs logic, because optimize will
merge the data into a single-segment index.

I suggest adding a new method with a boolean parameter to optionally specify
whether the index should be optimized.

There is a similar method in IndexWriter, addIndexes(IndexReader[] readers),
that takes an array of IndexReaders, but I don't know how it can be modified
to provide the same optional functionality.

The patch is attached here to discuss it first (should I post it directly to
JIRA?).

--
regards,
Volodymyr Bychkoviak
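
The proposal in the original message (an addIndexes variant with a boolean flag that controls optimization) can be pictured with a toy model. The sketch below is not Lucene code and is not the attached patch: the class AddIndexesSketch, its methods, and the idea of representing an index as a list of per-segment document counts are all assumptions made for illustration. It only shows why optimizing before and after the add always collapses the index into one segment, while skipping the optimize leaves the existing segments alone.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (hypothetical, not Lucene): an index is a list of per-segment doc counts. */
class AddIndexesSketch {
    private final List<Integer> segments = new ArrayList<>();

    AddIndexesSketch(int... docCounts) {
        for (int n : docCounts) {
            segments.add(n);
        }
    }

    /** Merges all segments into one, which is how the thread describes optimize. */
    void optimize() {
        if (segments.size() <= 1) {
            return;
        }
        int total = docCount();
        segments.clear();
        segments.add(total);
    }

    /** Models the behaviour described in the thread: optimize before and after. */
    void addIndexes(AddIndexesSketch[] others) {
        addIndexes(others, true);
    }

    /** Hypothetical overload: the flag decides whether the two optimize passes run. */
    void addIndexes(AddIndexesSketch[] others, boolean optimize) {
        if (optimize) {
            optimize();
        }
        for (AddIndexesSketch other : others) {
            segments.addAll(other.segments); // append the other index's segments as-is
        }
        if (optimize) {
            optimize();
        }
    }

    int segmentCount() {
        return segments.size();
    }

    int docCount() {
        int total = 0;
        for (int n : segments) {
            total += n;
        }
        return total;
    }

    public static void main(String[] args) {
        AddIndexesSketch unoptimized = new AddIndexesSketch(100, 10);
        unoptimized.addIndexes(new AddIndexesSketch[] { new AddIndexesSketch(5) }, false);
        // prints "3 segments, 115 docs"
        System.out.println(unoptimized.segmentCount() + " segments, " + unoptimized.docCount() + " docs");

        AddIndexesSketch optimized = new AddIndexesSketch(100, 10);
        optimized.addIndexes(new AddIndexesSketch[] { new AddIndexesSketch(5) });
        // prints "1 segments, 115 docs"
        System.out.println(optimized.segmentCount() + " segments, " + optimized.docCount() + " docs");
    }
}
```

In this model the document count is the same either way; only the segment layout differs, which is the point of notes 1) and 2) in the original message. Whether a real patch can skip the optimize passes safely is exactly what the test case above is probing.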