From: Grant Ingersoll
To: java-dev@lucene.apache.org
Subject: Re: [jira] Commented: (LUCENE-848) Add supported for Wikipedia English as a corpus in the benchmarker stuff
Date: Thu, 28 Jun 2007 16:25:48 -0400
Message-Id: <9399251F-A0E4-4940-9081-EAAF62DFBF17@apache.org>
In-Reply-To: <19331758.1183060025554.JavaMail.jira@brutus>
References: <19331758.1183060025554.JavaMail.jira@brutus>

On Jun 28, 2007, at 3:47 PM, Doron Cohen (JIRA) wrote:

>
>     [ https://issues.apache.org/jira/browse/LUCENE-848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12508922 ]
>
> Doron Cohen commented on LUCENE-848:
> ------------------------------------
>
> Steven wrote:
>> I think Mike mentioned not doing the one file per article. I'll
>> try to look at that ...
>
> Perhaps also (re) consider the "compress and add on-the-fly"
> approach, similar to what TrecDocmaker is doing?
>
> Grant wrote:
>> I take back my promise to commit, I am getting (after processing
>> 189500 docs):
>>      [java] Error: cannot execute the algorithm! term out of order
>> ("docid:disrs".compareTo("docname:disregardle
>>
>> &*Ar") <= 0)
>>      [java] org.apache.lucene.index.CorruptIndexException: term out
>> of order ("docid:disrs".compareTo("docname:disregardle
>>
>> &*Ar") <= 0)
>
> Just to verify that it is not a benchmark issue, could you also
> post here the executed algorithm (as printed, or, if not printed,
> the actual file)...?

It is the one in the patch.  I ran "ant enwiki"

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org