From: "Robert Muir (JIRA)"
To: java-dev@lucene.apache.org
Reply-To: java-dev@lucene.apache.org
Date: Sun, 21 Feb 2010 11:23:27 +0000 (UTC)
Subject: [jira] Updated: (LUCENE-2269) don't download/extract 20,000 files when doing the build
Message-ID: <1645938931.419321266751407959.JavaMail.jira@brutus.apache.org>
In-Reply-To: <253637937.351131266463707905.JavaMail.jira@brutus.apache.org>
Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
Content-Type: text/plain; charset=utf-8

[ https://issues.apache.org/jira/browse/LUCENE-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir
updated LUCENE-2269:
--------------------------------

    Attachment: LUCENE-2269.patch

Great idea Mike, I removed all the unzipping code and changed the file to the smaller bz2, which is handled automagically by benchmark.

I also added a note about this test for the future:

{noformat}
NOTE: if the default scoring or StandardAnalyzer is changed, then this test will
not work correctly, as it does not dynamically generate its test trec topics/qrels!
{noformat}

This is nothing new, but in my opinion a future improvement would be to generate these files dynamically. That would also test the QualityQueriesFinder functionality, but we would need to add the 'fake documents', etc. for the test to work, too.

Will commit shortly.

> don't download/extract 20,000 files when doing the build
> --------------------------------------------------------
>
>                 Key: LUCENE-2269
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2269
>             Project: Lucene - Java
>          Issue Type: Test
>          Components: Build
>            Reporter: Robert Muir
>            Assignee: Robert Muir
>            Priority: Trivial
>             Fix For: 3.1
>
>         Attachments: LUCENE-2269.patch, LUCENE-2269.patch, reuters.578.lines.zip
>
>
> When you build lucene, it downloads and extracts some data for contrib/benchmark, especially the 20,000+ files for the reuters corpus.
> This is only needed for one test, and these 20,000 files drive IDEs and such crazy.
> Instead of doing this by default, we should only download/extract data if you specifically ask (like wikipedia, collation do, etc).
> For the qualityrun test, instead use a linedoc-formatted 578-line text file, similar to reuters.first20.lines.txt already used by benchmark.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org