Date: Tue, 13 Dec 2005 17:55:52 -0500
From: Ira Goldstein
Subject: Re: Impact of Term Vectors

We've run into an issue with the term vectors. When indexing a small
corpus (~3k docs, 1.3G) everything works fine, as it does with a small
number of documents from TREC-6 (so we believe that our indexing code is
ok). However, when we tried to index the full TREC-6 corpus (~300,000
docs, 2G) the term vectors all seem to come back as null. When the
indexing process is going on, we can see the .frq file being built so it
appears as if the index routine is doing its thing.

Has anyone experienced anything similar?

Thanks in advance for any insight you can offer

--Ira Goldstein

------ Original Message ------
Received: Tue, 13 Dec 2005 12:41:07 PM EST
From: "Dan Climan"
Subject: Impact of Term Vectors (was ApacheCon next week)

> Good question. I was wondering about the impact of adding term vectors with
> the various options. For example, is adding term vectors with both positions
> and offsets a significant impact? Which current parts of lucene (including
> contributions) take advantage of term vectors being present? I know that
> Highlighter class can make use of them if present.
>
> Dan
>
> -----Original Message-----
> From: Jeff Rodenburg [mailto:jeff.rodenburg@gmail.com]
> Sent: Monday, December 12, 2005 9:08 PM
> To: java-user@lucene.apache.org
> Subject: Re: ApacheCon next week
>
> Well done, Grant. Very informative.
>
> Question on Term Vectors: with their inclusion in an index, have you noticed
> any degradation in performance, either from a search efficiency or
> maintenance point-of-view? Given the power of term vectors, if the perf
> impact is negligible, I'm curious as to the reasons why one would NOT include
> term vectors in any and every index...
>
> cheers,
> j
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
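
On Dan's question about what positions and offsets add to a term vector, a rough way to picture it is to count how many integers each option would have to keep per document. The sketch below is only a conceptual illustration in plain Java. It is not Lucene code and does not reflect Lucene's on-disk encoding; the class name TermVectorSketch, the methods build and storedInts, and the lowercase-and-split-on-whitespace "analyzer" are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Conceptual sketch only: NOT Lucene code and NOT Lucene's storage format.
// All names here (TermVectorSketch, Entry, build, storedInts) are hypothetical.
class TermVectorSketch {

    /** What one term contributes to a document's term vector. */
    static final class Entry {
        int freq;                                            // occurrences in this document
        final List<Integer> positions = new ArrayList<>();   // token positions
        final List<int[]> offsets = new ArrayList<>();       // {startChar, endChar} pairs
    }

    /** Builds a term vector for one document; tokens are lowercased runs of non-whitespace. */
    static Map<String, Entry> build(String text) {
        Map<String, Entry> vector = new LinkedHashMap<>();
        int position = 0;
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip whitespace between tokens.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            // Consume one token.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String term = text.substring(start, i).toLowerCase(Locale.ROOT);
            Entry entry = vector.computeIfAbsent(term, k -> new Entry());
            entry.freq++;
            entry.positions.add(position++);
            entry.offsets.add(new int[] {start, i});
        }
        return vector;
    }

    /**
     * Counts integers kept for the document: one frequency per unique term,
     * plus one per occurrence for positions, plus two per occurrence for offsets.
     */
    static int storedInts(Map<String, Entry> vector, boolean withPositions, boolean withOffsets) {
        int total = 0;
        for (Entry entry : vector.values()) {
            total += 1;
            if (withPositions) {
                total += entry.positions.size();
            }
            if (withOffsets) {
                total += 2 * entry.offsets.size();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Entry> v = build("term vectors store term statistics");
        System.out.println("frequencies only:        " + storedInts(v, false, false)); // 4
        System.out.println("with positions:          " + storedInts(v, true, false));  // 9
        System.out.println("with positions, offsets: " + storedInts(v, true, true));   // 19
    }
}
```

In this toy model the frequency-only vector grows with the number of unique terms, while positions and offsets grow with the number of token occurrences, so the extra cost is largest for long documents. How that translates into real index size or indexing time in Lucene would have to be measured on an actual index.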