From: Michael McCandless
To: java-user@lucene.apache.org
Subject: Re: Lucene Index Structure
Date: Thu, 21 Aug 2008 19:51:04 -0400

Also, the inverted index *will* store positional information (in the
*.prx files) even if term vectors are not stored.

Mike

Yonik Seeley wrote:

> On Thu, Aug 21, 2008 at 7:20 PM, David Lee wrote:
>> Clarification question:
>>
>> If I don't store term vectors, then I:
>> -- won't have information on the position of matching terms
>> -- I don't have the term frequency vector
>>
>> -- but I should still have the frequency of terms per document in
>> the .frq file, right?
>>
>> So what's the difference between the term frequency vector and the
>> information saved in the .frq file?
>
> It's how the data can be efficiently accessed... by term or by
> document.
> Lucene is naturally an inverted index, and thus makes it easy to ask
> "what documents contain this term".
> Term vectors store the term information indexed by document and make
> it easy to ask "what terms does this specific document have".
>
> -Yonik
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
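
The by-term versus by-document distinction discussed in this thread can be sketched with two plain Java maps. This is a toy model under stated assumptions, not Lucene's API or its on-disk format: the class name ToyIndex, its methods, and the whitespace tokenizer are all hypothetical. The postings map stands in for the inverted index (per-document frequencies as in .frq, positions as in .prx), and the second map stands in for optional term vectors.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: these maps stand in for Lucene's index structures and are
// not Lucene's API or file format. All names here are hypothetical.
class ToyIndex {
    // Inverted index: term -> (docId -> positions of the term in that doc).
    // The length of a position list plays the role of the per-document
    // frequency (.frq); the positions play the role of the .prx data.
    private final Map<String, Map<Integer, List<Integer>>> postings = new TreeMap<>();

    // Optional term vectors: docId -> (term -> frequency in that doc).
    private final Map<Integer, Map<String, Integer>> termVectors = new TreeMap<>();

    void addDocument(int docId, String text, boolean storeTermVector) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            String term = tokens[pos];
            if (term.isEmpty()) {
                continue; // leading whitespace yields one empty token
            }
            // Positions are always recorded, whether or not a term vector is kept.
            postings.computeIfAbsent(term, t -> new TreeMap<>())
                    .computeIfAbsent(docId, d -> new ArrayList<>())
                    .add(pos);
            if (storeTermVector) {
                termVectors.computeIfAbsent(docId, d -> new TreeMap<>())
                           .merge(term, 1, Integer::sum);
            }
        }
    }

    // "What documents contain this term?" A single lookup by term.
    Map<Integer, List<Integer>> docsContaining(String term) {
        return postings.getOrDefault(term, new TreeMap<>());
    }

    // "What terms does this specific document have?" A single lookup by
    // document; returns null when no term vector was stored for it.
    Map<String, Integer> termsOf(int docId) {
        return termVectors.get(docId);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument(0, "lucene is an inverted index", true);
        index.addDocument(1, "an index maps terms to documents", false);
        System.out.println(index.docsContaining("index"));
        System.out.println(index.termsOf(0));
        System.out.println(index.termsOf(1));
    }
}
```

In this sketch, document 1 was added without a term vector, yet its positions and per-document frequency are still reachable through the postings map, which mirrors the point that positional data does not depend on term vectors. Answering "what terms does document 1 have" without the second map would mean scanning every term's postings, which is the access-pattern cost that term vectors avoid.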