lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@syr.edu>
Subject Re: Fwd: Re: Term Vectors
Date Fri, 11 Nov 2005 14:43:06 GMT
If you are storing the term vector when you index, then you can ask the
IndexReader for the vector using the getTermFreqVector() method, which
returns a TermFreqVector that should have the information you need.
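
As a rough sketch of the two statistics asked about below (number of unique
terms and average tf per document), the following assumes the terms and their
frequencies are already available as two parallel arrays, which is the shape
TermFreqVector exposes through getTerms() and getTermFrequencies(). The class
name TermStats, its method names, the docId variable, and the "contents" field
name in the comment are illustrative assumptions, not part of Lucene. The field
must have been indexed with term vectors stored; otherwise there is no vector
to read.

```java
// Hypothetical helper (not part of Lucene): statistics over the parallel
// arrays of a per-document term vector.
//
// With Lucene on the classpath, the arrays would be obtained roughly as:
//   TermFreqVector tfv = reader.getTermFreqVector(docId, "contents"); // field name assumed
//   String[] terms = tfv.getTerms();
//   int[] freqs = tfv.getTermFrequencies();
// The vector may be null if the field was indexed without term vectors.
final class TermStats {
    private TermStats() {}

    /** Number of distinct terms: the term array holds one entry per unique term. */
    static int uniqueTerms(String[] terms) {
        return terms.length;
    }

    /** Total number of term occurrences in the document field. */
    static long totalOccurrences(int[] freqs) {
        long total = 0;
        for (int f : freqs) {
            total += f;
        }
        return total;
    }

    /** Average term frequency: total occurrences divided by unique terms. */
    static double averageTf(String[] terms, int[] freqs) {
        if (terms.length != freqs.length) {
            throw new IllegalArgumentException("terms and freqs must be parallel arrays");
        }
        if (terms.length == 0) {
            return 0.0; // empty vector: avoid dividing by zero
        }
        return (double) totalOccurrences(freqs) / terms.length;
    }

    public static void main(String[] args) {
        // Stand-in data in place of a real term vector.
        String[] terms = {"lucene", "term", "vector"};
        int[] freqs = {3, 1, 2};
        System.out.println("unique terms: " + uniqueTerms(terms));
        System.out.println("average tf:   " + averageTf(terms, freqs));
    }
}
```

Keeping the arithmetic separate from the index access means it can be checked
without an index on hand; the per-document loop over the IndexReader is left
out here.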

marigoldcc@yahoo.com wrote:

>I hope that this isn't a newbie's question, but let me
>ask the more general question.  While IndexReader can
>return the documents containing the term t, I need to
>do the opposite.  Is there a method, given document d,
>that will return all of the terms in that document (I
>need to calculate the average tf and the number of
>unique terms in each document)? 
>
>After indexing a set of plain text files using
>org.apache.lucene.demo.IndexFiles, I looked at
>Document.fields, but all that it returned was:
>
>Text<path:C:\text\02\7laft10.txt>
>Keyword<modified:0efvmrdgi>
>
>Any insight would be appreciated.
>
>Thanks
>-- MG
>
>Chris Hostetter <hossman_lucene@fucit.org> wrote:
>
>  
>
>>------ Original Message ------
>>Received: Fri, 28 Oct 2005 08:22:04 PM EDT
>>From: Chris Hostetter <hossman_lucene@fucit.org>
>>To:  java-user@lucene.apache.org
>>Subject: Re: Term Vectors
>>
>>:  "Now, you can get these term vectors per documents with the Lucene
>>: API if the index was built with the term vectors option."
>>:
>>:   How does one invoke the term vectors option when building the
>>: index and where can one find a list of the various options (I really
>>: did try looking at the docs, but could not find any reference to
>>: this).
>>
>>there are very few generic options that apply when "building the
>>index" .. most options are specific to the individual documents as
>>you add them -- you can choose to store the TermVectors for the "FOO"
>>field of one document, but leave them out of another.
>>
>>Options like whether or not a Field is indexed, stored, tokenized, or
>>has its TermVector stored are all controlled when you construct the
>>Field object prior to adding it to the document...
>>
>>
>>http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.html
>>http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Document.html
>>
>>-Hoss
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org

-- 
------------------------------------------------------------------- 
Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
337 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 



