lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@syr.edu>
Subject Re: TermFreqVector
Date Fri, 18 Nov 2005 15:28:00 GMT
Hi Anna,

The sample I sent is from a modified version of the demo (line 87 in
HTMLDocument of the latest code) that I am preparing for my ApacheCon
talk (which will cover Term Vectors, amongst other things).

At any rate, if you look at the Field constructor for 1.4.3:
|*Field
<http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.html#Field%28java.lang.String,%20java.lang.String,%20boolean,%20boolean,%20boolean,%20boolean%29>*(String


<http://java.sun.com/j2se/1.4/docs/api/java/lang/String.html> name,
String
<http://java.sun.com/j2se/1.4/docs/api/java/lang/String.html> string,
boolean store, boolean index, boolean token, boolean storeTermVector)|
(see the Javadocs:
http://lucene.apache.org/java/docs/api/org/apache/lucene/document/Field.html)

or 1.9:

public Field(String name, Reader reader, TermVector termVector)

you can see the last parameter is what controls the addition of
TermVectors.  So depending on your code base, change the code in the
demo to add in the new flag (for 1.4.3) or TermVector.YES (for 1.9). You
will have to reindex upon making this change.


Anna Buczak wrote:

>>You have to tell lucene to store term freq
>>vectors (it isn't done by default).
>>    
>>
>This is exactly the part that I do not know how to do.  Where to set the
>flag ?
>I use for indexing org.apache/lucene.demo.IndexFiles.
>
>  
>
>>Do you have at least
>>one field?
>>    
>>
>Now I know that Lucene adds three fields by default and one of them is
>"contents" - this is the one I am interested in, and at the present time I
>can retieve from that field what I want (i.e. this part works).
>
>Anna
>
>
>Chris Lamprecht wrote:
>
>  
>
>>Can you post the code you're using to create the Document and adding
>>it to the IndexWriter?   You have to tell lucene to store term freq
>>vectors (it isn't done by default).  Also I'm not sure what you mean
>>when you say your documents do not have fields.  Do you have at least
>>one field?
>>
>>-chris
>>
>>On 11/17/05, Anna Buczak <abuczak@sarnoff.com> wrote:
>>    
>>
>>>I have indexed a set of documents that do not have fields.  I want to
>>>use the getTermFreqVector method from IndexReader to get the
>>>frequencies.  However when I do that as:
>>>
>>>TermFreqVector[] z = ir.getTermFreqVectors(0);
>>>
>>>z is null.  So I can't get the frequency vectors.
>>>
>>>Help will be very much appreciated.
>>>
>>>Anna
>>>
>>>
>>>
>>>---------------------------------------------------------------------
>>>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>      
>>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>For additional commands, e-mail: java-user-help@lucene.apache.org
>>    
>>
>
>--
>Dr. Anna L. Buczak                          Email: abuczak@sarnoff.com
>Technology Leader                           Phone: 609-734-2667
>Sarnoff Corporation                         Fax:   609-734-2662
>201 Washington Rd.
>Princeton, NJ 08543-5300
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
>  
>

-- 
-------------------------------------------------------------------
Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
337 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org
Voice:  315-443-5484
Fax: 315-443-6886



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message