lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@syr.edu>
Subject Re: Impact of Term Vectors
Date Wed, 14 Dec 2005 15:50:40 GMT
I have successfully used term vectors for both the TREC English and 
Arabic collections.  Take a look at the code I posted at 
http://www.cnlp.org/apachecon2005 or Erik and Otis' book, "Lucene In 
Action", which both have good examples of term vectors.  Perhaps there 
is something with how you are creating your fields.



Ira Goldstein wrote:

>We've run into an issue with the term vectors. When indexing a small corpus
>(~3k docs, 1.3G) everything works fine, as it does with a small number of
>documents from TREC-6 (so we believe that our indexing code is ok).  However,
>when we tried to index the full TREC-6 corpus (~300,000 docs, 2G) the term
>vectors all seem to come back as null.  When the indexing process is going on,
>we can see the .frq file being built so it appears as if the index routine is
>doing its thing.
>
>Has anyone experienced anything similar?
>
>Thanks in advance for any insight you can offer
>--Ira Goldstein
>  
>
>------ Original Message ------
>Received: Tue, 13 Dec 2005 12:41:07 PM EST
>From: "Dan Climan" <dcliman@keepmedia.com>
>To: <java-user@lucene.apache.org>
>Subject: Impact of Term Vectors (was ApacheCon next week)
>
>  
>
>>Good question. I was wondering about the impact of adding term vectors with
>>the various options. For example, is adding term vectors with both
>>    
>>
>positions
>  
>
>>and offsets a significant impact? Which current parts of lucene (including
>>contributions) take advantage of term vectors being present? I know that
>>Highlighter class can make use of them if present.
>>
>>Dan
>>
>>-----Original Message-----
>>From: Jeff Rodenburg [mailto:jeff.rodenburg@gmail.com] 
>>Sent: Monday, December 12, 2005 9:08 PM
>>To: java-user@lucene.apache.org
>>Subject: Re: ApacheCon next week
>>
>>Well done, Grant.  Very informative.
>>
>>Question on Term Vectors: with their inclusion in an index, have you
>>    
>>
>noticed
>  
>
>>any degradation in performance, either from a search effiiciency or
>>maintenance point-of-view?  Given the power of term vectors, if the perf
>>impact is negligible, I'm curious to the reasons why one would NOT include
>>term vectors in any and every index...
>>
>>cheers,
>>j
>>
>>
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>>    
>>
>
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>For additional commands, e-mail: java-user-help@lucene.apache.org
>
>  
>

-- 
------------------------------------------------------------------- 
Grant Ingersoll 
Sr. Software Engineer 
Center for Natural Language Processing 
Syracuse University 
School of Information Studies 
337 Hinds Hall 
Syracuse, NY 13244 

http://www.cnlp.org 
Voice:  315-443-5484 
Fax: 315-443-6886 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message