lucene-java-user mailing list archives

From Phil Rosen <pro...@optaros.com>
Subject term vectors
Date Wed, 15 Nov 2006 16:11:32 GMT
Hello,

Thanks in advance for your help; I am really stumped.

I am building an application that requires me to index a set of documents
on the scale of hundreds of thousands.

A document can have a varying number of attribute fields with an unknown
set of potential values. I realize that indexing all attributes as a single
blob would be much faster; however, I need to bin the search results based
on common attributes. Because different types of attributes could have
overlapping values, a single blob for all attributes won't work.

My question is this: is there a way to get term frequencies for a set of
documents or hits without calling getTermFreqVector() on each document and
each attribute field? Since I could have hundreds of results, each with
dozens of attribute fields, looping over getTermFreqVector() would be very
slow. If there isn't something built into Lucene, has anyone seen an
extension that could accomplish this?
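
For concreteness, the per-document loop described above can be sketched
roughly as follows. This is a minimal illustration, not Lucene code: the
TermVectorSource and TermVector types are hypothetical stand-ins for
IndexReader.getTermFreqVector(docId, field) and the parallel term/frequency
arrays it returns, so that the sketch runs without an index. The class and
method names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class TermCountSketch {

    /** Hypothetical stand-in for a stored term vector: parallel arrays of terms and counts. */
    public static final class TermVector {
        public final String[] terms;
        public final int[] freqs;

        public TermVector(String[] terms, int[] freqs) {
            this.terms = terms;
            this.freqs = freqs;
        }
    }

    /** Hypothetical stand-in for IndexReader.getTermFreqVector(docId, field). */
    public interface TermVectorSource {
        /** Assumed to return null when the document has no vector for the field. */
        TermVector get(int docId, String field);
    }

    /**
     * Sums term frequencies per attribute field across the given hits.
     * Result maps field name -> (term -> total frequency over all hits).
     */
    public static Map<String, Map<String, Integer>> countTerms(
            TermVectorSource source, int[] docIds, String[] fields) {
        Map<String, Map<String, Integer>> bins = new HashMap<>();
        for (int docId : docIds) {
            for (String field : fields) {
                // One lookup per (document, field) pair: this is the slow part.
                TermVector tv = source.get(docId, field);
                if (tv == null) {
                    continue; // no vector for this field in this document
                }
                Map<String, Integer> counts =
                        bins.computeIfAbsent(field, f -> new HashMap<>());
                for (int i = 0; i < tv.terms.length; i++) {
                    counts.merge(tv.terms[i], tv.freqs[i], Integer::sum);
                }
            }
        }
        return bins;
    }
}
```

The loop performs one lookup per (document, field) pair, so hundreds of hits
with dozens of attribute fields each means thousands of lookups per search.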

Thanks!

