lucene-java-user mailing list archives

From "Michael D. Curtin" <>
Subject Re: term vectors
Date Wed, 15 Nov 2006 16:34:48 GMT
Phil Rosen wrote:

> I am building an application that requires me to index a set of documents
> on the scale of hundreds of thousands.
> A document can have a varying number of attribute fields with an unknown
> set of potential values. I realize that just indexing a blob of fields
> would be much faster; however, I need to bin the search results based on
> common attributes. Because different types of attributes could potentially
> have overlapping values, a single blob for all attributes won't work.
> My question is this: is there a way to get term frequencies for a set of
> documents or hits without calling getTermFreqVector() on each document and
> each attribute field? As I could have hundreds of results, each with
> dozens of attribute fields, looping over getTermFreqVector() would be very
> slow. If there isn't something inherent to Lucene, has anyone seen an
> extension that could accomplish this?

Could you give an example of what you're starting with, what a search looks
like, and what you want out? It sounds almost like you're looking for a
custom statistical analysis of hits, which I doubt Lucene provides out of
the box ...
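
To make the question concrete, here is one possible reading of "binning by common attributes", sketched in plain Java with no Lucene calls. It is only an illustration of the desired output, not a Lucene feature or a recommended implementation. The class name FacetCounter, the method countByField, and the representation of a hit as a map from field name to values are all hypothetical. Counts are kept per field, so a value such as "red" appearing under two different attributes is not merged into a single bin.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: none of these names come from Lucene.
final class FacetCounter {

    // Each hit is assumed to be a map from attribute field name to the
    // values that field holds for that document.
    // Returns: field name -> (value -> number of occurrences across all hits).
    static Map<String, Map<String, Integer>> countByField(
            List<Map<String, List<String>>> hits) {
        Map<String, Map<String, Integer>> counts = new TreeMap<>();
        for (Map<String, List<String>> hit : hits) {
            for (Map.Entry<String, List<String>> field : hit.entrySet()) {
                // One bin per field keeps overlapping values apart.
                Map<String, Integer> bin =
                        counts.computeIfAbsent(field.getKey(), k -> new TreeMap<>());
                for (String value : field.getValue()) {
                    bin.merge(value, 1, Integer::sum);
                }
            }
        }
        return counts;
    }
}
```

If this is roughly the output you want, the open question is how to obtain the per-field values for each hit cheaply, which is the part your original message is asking about.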
