lucene-java-user mailing list archives

From Soeren Pekrul
Subject Re: Store a document-like map
Date Wed, 06 Dec 2006 17:24:50 GMT
If Lucene offered a boost factor for a single keyword (term) at index
time, I would index each class as a document, with the keys as keywords and
the values as boost factors. Unfortunately, you can only boost documents and
fields at index time. Single terms can only be boosted at search time
(TermQuery.setBoost). So I would store the classes in Lucene, in a
database or just in a file, and index the documents with Lucene. For
classification I would iterate over the classes and, for each class, run a
search over the documents. Each query is a BooleanQuery of “SHOULD”
TermQuery clauses, one per key of the class, each boosted by the key's
value. The score of a matching document is its similarity to the class
(that is, to the query built from the class). I’m not sure whether to use
the normalized score of Hits or the raw score of a HitCollector.
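
As a rough illustration of turning a class map into such a query, here is a minimal sketch that renders the map as a query string in the classic Lucene query parser syntax (term^boost). With the parser's default OR operator, clauses without an operator are optional, which corresponds to “SHOULD” clauses. The class name ClassQueryBuilder, the method toQueryString and the field name "contents" are hypothetical, and the sketch assumes the keys are plain single words that need no escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Lucene.
final class ClassQueryBuilder {

    // Renders each (word, relevance) pair as field:word^relevance,
    // separated by spaces so every clause is optional under the default OR operator.
    static String toQueryString(String field, Map<String, Double> classMap) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Double> e : classMap.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(e.getKey())
              .append('^').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The "Football" class from the original question.
        Map<String, Double> football = new LinkedHashMap<>();
        football.put("ball", 0.8);
        football.put("penalty", 0.9);
        // prints: contents:ball^0.8 contents:penalty^0.9
        System.out.println(toQueryString("contents", football));
    }
}
```

The resulting string could be handed to a query parser, or the same loop could build the BooleanQuery directly with one boosted TermQuery per entry. One such query would be run per class, and the class whose query gives a document the highest score would be taken as the best match.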

Sören wrote:
> Hi,
> I'm building an application that's going to classify some documents. So I have a set
> of documents and a set of classes, and I must classify these docs into these classes. Now, documents
> are stored in a Lucene index as Document objects, but I don't know how I can store my classes
> in Lucene, or how I can compare a Document to a class.
> My class is only a map where the key is the word and the value is the relevance for that
> class. For example: I made a class "Football", which contains this map:
> Key     | Value
> ball    | 0.8
> penalty | 0.9
> Can someone help me? My idea for a solution was to build a Document with the word ball repeated
> 8 times and penalty 9 times, but that isn't a native way to compare a map with a document...
> Thank you ahead of time...