lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vaijanath N. Rao" <>
Subject Re: Lucene Indexing structure
Date Sun, 04 May 2008 13:35:49 GMT
Hi Chris,

Sorry for the cross-posting and also for not making clear the problem. 
Let me try to explain the problem at my hand.

I am tying to write a CBIR (Content Based Image Reterival)  frame work 
using lucene. As each document have entities such as title, description, 
author and so on. I am decomposing each image and extracting features 
like color histogram, texture and other important attributes from every 
image and indexing it in lucene such a way that each of this attribute 
is a field. I convert the float values as string for every feature that 
I have extracted from the image.

While searching for similar image I extract the same set of features for 
the query Image and than query lucene to get all those images which have 
atleast one of the features, than I do the re-ranking according to the 
difference of the features. Once the re-ranking is done I submit the 

Here is where I need help, I need to know an optimal way to store the 
values, so that searching take less time and I don't have to re-ranking. 
Is there any way I can compare array of values rather than one value.  
What I essentially need is to get the query of type, give me all those 
features which are less than K distance from the current feature.

--Thanks and Regagrds

Chris Hostetter wrote:
> : Hi Lucene-user and Lucene-dev,
> Please do not cross post -- java-user is the suitable place for your 
> question.
> : Obviously there is something wrong with the above approach (as to get the
> : correct document we need to get all the documents and than do the required
> : distance calculation), but that' due to lack of my knowledge of Luce and
> : lucene's Index storage.
> : 
> : What I want to know how to improve upon the exsisting architecture other than
> : making number of fields in the lucene equalling to total number of
> : feature*size of each feature.
> I suspect one of the reasons you haven't gotten much of a response yet is 
> that people may not understand your problem statement -- I know nothing of 
> Image Processing and even after googling "Color Histogram" I don't really 
> understand how the examples you gave represent Color Histograms, or what 
> it would mean to search on it with your example input.
> Perhaps you could describe in more detail what exactly some sample 
> data looks like, why certian objects should match certain queries, (and 
> just as importantly: why other objects shouldn't match, and give examples 
> of one one object is a "better" match then another object for each example 
> query.
> don't worry about Lucene Document/Field/QueryParse specifics -- just 
> explain the concepts you are dealing with.
> -Hoss

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message