lucene-java-user mailing list archives

From catalin-luc...@dazoot.ro
Subject md5 keyword field issue
Date Sun, 19 Jun 2005 09:17:13 GMT
Hi there,

I have an index with the following fields:
url - keyword - new Field("url", this.url, Field.Store.YES, Field.Index.UN_TOKENIZED);
md5 - keyword - new Field("md5", this.md5, Field.Store.YES, Field.Index.UN_TOKENIZED);
alt - tokenized - new Field("alt", this.alt, Field.Store.YES, Field.Index.TOKENIZED);

I use it to index my images.
It can happen that the same image (i.e. the same md5) is used in different
locations (i.e. under different URLs).
For example, the file mylogo.gif is used at
http://site.com/project1/mylogo.gif and also at
http://site.com/project2/some_other_bubu/mylogo.gif

The ALT text differs between the two.

When I search for "mylogo" in my image search app, I get several
results showing the same image.

I would like to reduce the number of results so that each md5 appears
only once.
Note: I can't delete the second image from the index, because its ALT might
be different. In general, all the properties taken together (md5, url,
alt) make up a distinct "entity".
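
One way to get a unique md5 per result without deleting anything from the
index is to collapse duplicates after the search, while walking the ranked
hits. The following is only a minimal sketch of that idea in plain Java: the
Hit class and the dedupeByMd5 method are hypothetical names, and it assumes
the stored url, md5 and alt values have already been copied out of the
search results in rank order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class Md5Dedup {

    /** Hypothetical holder for the stored fields of one search hit. */
    static final class Hit {
        final String url;
        final String md5;
        final String alt;

        Hit(String url, String md5, String alt) {
            this.url = url;
            this.md5 = md5;
            this.alt = alt;
        }
    }

    /** Keeps the first hit for each md5 value and preserves the ranked order. */
    static List<Hit> dedupeByMd5(List<Hit> rankedHits) {
        Set<String> seenMd5 = new HashSet<String>();
        List<Hit> unique = new ArrayList<Hit>();
        for (Hit hit : rankedHits) {
            // Set.add returns false if this md5 was already seen.
            if (seenMd5.add(hit.md5)) {
                unique.add(hit);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        // Made-up sample data: the md5 strings are placeholders, not real digests.
        List<Hit> hits = Arrays.asList(
            new Hit("http://site.com/project1/mylogo.gif", "abc", "my logo"),
            new Hit("http://site.com/project2/some_other_bubu/mylogo.gif", "abc", "company logo"),
            new Hit("http://site.com/project1/other.gif", "def", "another image"));

        for (Hit hit : dedupeByMd5(hits)) {
            System.out.println(hit.url + " (" + hit.alt + ")");
        }
    }
}
```

Because the best-ranked duplicate is the one that survives, the ALT text
shown is the one that matched the query best. The cost is that the number of
displayed results is only known after the hits have been scanned.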


I bought the "Lucene in Action" book, which is a GREAT book,
and I was looking into filters.

I quote: "If all the information needed to perform filtering is in the
index, there is no need to write your own filter because QueryFilter
can handle it."

I can't figure out how QueryFilter can help me here.

I also tried to write my own filter, but there is not much information in that
direction either.
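
For context, a custom filter in Lucene versions of this period hands back a
java.util.BitSet with one bit per document id, and only documents whose bit
is set can appear in the results. The core of a "one document per md5" filter
can be sketched in plain Java as below. This is an illustration only: the
class and method names are hypothetical, and the md5ByDocId array stands in
for values that a real filter would have to read from the index.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

final class FirstPerMd5Bits {

    /**
     * Builds a bit set that marks the first document seen for each md5.
     * md5ByDocId[i] is assumed to hold the md5 of document i.
     */
    static BitSet firstDocPerMd5(String[] md5ByDocId) {
        BitSet bits = new BitSet(md5ByDocId.length);
        Set<String> seenMd5 = new HashSet<String>();
        for (int docId = 0; docId < md5ByDocId.length; docId++) {
            // Only the first document carrying a given md5 gets its bit set.
            if (seenMd5.add(md5ByDocId[docId])) {
                bits.set(docId);
            }
        }
        return bits;
    }
}
```

A bit set built this way does not depend on the query. If the query matches
only the ALT text of a duplicate whose bit is not set, that image disappears
from the results entirely, so this approach fits the "different ALT" case
less well than collapsing duplicates after the search.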


Any info, links, or thoughts would be highly appreciated!

-- 
Catalin Constantin
http://www.dazoot.ro/

