Date: Sun, 19 Jun 2005 12:17:13 +0300
From: catalin-lucene@dazoot.ro
Reply-To: Catalin Constantin
Organization: Bounce Software
Message-ID: <1779000970.20050619121713@bounce-software.com>
To: java-user@lucene.apache.org
Subject: md5 keyword field issue

Hi there,

I have an index with the following fields:

url - keyword - Field("url", this.url,
Field.Store.YES, Field.Index.UN_TOKENIZED);
md5 - keyword - Field("md5", this.md5, Field.Store.YES, Field.Index.UN_TOKENIZED);
alt - Field("alt", this.alt, Field.Store.YES, Field.Index.TOKENIZED);

I use it to index my images. It happens that the same image (i.e. the same md5) is used in different locations (i.e. under different URLs). For example, the file mylogo.gif is used at

http://site.com/project1/mylogo.gif

and also at

http://site.com/project2/some_other_bubu/mylogo.gif

and the ALT text is different in each place.

In my image search app, when I search for "mylogo" I get several results showing the same image. I would like to reduce the number of results so that each md5 appears only once.

Note: I can't delete the second image from the index, because its ALT might be different. In general, all the properties put together (md5, url, alt) make up a distinct "entity".

I bought the "Lucene in Action" book, which is a GREAT book, and I was looking into filters. I quote:

"If all the information needed to perform filtering is in the index, there is no need to write your own filter because QueryFilter can handle it."

I can't figure out how QueryFilter can help me here. I also tried to write my own filter, but there is not much information in that direction either.

Any info, links, or thoughts would be highly appreciated!

--
Catalin Constantin
http://www.dazoot.ro/
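
The de-duplication asked for above can be sketched outside Lucene as a post-processing step over the ranked hits: walk the results in rank order and keep only the first hit for each md5. This is only a minimal sketch and not a QueryFilter-based solution. ImageHit, Md5Dedup and uniqueByMd5 are hypothetical names, the md5 strings and the third URL are placeholders, and the code assumes the md5, url and alt values have already been read from the stored fields of each hit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for one search hit; this is not a Lucene class. */
final class ImageHit {
    final String md5;
    final String url;
    final String alt;

    ImageHit(String md5, String url, String alt) {
        this.md5 = md5;
        this.url = url;
        this.alt = alt;
    }
}

final class Md5Dedup {
    /** Keeps the first (best-ranked) hit for each md5, preserving input order. */
    static List<ImageHit> uniqueByMd5(List<ImageHit> rankedHits) {
        Set<String> seen = new HashSet<String>();
        List<ImageHit> unique = new ArrayList<ImageHit>();
        for (ImageHit hit : rankedHits) {
            // Set.add returns false when this md5 has already been seen.
            if (seen.add(hit.md5)) {
                unique.add(hit);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        // Placeholder data: two hits share an md5, one does not.
        List<ImageHit> hits = new ArrayList<ImageHit>();
        hits.add(new ImageHit("aaa", "http://site.com/project1/mylogo.gif", "project one logo"));
        hits.add(new ImageHit("aaa", "http://site.com/project2/some_other_bubu/mylogo.gif", "another logo text"));
        hits.add(new ImageHit("bbb", "http://site.com/other.gif", "other image"));
        for (ImageHit hit : uniqueByMd5(hits)) {
            System.out.println(hit.md5 + " " + hit.url);
        }
    }
}
```

Because only the result list is collapsed, the index still holds every (md5, url, alt) entry, so the other URLs and ALT texts remain searchable.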