From: "Uwe Schindler"
Reply-To: java-user@lucene.apache.org
Subject: RE: Payloads
Date: Sun, 20 Dec 2009 16:06:20 +0100

The problem was already solved in the #lucene IRC channel. The behaviour of
PayloadTermQuery is correct when you compare the scores of documents with an
even-position match and a non-even-position match in the *same* query.

In general: you cannot compare scores across different queries or different
indexes.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Elias Khsheibun [mailto:elias3@gmail.com]
> Sent: Sunday, December 20, 2009 2:51 PM
> To: java-user@lucene.apache.org
> Subject: RE: Payloads
>
> I'm trying to run queries now. The problem is that BoostingTermQuery
> always gives double weight to even-position terms, regardless of whether
> the query itself contains the term. Here is the code that I'm using:
>
> public class DocumentAnalyzer extends Analyzer {
>
>     @Override
>     public TokenStream tokenStream(String fieldName, Reader reader) {
>         TokenStream result = new WhitespaceTokenizer(reader);
>         result = new TermPositionPayloadTokenFilter(result);
>
>         return result;
>     }
>
> }
>
>
> public class TermPositionPayloadTokenFilter extends TokenFilter {
>
>     protected PayloadAttribute payAtt;
>     protected PositionIncrementAttribute posIncrAtt;
>
>     private static final Payload evenPayload =
>         new Payload(PayloadHelper.encodeFloat(2.0f));
>
>     private int termPosition = 0;
>
>     public TermPositionPayloadTokenFilter(TokenStream input) {
>         super(input);
>         payAtt = (PayloadAttribute) addAttribute(PayloadAttribute.class);
>         posIncrAtt = (PositionIncrementAttribute)
>             addAttribute(PositionIncrementAttribute.class);
>     }
>
>     @Override
>     public final boolean incrementToken() throws IOException {
>         if (input.incrementToken()) {
>             if ((termPosition % 2) == 0)
>                 payAtt.setPayload(evenPayload);
>             termPosition += posIncrAtt.getPositionIncrement();
>             return true;
>         } else {
>             return false;
>         }
>     }
>
> }
>
>
> public class BoostingSimilarity extends DefaultSimilarity {
>     public float scorePayload(String fieldName, byte[] payload,
>                               int offset, int length) {
>         if (payload != null)
>             return PayloadHelper.decodeFloat(payload, offset);
>         else
>             return 1.0F;
>     }
> }
>
> And this is a test I've written. If you look at the scores, you will
> notice that BoostingTermQuery always gives double weight to even terms,
> no matter whether they appear in the query or not (this is my current
> problem):
>
> public class PayloadsTest extends TestCase {
>     Directory dir;
>     IndexWriter writer;
>     DocumentAnalyzer analyzer;
>
>     protected void setUp() throws Exception {
>         super.setUp();
>         dir = new RAMDirectory();
>         analyzer = new DocumentAnalyzer();
>         writer = new IndexWriter(dir, analyzer,
>             IndexWriter.MaxFieldLength.UNLIMITED);
>     }
>
>     protected void tearDown() throws Exception {
>         super.tearDown();
>         writer.close();
>     }
>
>     void addDoc(String title, String contents) throws IOException {
>         Document doc = new Document();
>         doc.add(new Field("title",
>                           title,
>                           Field.Store.YES,
>                           Field.Index.NO));
>
>         doc.add(new Field("contents",
>                           contents,
>                           Field.Store.NO,
>                           Field.Index.ANALYZED));
>
>         writer.addDocument(doc);
>     }
>
>     public void testBoostingTermQuery() throws Throwable {
>         addDoc("Hurricane warning",
>             "A hurricane warning was issued at 6 AM for the outer great banks");
>         addDoc("Warning label maker",
>             "The warning label maker is a delightful toy for your precocious six year old's warning needs");
>         addDoc("Tornado warning",
>             "There is a tornado warning for Worcester county until 6 PM today");
>         writer.commit();
>         IndexSearcher searcher = new IndexSearcher(dir);
>         searcher.setSimilarity(new BoostingSimilarity());
>         Term warning = new Term("contents", "tornado");
>         Query query1 = new TermQuery(warning);
>         System.out.println("\nTermQuery results:");
>
>         ScoreDoc[] hits = searcher.search(query1, 10).scoreDocs;
>         for (int i = 0; i < hits.length; i++) {
>             Document hitDoc = searcher.doc(hits[i].doc);
>             System.out.println(hitDoc.get("title"));
>         }
>         Query query2 = new BoostingTermQuery(warning);
>         System.out.println("\nBoostingTermQuery results:");
>
>         ScoreDoc[] hits2 = searcher.search(query2, 10).scoreDocs;
>         for (int i = 0; i < hits2.length; i++) {
>             Document hitDoc = searcher.doc(hits2[i].doc);
>             System.out.println(hitDoc.get("title"));
>         }
>     }
> }
>
>
> -----Original Message-----
> From: AHMET ARSLAN [mailto:iorixxx@yahoo.com]
> Sent: Saturday, December 19, 2009 11:19 PM
> To: java-user@lucene.apache.org
> Subject: RE: Payloads
>
> > If I need to override the QueryParser to return PayloadTermQuery, what
> > function for PayloadFunction should I use in the constructor (If you can
> > show me an example).
>
> I am not sure about that. Maybe custom one.
>
> > In your code I didn't see an indexer, will this work with the regular
> > IndexWriter but with the new Analyzer that you overloaded
>
> No, at index time [IndexWriter] you are going to use a new analyzer that
> uses WhitespaceTokenizer + TermPositionPayloadTokenFilter.
>
> PayloadAnalyzer will be used at query time. [QueryParser]
>
> You need to setSimilarity(new CustomSimilarity) of both indexer and
> searcher.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org