From: Rebecca Watson
Date: Wed, 9 Jun 2010 10:32:12 +0800
Subject: Re: retrieving Payload 3.0.1
To: "java-user@lucene.apache.org"

Hi Aad,

See the search.payloads package (org.apache.lucene.search.payloads) if you want examples of reading payloads at query time for scoring purposes. Returning the payload, or using it to highlight, will require you to write more custom Lucene classes.

We work with synonyms too, but rather than storing the synonym in a payload as you do, we index any number of synonyms at the same position, i.e. the standard trick for index-time synonym injection. This is better because phrase queries will match across synonyms in the same doc.

However, we define synonyms / equivalent terms prior to indexing, so we don't use an analyzer to add the extra terms. Instead we use an analyzer that sets the position increment attribute to 1 for the first term after a special separator character and to 0 for every other term. For an example of setting the position increment attribute, see:

org.apache.lucene.analysis.position.PositionFilter

So if we use '/' to indicate a new position, the text we index looks like:

... / institute organisation / ....

We apply this filter early on and it's still compatible with other Lucene analysers.
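The position bookkeeping described above can be sketched in plain Java, with no Lucene dependency. This is only an illustration under stated assumptions: the names SlashPositions, PositionedTerm and assign are made up for this example and are not Lucene classes, and input is assumed to be whitespace-separated with the separator as its own token. A real implementation would be a TokenFilter that sets the position increment attribute on each token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: models which position each term
// would get when a separator token marks the start of a new position.
class SlashPositions {

    /** One term with its position increment and resulting absolute position. */
    static final class PositionedTerm {
        final String term;
        final int increment;
        final int position;

        PositionedTerm(String term, int increment, int position) {
            this.term = term;
            this.increment = increment;
            this.position = position;
        }

        @Override
        public String toString() {
            return term + " (increment " + increment + ", position " + position + ")";
        }
    }

    /**
     * Splits the input on whitespace. The first term after a separator gets
     * increment 1; every following term up to the next separator gets
     * increment 0, so it shares the position of the term before it.
     */
    static List<PositionedTerm> assign(String input, String separator) {
        List<PositionedTerm> result = new ArrayList<>();
        int position = -1;               // the first increment of 1 lands on position 0
        boolean startNewPosition = true; // the very first term opens a position
        for (String token : input.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;                // empty input yields a single empty token
            }
            if (token.equals(separator)) {
                startNewPosition = true; // separator itself is never indexed
                continue;
            }
            int increment = startNewPosition ? 1 : 0;
            position += increment;
            result.add(new PositionedTerm(token, increment, position));
            startNewPosition = false;
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "research / institute organisation / funding";
        for (PositionedTerm t : assign(text, "/")) {
            System.out.println(t);
        }
    }
}
```

In this model 'institute' and 'organisation' end up at the same position, which is what lets a phrase query match either term in that slot.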
If you injected new synonyms in an analyzer instead, you would emit the original term as usual, then keep setting the term attribute for each synonym until they are all added. For the first synonym, make sure you set the position increment attribute to 0. Once you move on to the next term, change the position increment back to 1 and update the term attribute.

Hope that helps,

bec :)

On 08/06/2010, at 3:19 PM, Aad Nales wrote:

Hi All,

I am storing synonyms in an index, e.g. 'institute' as a synonym for 'organization'. Since I want to highlight the original term when showing the result, I am storing a Payload with each synonym. So in this case the term 'institute' has a payload for 'organization'.

I execute a search and the document is found. Now I want to create the highlighting, and here is where it goes wrong: I am unable to figure out how to 'read' the Payload from the document. Perhaps I am looking at it the wrong way? What I want to avoid is having to expand my search query with the synonyms.

Is there anybody who could give me a hint on how to go about this in Lucene 3.0.1?

tia,
Aad