lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rebecca Watson <bec.wat...@gmail.com>
Subject Re: retrieving Payload 3.0.1
Date Wed, 09 Jun 2010 02:32:12 GMT
Hi aad,

See the search.payload package if you want examples of reading in payloads
at query time for scoring purposes, but returning the payload/ using it to
highlight will require you to write more custom lucene classes.

We work with synonyms too, but rather than store the synonym in payload like
you do, we store any number of synonyms in the same position ie the standard
trick for index-time synonym injection -- better because phrase queries will
match across synonyms in the same doc. However, we define synonyms /
equivalent terms prior to indexing so we don't use an anaylzer to add the
extra terms-- instead we use an analyzer that sets the position increment
attribute to 1 if it hits a special character and otherwise sets it to 0.
See for eg of setting position attribute:
org.apache.lucene.analysis.position.PositionFilter

So eg if we use '/' to indicate new position we would use:

 ... / institute organisation / ....

We apply this filter early on and it's still compatible with other
lucene analysers.

If you injected new synonyms in an analyzer, you would keep setting
the term attribute until new synonyms are all added,

then once you move to the next term, change position increment to 1
and update the term attribute. For the first synonym, make sure you
set the position increment attribute back to 0.

Hope that helps,

bec :)


Sent from my iPhone

On 08/06/2010, at 3:19 PM, Aad Nales <aad.nales@gmail.com> wrote:

Hi All,

I storing synonyms in an index. e.g. 'institute' as a synonym for
'organization'. Since I want to highlight the orginal term when
showing the result i am storing a Payload with each synononym. So in
this case the term 'institute' has a payload for 'organization'. I
execute a search and the document is found. Now i want to create the
highlighting and here is where it goes wrong. I am unable to figure
out how to 'read' the Payload from the document. Perhaps i am looking
at it the wrong way? What i want to avoid is having to expand my
search query with the synonyms. Is there anybody who could give me a
hint how to go about this in lucene 3.0.1

tia,
Aad

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message