lucene-solr-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Solr Wiki] Update of "Payloads" by JanHoydahl
Date Tue, 01 Mar 2011 14:46:53 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The "Payloads" page has been changed by JanHoydahl.
The comment on this change is: Describing payloads.
http://wiki.apache.org/solr/Payloads

--------------------------------------------------

New page:
Payloads are byte arrays that can optionally be stored with each term position in a field.
Payloads serve several use cases in Solr, such as boosting certain terms over others. Another
use is [[http://en.wikipedia.org/wiki/Part-of-speech_tagging|POS tagging]].

=== Term boosting with Payloads ===
Imagine the following use cases:
* You index a large HTML page in your body field, but want to boost words in headings and in boldface
* You want to boost all nouns (say you have a clever client-side parser which detects nouns)
* Indexing German content, you want to boost all names more than other (capitalized) nouns
* You have a product called "Word" and want to boost all its occurrences over the lowercase "word" :)

All these cases may be solved by introducing payloads in your scoring, provided you have the
client-side magic to detect the words. The beauty is that since the boost is stored with the term,
you do not need any heavy parsing or calculation at query time. All documents originally containing
"Word" would rank higher in your results, even if everything is lowercased on both the index
and query side. And even if HTML markup is lost after parsing, your parser would already have
tagged the headings and boldface words before removing the markup.
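
As a rough illustration of the client-side tagging step described above, here is a minimal Python sketch. It assumes the boosts are handed to the index as `term|boost` tokens, a form that a delimited-payload token filter (such as Lucene's `DelimitedPayloadTokenFilter`) can split into term and payload. The function name `tag_terms`, the tag-to-boost table, and the boost values are hypothetical and chosen only for this example; a real parser would need proper HTML handling.

```python
import re

# Hypothetical boost values, chosen only for illustration.
DEFAULT_BOOST = 1.0
TAG_BOOSTS = {"h1": 3.0, "h2": 3.0, "b": 2.0, "strong": 2.0}


def tag_terms(html, delimiter="|"):
    """Turn a small HTML snippet into 'term|boost' tokens.

    Words inside headings or boldface get a higher boost; the markup
    itself is dropped, and terms are lowercased as the analyzer would.
    """
    tokens = []
    active = []  # boosting tags that are currently open
    # Splitting on a capturing group keeps the tags in the result list.
    for part in re.split(r"(<[^>]+>)", html):
        if not part:
            continue
        tag = re.fullmatch(r"<\s*(/?)\s*([a-zA-Z0-9]+)[^>]*>", part)
        if tag:
            closing, name = tag.group(1), tag.group(2).lower()
            if name in TAG_BOOSTS:
                if closing:
                    # Close the most recently opened tag of this name.
                    for i in range(len(active) - 1, -1, -1):
                        if active[i] == name:
                            del active[i]
                            break
                else:
                    active.append(name)
            continue
        # Plain text: use the strongest boost among the open tags.
        boost = max((TAG_BOOSTS[t] for t in active), default=DEFAULT_BOOST)
        for word in re.findall(r"\w+", part):
            tokens.append(f"{word.lower()}{delimiter}{boost}")
    return " ".join(tokens)


if __name__ == "__main__":
    print(tag_terms("<h1>Solr Payloads</h1><p>Use <b>payloads</b> to boost terms</p>"))
```

The resulting string would be sent as the field value, so the boost information survives even though the markup and the original casing are gone by the time the terms reach the index.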

For a step-by-step description of how to enable payload boosting, see Lucid's blog post [[http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/|here]].
