lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Writing an Analyzer for storing and retrieving a payload (was: Storing additional Metadata with Fields)
Date Fri, 15 Oct 2010 18:13:17 GMT
Have you seen
http://www.lucidimagination.com/blog/2009/08/05/getting-started-with-payloads/ ?

Also, I don't think a payload is stored for a term unless one is explicitly
set on it. And even if one were stored for every term, is your index big
enough for the extra size to matter?
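
To put a rough number on the size question: a payload is just a byte array
attached to a term position, so the x, y and page values from the original
question could be packed into a handful of bytes. Below is a minimal sketch in
plain Java, assuming a hypothetical 12-byte layout (two floats and an int). The
class name PositionPayload and the layout are illustrative and not part of
Lucene; wiring the bytes into a TokenFilter through Lucene's payload support is
left out here.

```java
import java.nio.ByteBuffer;

public class PositionPayload {
    // Hypothetical layout: x and y as 32-bit floats, page as a 32-bit int.
    public static final int SIZE = 4 + 4 + 4;

    // Pack the three values into a fixed-size byte array (big-endian by default).
    public static byte[] encode(float x, float y, int page) {
        return ByteBuffer.allocate(SIZE).putFloat(x).putFloat(y).putInt(page).array();
    }

    // Read each value back from its fixed offset in the array.
    public static float x(byte[] payload) { return ByteBuffer.wrap(payload).getFloat(0); }
    public static float y(byte[] payload) { return ByteBuffer.wrap(payload).getFloat(4); }
    public static int page(byte[] payload) { return ByteBuffer.wrap(payload).getInt(8); }

    public static void main(String[] args) {
        byte[] p = encode(72.5f, 540.0f, 3);
        System.out.println(p.length + " bytes: x=" + x(p) + " y=" + y(p) + " page=" + page(p));
    }
}
```

With this layout each term that actually carries a payload costs 12 extra
bytes per position before any index compression, so attaching it only to the
words you need positions for keeps the overhead proportional to those words.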

2010/10/15 Christoph Hermann <hermann@informatik.uni-freiburg.de>

> On Thursday, 14 October 2010, 14:43:41, Christoph Hermann wrote:
>
> Hello,
>
> > It seems a payload gets added to
> > every term in the index, so in my case I would store the x, y and page
> > values for every word and increase the index size much more than I need.
> > Any approach for preventing this?
> >
> > And when searching, how can I access the payloads when displaying the
> > result? I haven't found information on that so far.
>
> Is there any example on how to use payloads?
> And the above questions are still valid.
>
> My current problem is that I've written a ContentHandler that parses the
> extended HTML from Tika and sets boost values on the fields it creates, but
> it seems that I need to move all of this to the Analyzer, since setting
> boosts on Fields with the same name has no real effect?
> I.e.
> add(new Field("contents","foo"))
> add(new Field("contents","bar").setBoost(1.5f))
>
> => results in a single "contents" field with one common boost value?
>
> If I'm correct, how would I proceed to achieve the desired effect?
>
> Put all the HTML from the <body> (from Tika) in one content field, and let
> the Analyzer do the work?
>
> Is there an example of an Analyzer that uses payloads available somewhere?
>
> regards
> Christoph Hermann
>
> --
> Christoph Hermann
> Institut für Informatik
> Tel: +49 761-203-8171 Fax: +49 761-203-8162
> e-mail: hermann@informatik.uni-freiburg.de
