lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Retrieving payload attribute in highlighter
Date Tue, 30 Nov 2010 20:49:56 GMT
Warning, ignorance alert. I'm not all that up on the guts of this one.

But take a look at MemoryIndex, there's an example there. The gist
is that you create a MemoryIndex on the fly and index the doc in question
into it, then you can get the IndexReader from the IndexSearcher associated
with the MemoryIndex. All this is similar to how Highlighter works, and it
looks like you have access to the original input?
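
A minimal sketch of the MemoryIndex idea described above, assuming the
Lucene memory contrib jar is on the classpath. The names SingleDocReader and
readerFor are hypothetical, and the Analyzer is passed in because its
constructors differ between Lucene versions. Whether payloads are kept by
MemoryIndex may depend on the Lucene version, so this only shows how to get
an IndexReader that contains the single document of interest.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.search.IndexSearcher;

// Hypothetical helper, not part of Lucene.
public final class SingleDocReader {

    private SingleDocReader() {}

    // Indexes one field of one document into a throwaway in-memory index
    // and returns a reader that sees only that document.
    public static IndexReader readerFor(String fieldName, String text,
                                        Analyzer analyzer) {
        MemoryIndex index = new MemoryIndex();
        // Should be the same analyzer that produced the original tokens.
        index.addField(fieldName, text, analyzer);
        IndexSearcher searcher = index.createSearcher();
        return searcher.getIndexReader();
    }
}
```

The returned reader could then be handed to whatever needs a reader that
"should only contain doc of interest", such as PayloadSpanUtil.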

Now, this is largely a guess, so don't waste time if I'm really off base
with this.

Best
Erick

On Tue, Nov 30, 2010 at 2:16 PM, Fabiano Nunes <fabiano@nunes.me> wrote:

> Ok. I'll go ahead.
> Just one more thing: the apidocs warning says "(...) IndexReader should
> only contain doc of interest, best to use MemoryIndex (..)".
> How can I build a reader with a subset of docs?
>
> Thanks!
>
> On Tue, Nov 30, 2010 at 2:31 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>
> > That's the Lucene developers' way of saying "we don't guarantee backwards
> > compatibility".
> >
> > The devs go to great lengths to honor the contract of not changing
> > public APIs without going through a deprecation process, which causes
> > quite a lot of work. But that conflicts with the desirable process of
> > having real-live users get a look at new code. So this warning merely
> > says "the next compilation of Lucene may have a different API, so don't
> > blame us if you have to recompile against a new jar". OK, I've played
> > pretty loose with the words, but that's the intent.
> >
> > By and large, though, the functionality will be there in new versions,
> > but you may have to change the calls you use if you get new jars.
> >
> > I'd go ahead and use the class/method with confidence that something
> > similar will be there in the future, but you may have to recompile if
> > you get new jars.
> >
> > Best
> > Erick
> >
> > On Tue, Nov 30, 2010 at 11:06 AM, Fabiano Nunes <fabiano@nunes.me> wrote:
> >
> > > I've found the PayloadSpanUtil class. It's exactly what I was looking
> > > for. But I'm concerned about the warning message in the API docs
> > > (indeed, I think I don't understand it). Is there any other approach?
> > > Can I get the same results by retrieving the termPositions, without
> > > performance issues?
> > >
> > > Thanks.
> > >
> > > On Tue, Nov 30, 2010 at 1:20 PM, Fabiano Nunes <fabiano@nunes.me> wrote:
> > >
> > > > Hello,
> > > > I'm trying to retrieve payloads from the terms highlighted by the
> > > > Highlighter class. In my tests, every term returned from Highlighter
> > > > has a null payload.
> > > > Example:
> > > >
> > > > Highlighter h = new Highlighter(new Formatter() {
> > > >     public String highlightTerm(String originalText, TokenGroup tokenGroup) {
> > > >         Token token = tokenGroup.getToken(0);
> > > >         Payload payload = token.getPayload();
> > > >         assertNotNull(payload); // <---------------- payload is always null
> > > >         return originalText;
> > > >     }
> > > > }, scorer);
> > > >
> > > > It seems that Highlighter removes the payload attribute from all
> > > > terms before creating its TokenGroup.
> > > > How can I preserve it?
> > > >
> > > > Thanks.
