lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Multiword Highlighting
Date Fri, 16 Feb 2007 17:47:27 GMT
That may be the difference then. I'm actually working with both a complete
index and a memory index, depending on what phase I'm in. It turns out that
I probably can't put the document in a memoryindex on the fly
because...well...because <G>... That said, though, I can pretty easily use
this as a base for my application, and many thanks for providing it in the
first place.... (the other Mark too).

Erick

On 2/16/07, Mark Miller <markrmiller@gmail.com> wrote:
>
> >
> >
> > 1> my test cases throw some exceptions with the code as-is. The
> > spans.get(0)
> > is a problem in that it's not guaranteed that the spans returned will
> > have
> > anything in them. Also, I don't think that the test for
> > reqSpans.get(0).next
> > in queryClauses[i].isRequired is correct (even if it doesn't throw
> > exceptions). Isn't the sense there that we want to include the spans
> > if we
> > *do* have entries??
>
> You should get a Spans object back no matter what, hence the .get(0). If
> the Spans object returned has no Spans in it, then the first call to
> next will return false.
> > 2> But more importantly, I think this throws things in the "span bucket"
> > across documents. Consicer two documents with text "a b c d e f" is in
> > one
> > document, and "x y z" is in another, and we query on "a AND z", it seems
> > like extractSpansFromTermQuery would return one span from each document,
> > which would satisfy the tests in getSpansFromBooleanQuery
> > inappropriately.
> This might be the case. I have not considered it...I am working with
> real hit highlighting and so I only work with a single document at a time.
> >
> > Is it just me or is working with Spans really intended to be "one pass
> > through and only forward"? There are several places in the
> SpansExtractor
> > code where we want to ask "are there any spans in here?". But to ask
> > that,
> > you have to call next(). Which changes the state of the Spans such
> > that you
> > have to be really careful when you use any Spans that have had this test
> > performed already and do a do..while (spans.next()); rather than a
> > while (
> > spans.next()) {}..... Ditto with skipTo.
> Could be. I'll do some testing.
>
> I haven't thrown any exceptions yet, but I am working with a single doc
> in a memoryindex. So far I have yet to see a problem. I will keep looking.
>
>
> - Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message