lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@syr.edu>
Subject Re: Sentence boundary storage
Date Mon, 31 Oct 2005 02:15:28 GMT
Actually, I was thinking of writing something along the lines of a
Span*BoundaryQuery that would be more explicit than what is
described below.  For example, a SpanSentence query would let you say
that the terms must occur within two sentences.  Do you guys think it is
worthwhile to codify what is discussed below into a few convenience Span
queries, or should we just write it up better and put it on the wiki?

Any preference?
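
To make the "within two sentences" idea concrete, here is a minimal sketch of
the matching rule in plain Java. It is not Lucene code and not an existing
query class: the class name SentenceWindow, the method withinSentences, and the
marker token "_SB_" are all assumptions made for illustration. It assumes the
analyzer emits a marker token at the end of every sentence.

```java
import java.util.ArrayList;
import java.util.List;

class SentenceWindow {
    // Hypothetical marker token, assumed to be emitted at each sentence end at index time.
    static final String SENTENCE_BOUNDARY = "_SB_";

    /**
     * Returns true if some occurrence of a and some occurrence of b fall
     * inside a window of maxSentences consecutive sentences.
     */
    static boolean withinSentences(List<String> tokens, String a, String b, int maxSentences) {
        List<Integer> aSentences = new ArrayList<>();
        List<Integer> bSentences = new ArrayList<>();
        int sentence = 0; // index of the sentence the current token belongs to
        for (String token : tokens) {
            if (token.equals(SENTENCE_BOUNDARY)) {
                sentence++;
                continue;
            }
            if (token.equals(a)) aSentences.add(sentence);
            if (token.equals(b)) bSentences.add(sentence);
        }
        // Two terms share a window of N sentences when their sentence
        // indexes differ by less than N.
        for (int sa : aSentences) {
            for (int sb : bSentences) {
                if (Math.abs(sa - sb) < maxSentences) return true;
            }
        }
        return false;
    }
}
```

With maxSentences set to 2, terms in the same or in adjacent sentences match,
and terms separated by a full sentence do not.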

Doug Cutting wrote:

> Chris Hostetter wrote:
>
>> : One thing that I know has bogged me is when matching a phrase where I
>> : would expect mathematical formula (which is "just a subphrase"). I
>> : would have liked the phrase-query to extend as far as it wishes but not
>> : past a given token... would this be possible?
>> : Presumably a period token and this feature would have provided the same?
>>
>> I haven't tried it myself, but my reading of SpanQueries leads me to
>> believe you could accomplish what you want (and what Grant describes) by
>> inserting special Terms to denote
>> formula/sentence/paragraph/section/chapter boundaries, and then use
>> SpanNearQueries with a high slop in conjunction with a
>> SpanNotQuery using a SpanTermQuery for the boundary you don't want to
>> cross.
>
>
> I have not tried this either, but it was one of the use cases when 
> designing span queries.  So it should work.
>
> Doug
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
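
The approach Chris describes (a high-slop near match, minus any match that
contains a boundary term) can be modelled in plain Java as below. This is only
a sketch of the matching logic, not Lucene API usage; in Lucene the pieces
would be the SpanNearQuery, SpanNotQuery and SpanTermQuery classes named in the
thread. The class name BoundedNear, the method nearWithoutCrossing and the
marker token "_SB_" are assumptions, and "slop" is simplified here to the
number of tokens allowed between the two terms.

```java
import java.util.List;

class BoundedNear {
    // Hypothetical boundary token, assumed to be inserted by the analyzer.
    static final String BOUNDARY = "_SB_";

    /**
     * Returns true if a and b occur with at most slop tokens between them
     * and no boundary token anywhere inside the span they cover.
     */
    static boolean nearWithoutCrossing(List<String> tokens, String a, String b, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(a)) continue;
            for (int j = 0; j < tokens.size(); j++) {
                if (i == j || !tokens.get(j).equals(b)) continue;
                int start = Math.min(i, j);
                int end = Math.max(i, j);
                // "Near" part: reject pairs that are too far apart.
                if (end - start - 1 > slop) continue;
                // "Not" part: reject spans that contain a boundary token.
                if (!tokens.subList(start, end + 1).contains(BOUNDARY)) return true;
            }
        }
        return false;
    }
}
```

Setting slop high makes the boundary token the only real limit on the match,
which is the behaviour asked for in the original question.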
