lucene-dev mailing list archives

From DM Smith <dmsmith...@gmail.com>
Subject Document proximity
Date Wed, 30 Mar 2005 12:25:01 GMT
Hi,

I hope I am posting to the right list.

We (sword and jsword at crosswire.org) are indexing Bibles, with each 
verse becoming a document: the verse text is indexed and the verse 
reference is stored. This way we can search the text and find out 
which verses have hits.

The problem is that a verse is an artificial document boundary.

Frequently, verses cut a paragraph into parts or a poem into stanzas, 
and the significant passages span several verses. (But we usually don't 
have these larger boundaries in our markup.)

Is there any thought of adding a NEAR operator that will work across 
documents?

Specifically, find x NEAR y, where the distance given to NEAR is 
measured not in words but in documents.
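
To make the request concrete, here is a minimal sketch of NEAR with a document distance, done as post-processing outside Lucene. It assumes each verse has a sequential ordinal (its position in the text) and that the ordinals of the verses matching x and y have already been collected into sorted arrays by two ordinary searches. The class name DocumentProximity and the method near are hypothetical and are not part of Lucene, nor are they necessarily how our existing solution works.

```java
import java.util.ArrayList;
import java.util.List;

public class DocumentProximity {

    /**
     * Returns every pair {xDoc, yDoc} whose ordinals differ by at most
     * maxDistance documents. Both input arrays must be sorted ascending.
     * (Hypothetical helper; the ordinals are assumed to come from two
     * separate term searches.)
     */
    public static List<int[]> near(int[] xDocs, int[] yDocs, int maxDistance) {
        List<int[]> pairs = new ArrayList<>();
        int start = 0; // first index in yDocs that can still be in range
        for (int x : xDocs) {
            // Ordinals too far before this x are also too far before any later x.
            while (start < yDocs.length && yDocs[start] < x - maxDistance) {
                start++;
            }
            // Collect every y within maxDistance documents of x.
            for (int j = start; j < yDocs.length && yDocs[j] <= x + maxDistance; j++) {
                pairs.add(new int[] {x, yDocs[j]});
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        int[] xDocs = {3, 10, 42};     // verses containing x
        int[] yDocs = {4, 20, 41, 43}; // verses containing y
        for (int[] p : near(xDocs, yDocs, 1)) {
            System.out.println(p[0] + " NEAR " + p[1]);
        }
    }
}
```

With a distance of 1, the sample data pairs verse 3 with 4, and verse 42 with both 41 and 43. A distance of 0 would reduce to an ordinary AND within a single verse.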

(We do have a solution that stands entirely outside of Lucene, but it 
would be better (for us :) if Lucene had the capability.)

It would also be good if search could automatically treat adjacent 
documents as one continuous flow unless some token in a document 
interrupts the flow. In that case, search would return a compound 
document as a hit.
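
As a rough illustration of the compound-document idea, the sketch below groups adjacent documents into passages, starting a new passage wherever a document carries a flow-interrupting marker. The startsNewPassage flags are an assumption (as noted above, our markup usually lacks such markers), and the class FlowingPassages is hypothetical rather than anything Lucene provides.

```java
import java.util.Arrays;

public class FlowingPassages {

    /**
     * Assigns a passage id to each document, in document order. A new
     * passage begins at document 0 and at every document flagged in
     * startsNewPassage; all other documents flow on from their predecessor.
     */
    public static int[] passageIds(boolean[] startsNewPassage) {
        int[] ids = new int[startsNewPassage.length];
        int current = -1;
        for (int doc = 0; doc < ids.length; doc++) {
            if (doc == 0 || startsNewPassage[doc]) {
                current++;
            }
            ids[doc] = current;
        }
        return ids;
    }

    public static void main(String[] args) {
        // Documents 0-2 flow together; document 3 interrupts the flow.
        boolean[] breaks = {true, false, false, true, false};
        System.out.println(Arrays.toString(passageIds(breaks)));
        // prints [0, 0, 0, 1, 1]
    }
}
```

Two hits would then belong to the same compound document when their passage ids are equal, regardless of which verses they fall in.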
 
Thanks,
    DM Smith


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

