lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Peter Laurinc" <laur...@felisconsulting.com>
Subject RE: Sentence and Paragraph searching
Date Fri, 01 Jul 2005 15:45:33 GMT
Maybe the solution is have to each term not only position but also
something like vector. Then you can "vectorize it":
term 1 has vector 1, 1 term 2 has vector 1, 1 (1 paragraph, 1 sentence
of this paragraph) , term 3 has (1, 2)
if you set query for searching in paragraph/sentence you only set what
portion of vector must be same. 

Is this the way? 

-----Original Message-----
From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
Sent: Friday, July 01, 2005 4:04 PM
To: java-user@lucene.apache.org
Subject: Re: Sentence and Paragraph searching


On Jul 1, 2005, at 8:16 AM, Peter Laurinc wrote:

> Hi,
>
> I'm newbie to lucene.
> I wan to ask, how to implement search for phrase that must be in 
> sentence/paragraph.
> I did see som examples, that uses term position changing, but I think 
> that this is not the way, because it breaks classic proximity search.
> (if one word is on end and second of begining of next sentence)

It really depends on your needs.  If you never need proximity across  
sentence boundaries, then what's the issue?   Putting a large gap at  
sentence boundaries makes good sense for some needs.  Maybe not so for
your situation?

I'm definitely interested in what others have done with this sort of
thing.

At the extreme, if all you wanted was to find sentences and did not need
to query for terms in multiple sentences at one time then you could
index each sentence as a separate Document.

     Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message