lucene-java-user mailing list archives

From "Chong, Herb" <>
Subject RE: inter-term correlation [was Re: Vector Space Model in Lucene?]
Date Mon, 17 Nov 2003 13:58:46 GMT
You cannot layer sentence boundary detection on top of Lucene and post-process the hit list
without effectively building a completely new search engine index. If I am going to go to
that trouble, there is no point in using Lucene at all.


-----Original Message-----
From: Tatu Saloranta []
Sent: Friday, November 14, 2003 8:30 PM
To: Lucene Users List
Subject: Re: inter-term correlation [was Re: Vector Space Model in Lucene?]

Hmmh? You implied that there are some useful distance heuristics (words
5 words apart or more correlate much less), and others have pointed out that
Lucene has many useful components.

Building a more complex system from small components is usually considered a
Good Thing (tm), not an "ad hoc solution". In fact, I would guess most
experienced people around here start with the Lucene defaults and build their
own systems gradually, customizing more and more of the pieces.
There may be actual fundamental problems with Lucene regarding the
approach you'd prefer, but I don't think it makes sense to brush off
suggestions regarding distance & fuzzy/sloppy queries by claiming they are
"just hacks".
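
As a rough illustration of the kind of post-processing being debated in this thread, here is a minimal sketch in plain Python. It is not Lucene code and not either poster's implementation. It assumes a naive punctuation-based sentence splitter, a lowercase alphanumeric tokenizer, and the "fewer than 5 words apart" threshold mentioned above; the function names are hypothetical.

```python
import re


def split_sentences(text):
    # Naive assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    # Lowercased alphanumeric runs; real analyzers are considerably more careful.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def min_term_distance(tokens, a, b):
    """Smallest gap in token positions between distinct terms a and b, or None."""
    last_a = last_b = best = None
    for i, tok in enumerate(tokens):
        if tok == a:
            last_a = i
        elif tok == b:
            last_b = i
        else:
            continue
        if last_a is not None and last_b is not None:
            gap = abs(last_a - last_b)
            if best is None or gap < best:
                best = gap
    return best


def cooccur_within(text, a, b, max_gap=5):
    """True if a and b occur in the same sentence fewer than max_gap tokens apart."""
    for sentence in split_sentences(text):
        gap = min_term_distance(tokenize(sentence), a, b)
        if gap is not None and gap < max_gap:
            return True
    return False
```

A filter like this needs the stored text (or token positions) of each hit, which is the cost the reply at the top of this message objects to. Whether that amounts to "a completely new index" is the point under dispute.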
