lucene-java-user mailing list archives

From Phil Whelan <phil...@gmail.com>
Subject Re: Searching doubt
Date Tue, 04 Aug 2009 15:37:39 GMT
On Tue, Aug 4, 2009 at 8:31 AM, Shai Erera<serera@gmail.com> wrote:
> Hi Darren,
>
> The question was, how given a string "aboutus" in a document, you can return
> that document as a result to the query "about us" (note the space). So we're
> mostly discussing how to detect and then break the word "aboutus" to two
> words.

When traversing Japanese text you have to use an algorithm similar to
searching a maze (keep left and retrace your steps). It's possible to
go a long way along a sentence before you find that the tokens you've
already picked out are invalid. Rough example...

thereallibrary
there allibrary
there all i brary (fail)
the reallibrary
the real library
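
The same walk can be sketched in code. This is only a minimal illustration of the "retrace your steps" idea, not how Lucene or any particular analyzer does it: the tiny WORDS set and the segment() function are made up for the example, and a real system would need a full dictionary and some way to rank competing splits.

```python
from typing import Optional

# Hypothetical toy dictionary; a real system would load a full word list.
WORDS = {"the", "there", "all", "i", "real", "library", "about", "us"}


def segment(text: str, words: set[str] = WORDS) -> Optional[list[str]]:
    """Split text into dictionary words by depth-first search with backtracking."""
    failed: set[int] = set()  # start offsets already known to be dead ends

    def walk(start: int) -> Optional[list[str]]:
        if start == len(text):
            return []  # consumed the whole string: success
        if start in failed:
            return None
        # Try the longest candidate first, then fall back to shorter ones.
        for end in range(len(text), start, -1):
            if text[start:end] in words:
                rest = walk(end)
                if rest is not None:
                    return [text[start:end]] + rest
        failed.add(start)  # nothing works from here, so retrace
        return None

    return walk(0)


print(segment("thereallibrary"))  # ['the', 'real', 'library']
print(segment("aboutus"))         # ['about', 'us']
```

With this toy dictionary the search first commits to "there", gets as far as "all" and "i", hits the dead end at "brary", and only then backs up to "the". Remembering the failed offsets keeps it from re-exploring the same dead end from a different path.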
