John,
It actually should be pretty easy to use just the parts of Lucene you
want (the analyzers, etc) without using the rest. See the example of
the PorterStemmer from this article:
http://www.onjava.com/pub/a/onjava/2003/01/15/lucene.html?page=2
You could feed a Reader to the tokenStream() method of
PorterStemAnalyzer, and get back a TokenStream, from which you pull
the tokens using the next() method.
On Wed, 17 Nov 2004 18:54:07 -0500, jeichels@optonline.net
<jeichels@optonline.net> wrote:
>
> Is there a way to use Lucene stemming and stop word removal without using the rest of
the tool? I am downloading the code now, but I imagine the answer might be deeply burried.
I would like to be able to send in a phrase and get back a collection of keywords if possible.
>
> I am thinking of using an intermediary solution before moving fully to Lucene. I don't
have time to spend a month making a carefully tested, administratable Lucene solution for
my site yet, but I intend to do so over time. Funny thing is the Lucene code likely would
only take up a couple hundred of lines, but integration and administration would take me much
more time.
>
> In the meantime, I am thinking I could use perhaps Lucene steming and parsing of words,
then stick each search word along with the associated primary key in an indexed MySql table.
Each record I would need to do this to is small with maybe only average 15 userful words.
I would be able to have an in-database solution though ranking, etc would not exist. This
is better then the exact word searching i have currently which is really bad.
>
> By the way, MySql 4.1.1 has some Lucene type handling, but it too does not have stemming
and I am sure it is very slow compaired to Lucene. Cpanel is still stuck on MySql 4.0.*
so many people would not have access to even this basic ability in production systems for
some time yet.
>
> JohnE
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
|