lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ian Lea <ian....@gmail.com>
Subject Re: Can't get tokenization/stop works working
Date Mon, 01 Feb 2010 09:27:52 GMT
If you make com a stop word then you won't be able to search for it,
but a search for fubar should have worked.  Are you sure your analyzer
is doing what you want?  You don't tell us what analyzer you are
using.

Tips:
  use Luke to see what has been indexed
  read the FAQ entry
http://wiki.apache.org/lucene-java/LuceneFAQ#Why_am_I_getting_no_hits_.2BAC8_incorrect_hits.3F


--
Ian.

On Mon, Feb 1, 2010 at 7:25 AM, jchang <jchangkihatest@gmail.com> wrote:
>
> I want to be able to store a doc with a field with this as a substring:
>  www.fubar.com
> And then I want this document to get returned when I query on
>  fubar or
>  fubar.com
>
> I assume what I should do is make www and com stop words, and make sure the
> field is tokenized, so it wil break it up along the '.'
>
> I thought  I should take a list of Enlisgh stop words, add in 'www' and com,
> and then make sure the field is tokenized, which I did by using this
> constructor:
> new Field("name", "value",  Field.Store.YES, Field.Index.Analyzed).
> I saw that Field.Index.Analyzed meant it would be tokenized.
>
> It is not working.  Searching on fubar or fubar.com does not return it.
> Thanks for any help.
> --
> View this message in context: http://old.nabble.com/Can%27t-get-tokenization-stop-works-working-tp27400546p27400546.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message