lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: trm.seekCeil() not giving proper value when used in MP Query for some words
Date Thu, 31 Oct 2013 14:00:54 GMT
You can instantiate StandardAnalyzer, passing an empty stopwords set.
Or make a custom analyzer that doesn't insert StopFilter ...
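
A rough sketch of what "index every word, drop nothing" could look like,
written in plain Java rather than against the Lucene API. The class
KeepEveryWordSketch and its tokenize helper are hypothetical names made up
for illustration; this is not StandardTokenizer, only a simplified stand-in
that splits on anything that is not a letter or digit and applies no stop
word list.

```java
import java.util.ArrayList;
import java.util.List;

public class KeepEveryWordSketch {

    // Hypothetical helper: emit lowercase runs of letters/digits and treat
    // every other character (whitespace, the bullet, punctuation) as a separator.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isLetterOrDigit(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // \u2022 is the bullet character from the text being indexed.
        String contents = "\u2022Check for old and vulnerable ports";
        System.out.println(tokenize(contents));
        // prints [check, for, old, and, vulnerable, ports]
    }
}
```

Words such as "for" and "and" survive because nothing filters them out, and
the bullet never becomes part of a token.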

I'm not aware of any changes in how WhitespaceAnalyzer(Tokenizer)
tokenizes between 3.6.x and 4.x; both versions seem to use
Character.isWhitespace to detect which characters to tokenize on.  So
it's odd you're seeing a difference in behavior between the two
versions.
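
To illustrate why splitting on Character.isWhitespace alone would leave
"check" out of the term dictionary, here is a small plain-Java sketch. It
does not use Lucene: WhitespaceTermsSketch and tokenize are hypothetical
names, a TreeSet stands in for the sorted terms, and TreeSet.ceiling stands
in for a ceiling-style seek. The words "checking" and "checks" were added to
the sample text as an assumption, to mirror the terms reported in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

public class WhitespaceTermsSketch {

    // Hypothetical helper: split only where Character.isWhitespace is true,
    // then lowercase each token.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                if (current.length() > 0) {
                    tokens.add(current.toString().toLowerCase(Locale.ROOT));
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString().toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Assumed sample text; \u2022 is the bullet character.
        String contents = "\u2022Check for old and vulnerable versions checking checks";
        TreeSet<String> terms = new TreeSet<>(tokenize(contents));

        System.out.println(Character.isWhitespace('\u2022')); // false
        System.out.println(terms.contains("check"));          // false
        System.out.println(terms.ceiling("check"));           // checking
        System.out.println(terms.last());                     // the token "\u2022check"
    }
}
```

Because the bullet is not whitespace, it stays glued to the word, so the
stored term is "•check" and a ceiling lookup for "check" lands on the next
larger term, "checking".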

Mike McCandless

http://blog.mikemccandless.com


On Thu, Oct 31, 2013 at 7:57 AM, VIGNESH S <vigneshklncit@gmail.com> wrote:
> Hi Mike,
>
> I cannot use other analyzers since they involve stop words.
>
> I just need to index every word.
>
> I have used WhitespaceAnalyzer in Lucene 3.6 and it indexes
> properly. I am facing this problem only in Lucene 4.3.
>
>
> Thanks and Regards
> Vignesh Srinivasan
>
>
> On Thu, Oct 31, 2013 at 4:12 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> Pick a better analyzer.
>>
>> Maybe StandardAnalyzer?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Thu, Oct 31, 2013 at 2:22 AM, VIGNESH S <vigneshklncit@gmail.com>
>> wrote:
>> > Hi Mike,
>> >
>> > I am using the whitespace analyzer with a lowercase filter. The test code
>> > is the same as the one I sent above.
>> >
>> > The content I am indexing is:
>> >
>> >         String contents = "•Check for vulnerable ports  •Check for old and"
>> >             + " vulnerable versions of services on open ports  •Transfer a code which";
>> >
>> > In that text, "Check" is not indexed properly because it is preceded by
>> > the symbol "•". How can I index it properly?
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Oct 31, 2013 at 9:58 AM, VIGNESH S <vigneshklncit@gmail.com>
>> > wrote:
>> >
>> >> Hi Mike,
>> >> I found the problem. The term is not indexed properly.
>> >>
>> >>
>> >> On Thu, Oct 31, 2013 at 7:19 AM, VIGNESH S <vigneshklncit@gmail.com>
>> >> wrote:
>> >>
>> >>> Hi Mike,
>> >>>
>> >>> Please find the attached test case G1.java.
>> >>>
>> >>>
>> >>> On Wed, Oct 30, 2013 at 8:41 PM, Michael McCandless <
>> >>> lucene@mikemccandless.com> wrote:
>> >>>
>> >>>> I don't see any java sources here?
>> >>>>
>> >>>> Make sure "check" is in fact being indexed; can you boil it down
>> >>>> to a small test case?
>> >>>>
>> >>>> Mike McCandless
>> >>>>
>> >>>> http://blog.mikemccandless.com
>> >>>>
>> >>>>
>> >>>> On Wed, Oct 30, 2013 at 10:59 AM, VIGNESH S <vigneshklncit@gmail.com>
>> >>>> wrote:
>> >>>> > Hi,
>> >>>> >
>> >>>> > I have indexed the text file "filename.txt" below using the test
>> >>>> > code G1.java.
>> >>>> >
>> >>>> > When I search for "check for old", the trm.seekCeil() method returns
>> >>>> > "checking" and "checks" but skips "check", which is present in the
>> >>>> > text document.
>> >>>> >
>> >>>> > It works in most cases, except for a few.
>> >>>> >
>> >>>> > Please help.
>> >>>> >
>> >>>> > --
>> >>>> > Thanks and Regards
>> >>>> > Vignesh Srinivasan
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> > ---------------------------------------------------------------------
>> >>>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> >>>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> Thanks and Regards
>> >>> Vignesh Srinivasan
>> >>> 9739135640
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Thanks and Regards
>> >> Vignesh Srinivasan
>> >> 9739135640
>> >>
>> >
>> >
>> >
>> > --
>> > Thanks and Regards
>> > Vignesh Srinivasan
>> > 9739135640
>>
>
>
> --
> Thanks and Regards
> Vignesh Srinivasan
> 9739135640

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org