lucene-java-user mailing list archives

From Peru Redmi <perumyph...@gmail.com>
Subject Re: Understanding Query Parser Behavior
Date Tue, 29 Nov 2016 16:38:43 GMT
Hello,

It would be great if someone could help with this.
*Note: I am using Lucene 4.10.4.*

On Mon, Nov 28, 2016 at 5:37 PM, Peru Redmi <perumyphone@gmail.com> wrote:

> Any help on this would be greatly appreciated.
>
> Thanks.
>
> On Thu, Nov 24, 2016 at 8:14 PM, Peru Redmi <perumyphone@gmail.com> wrote:
>
>>
>> Hello Mike,
>>
>> Here is how I analyze my text, first using QueryParser (with
>> ClassicAnalyzer) and then using plain ClassicAnalyzer. On checking the
>> same query in Luke, I see "//" reported as a RegexQuery.
>>
>> Here is my code snippet:
>>
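>> import java.io.StringReader;
>> import java.util.ArrayList;
>> import java.util.List;
>>
>> import org.apache.lucene.analysis.Analyzer;
>> import org.apache.lucene.analysis.TokenStream;
>> import org.apache.lucene.analysis.standard.ClassicAnalyzer;
>> import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
>> import org.apache.lucene.queryparser.classic.QueryParser;
>> import org.apache.lucene.search.Query;
>> import org.apache.lucene.util.Version;
>>
>>         String value = "http\\://www.google.com";
>>         // The Reader argument supplies the stop word list (empty here).
>>         Analyzer anal = new ClassicAnalyzer(Version.LUCENE_30, new StringReader(""));
>>         QueryParser parser = new QueryParser(Version.LUCENE_30, "name", anal);
>>         Query query = parser.parse(value);
>>         System.out.println(" output terms from query parser ::" + query);
>>
>>         // Run the same string through the analyzer directly.
>>         List<String> list = new ArrayList<String>();
>>         TokenStream stream = anal.tokenStream("name", new StringReader(value));
>>         stream.reset();
>>         while (stream.incrementToken())
>>         {
>>             list.add(stream.getAttribute(CharTermAttribute.class).toString());
>>         }
>>         stream.end();
>>         stream.close();
>>         System.out.println(" output terms from analyzer " + list);
>>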
>>
>> output:
>>
>> output terms from query parser ::name:http name:// name:www.google.com
>> output terms from analyzer [http, www.google.com]
>>
>>
>>
>>
>>
>>
>> On Thu, Nov 24, 2016 at 5:10 PM, Michael McCandless <
>> lucene@mikemccandless.com> wrote:
>>
>>> Hi,
>>>
>>> You should double check which analyzer you are using during indexing.
>>>
>>> The same analyzer on the same string should produce the same tokens.
>>>
>>> Mike McCandless
>>>
>>> http://blog.mikemccandless.com
>>>
>>>
>>> On Wed, Nov 23, 2016 at 9:38 PM, Peru Redmi <perumyphone@gmail.com>
>>> wrote:
>>> > Could someone elaborate on this?
>>> >
>>> > On Tue, Nov 22, 2016 at 11:41 AM, Peru Redmi <perumyphone@gmail.com>
>>> wrote:
>>> >
>>> >> Hello,
>>> >> Could you explain your "No"?
>>> >>
>>> >> On Mon, Nov 21, 2016 at 11:16 PM, wmartinusa@gmail.com <
>>> >> wmartinusa@gmail.com> wrote:
>>> >>
>>> >>> No
>>> >>>
>>> >>> ------ Original message------
>>> >>> *From: *Peru Redmi
>>> >>> *Date: *Mon, Nov 21, 2016 10:44 AM
>>> >>> *To: *java-user@lucene.apache.org;
>>> >>> *Cc: *
>>> >>> *Subject:*Understanding Query Parser Behavior
>>> >>>
>>> >>> Hello All,
>>> >>>
>>> >>> Could someone explain *QueryParser* behavior in these cases?
>>> >>>
>>> >>> 1. While indexing:
>>> >>>
>>> >>>     Document doc = new Document();
>>> >>>     doc.add(new Field("Field", "http://www.google.com", Field.Store.YES, Field.Index.ANALYZED));
>>> >>>
>>> >>> The index has *two* terms: *http* and *www.google.com*.
>>> >>>
>>> >>> 2. While searching:
>>> >>>
>>> >>>     Analyzer anal = new ClassicAnalyzer(Version.LUCENE_30, new StringReader(""));
>>> >>>     QueryParser parser = new MultiFieldQueryParser(Version.LUCENE_30, new String[]{"Field"}, anal);
>>> >>>     Query query = parser.parse("http://www.google.com");
>>> >>>
>>> >>> Now the query has *three* terms: (Field:http) *(Field://)* (Field:www.google.com)
>>> >>>
>>> >>> i) Why do I get 3 terms while parsing but 2 terms on indexing (using
>>> >>> the same ClassicAnalyzer in both cases)?
>>> >>> ii) Is this the expected behavior of ClassicAnalyzer(Version.LUCENE_30)
>>> >>> in the parser?
>>> >>> iii) What should be done to avoid the query part *(Field://)*?
>>> >>>
>>> >>> Thanks,
>>> >>> Peru.
>>> >>>
>>> >>>
>>> >>
>>>
>>>
>>>
>>
>
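
On the stray "//" clause asked about in the thread: since Lucene 4.0 the classic query parser treats text between forward slashes as a regular expression, so the "//" in the URL most likely gets consumed by the parser as an empty regex before the analyzer ever sees it. That would be consistent with Luke reporting it as a RegexQuery, and with the analyzer alone producing only two tokens. A common remedy is to backslash-escape query syntax characters before parsing; Lucene ships a static QueryParser.escape(String) for that purpose. The sketch below is a plain-Java stand-in for the idea, not Lucene's own code: the class name QuerySyntaxEscaper and its character list are assumptions made for illustration, so that it runs without Lucene on the classpath.

```java
// Hypothetical helper (not part of Lucene): backslash-escapes characters that
// the classic query parser treats as syntax, including '/' (regex delimiter).
final class QuerySyntaxEscaper {

    // Assumed set of special characters; check the query parser syntax
    // documentation for your Lucene version before relying on it.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\'); // prefix each syntax character with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: http\:\/\/www.google.com
        System.out.println(escape("http://www.google.com"));
    }
}
```

With the real API the equivalent call would be parser.parse(QueryParser.escape(userInput)). Escaping everything also switches off intended syntax such as field prefixes and wildcards, so this suits raw values like URLs rather than hand-written queries.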
