lucene-java-user mailing list archives

From "Igal @ getRailo.org" <i...@getrailo.org>
Subject Re: Analyzer in QueryParser behaves differently from IndexWriter
Date Sun, 13 Jan 2013 21:03:21 GMT
Thanks, Erik.

I tried putting the query in "double quotes". It made some difference, 
but the result is still not exactly what I'm looking for.

So what's my best solution? Should I avoid the QueryParser and 
"parse" the query myself, or is there a different (better) query parser 
for this situation?


Igal


On 1/13/2013 5:42 AM, Erik Hatcher wrote:
> The analyzer through QueryParser is invoked for each "clause", so in your example it's invoked 4 times and each invocation sees only one word/term.
>
>      Erik
>
> On Jan 13, 2013, at 2:13, "Igal @ getRailo.org" <igal@getrailo.org> wrote:
>
>> hi,
>>
>> I've created an Analyzer that performs a few filtering tasks, including creating shingles and replacing terms.
>>
>> I use that Analyzer with IndexWriter and it works as expected. But when I use the same Analyzer with QueryParser (org.apache.lucene.queryparser.classic.QueryParser), it behaves differently: specifically, it does not create shingles. See below the output for a simple phrase of 4 terms: "word1 word2 word3 word4"
>>
>> from IndexWriter (shingle terms created as expected -- total of 7 terms):
>>
>> term: word1    term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
>> term: word2    term=word2,bytes=[77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=1,position=2,type=word,keyword=false
>> term: word1 word2    term=word1 word2,bytes=[77 6f 72 64 31 20 77 6f 72 64 32],startOffset=6,endOffset=11,punct=0,positionIncrement=0,position=2,type=SHINGLE,keyword=false
>> term: word3    term=word3,bytes=[77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=1,position=3,type=word,keyword=false
>> term: word2 word3    term=word2 word3,bytes=[77 6f 72 64 32 20 77 6f 72 64 33],startOffset=12,endOffset=17,punct=0,positionIncrement=0,position=3,type=SHINGLE,keyword=false
>> term: word4    term=word4,bytes=[77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=1,position=4,type=word,keyword=false
>> term: word3 word4    term=word3 word4,bytes=[77 6f 72 64 33 20 77 6f 72 64 34],startOffset=18,endOffset=23,punct=0,positionIncrement=0,position=4,type=SHINGLE,keyword=false
>>
>>
>> from QueryParser (shingle terms not created -- only 4 terms):
>>
>> term: word1    term=word1,bytes=[77 6f 72 64 31],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
>> term: word2    term=word2,bytes=[77 6f 72 64 32],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
>> term: word3    term=word3,bytes=[77 6f 72 64 33],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
>> term: word4    term=word4,bytes=[77 6f 72 64 34],startOffset=0,endOffset=5,punct=0,positionIncrement=1,position=1,type=word,keyword=false
>>
>>
>> can anyone tell me what I'm doing wrong?
>>
>> thank you,
>>
>>
>> Igal
>>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>

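To make Erik's point concrete, here is a minimal sketch in plain Java. It is not Lucene code: `analyze` is a hypothetical stand-in for the custom analyzer (words plus two-word shingles), and `analyzePerClause` only imitates the behaviour described above, where the query is split into whitespace-separated clauses and each clause is analyzed separately.

```java
import java.util.ArrayList;
import java.util.List;

public class ShingleDemo {

    // Hypothetical stand-in for the custom analyzer: emits each word,
    // followed by the two-word shingle that ends at that word.
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        String previous = null;
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // empty input yields no terms
            }
            terms.add(word);
            if (previous != null) {
                terms.add(previous + " " + word);
            }
            previous = word;
        }
        return terms;
    }

    // Imitates per-clause analysis: every whitespace-separated clause is
    // analyzed separately, so the shingle step never sees two adjacent words.
    static List<String> analyzePerClause(String query) {
        List<String> terms = new ArrayList<>();
        for (String clause : query.trim().split("\\s+")) {
            terms.addAll(analyze(clause));
        }
        return terms;
    }

    public static void main(String[] args) {
        String text = "word1 word2 word3 word4";
        // Whole text in one pass, as at index time: 7 terms.
        System.out.println(analyze(text));
        // One clause at a time, as described for QueryParser: 4 terms.
        System.out.println(analyzePerClause(text));
    }
}
```

The first call reproduces the 7 terms from the IndexWriter output, in the same order; the second yields only the 4 single words. Under these assumptions, handing the whole query string to the analyzer in one call (the "parse it myself" route) is what lets the shingle step see adjacent words.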
