lucene-java-user mailing list archives

From VIGNESH S <vigneshkln...@gmail.com>
Subject Problem with MultiPhrase Query in Lucene 4.3
Date Thu, 03 Oct 2013 14:07:43 GMT
Hi,

I am trying to run a MultiPhraseQuery in Lucene 4.3. It works perfectly
for all scenarios except the one below.
When I search for a phrase that is preceded by punctuation in the text, it
does not work.

TextContent: Dremel is a scalable, interactive ad-hoc query system for
analysis of read-only nested data. By combining multi-level execution
trees and columnar data layout, it is capable of running aggregation

Search phrase: interactive adhoc

The above search fails because "interactive adhoc" is preceded by ","
in the original text.
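
For illustration, here is a minimal plain-Java sketch (not Lucene code) of what whitespace-only tokenization does to the start of the sample text. The class name WhitespaceTokensSketch and the helper whitespaceTokens are hypothetical, and String.split is only an approximation of a whitespace analyzer under that assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WhitespaceTokensSketch {

    // Hypothetical helper: approximates whitespace-only tokenization.
    // Punctuation and hyphens stay attached to the neighbouring characters.
    public static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        String text = "Dremel is a scalable, interactive ad-hoc query system";
        // Print one token per line so attached punctuation is easy to spot.
        for (String token : whitespaceTokens(text)) {
            System.out.println(token);
        }
    }
}
```

Under this approximation the token list contains "scalable," (with the comma attached) and "ad-hoc" (with the hyphen), but no token "scalable" or "adhoc".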


This is how I do the indexing (sample code below). I use the
whitespace analyzer.

Document doc = new Document();

// The text is split across concatenated literals so that it compiles.
contents = "Dremel is a scalable, interactive ad-hoc query system for "
    + "analysis of read-only nested data. By combining multi-level execution "
    + "trees and columnar data layout, it is capable of running aggregation";

// Copy of the stored text field type, so it can be customized.
FieldType offsetsType = new FieldType(TextField.TYPE_STORED);

Field field = new Field("content", "", offsetsType);

doc.add(field);
field.setStringValue(contents);

mWriter.addDocument(doc);

In the search, I build a MultiPhraseQuery object and add the tokens of
the search phrase.

Before adding the tokens, I validate them like this:

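LinkedList<Term> termsWithPrefix = new LinkedList<Term>();
// Position the enumerator on the first indexed term >= word.
// If there is no such term (END), skip the loop instead of reading term().
if (trm.seekCeil(new BytesRef(word)) != TermsEnum.SeekStatus.END) {
    do {
        String s = trm.term().utf8ToString();
        if (s.startsWith(word)) {
            termsWithPrefix.add(new Term("content", s));
        } else {
            // Terms are sorted, so the first non-match ends the prefix range.
            break;
        }
    } while (trm.next() != null);
}
mpquery.add(termsWithPrefix.toArray(new Term[0]));
}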

It works for all scenarios except those where the search phrase is
preceded by punctuation.

For text preceded by punctuation, trm.seekCeil(new BytesRef(word)) points
to a different word, which is what actually causes the problem.
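
To make the "different word" behaviour concrete, here is a minimal plain-Java sketch that imitates the seekCeil-then-next loop with a sorted TreeSet in place of the Lucene term dictionary. The class SeekCeilSketch, the helper termsWithPrefix, and the sample term list are hypothetical. It assumes ASCII terms, where Java string order and byte order agree, and it assumes the terms are the whitespace-separated tokens of the sample text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class SeekCeilSketch {

    // Hypothetical stand-in for the seekCeil/next loop: start at the first
    // term >= prefix and collect terms until one no longer has the prefix.
    public static List<String> termsWithPrefix(TreeSet<String> dict, String prefix) {
        List<String> out = new ArrayList<>();
        for (String term : dict.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed terms: whitespace-separated tokens from the sample text.
        TreeSet<String> dict = new TreeSet<>(Arrays.asList(
                "Dremel", "is", "a", "scalable,", "interactive",
                "ad-hoc", "query", "system", "for", "analysis"));

        // "ad-hoc" sorts before "adhoc" because '-' is smaller than 'h',
        // so the ceiling lookup lands on the next term instead.
        System.out.println("ceiling(adhoc) = " + dict.ceiling("adhoc"));
        System.out.println("prefix adhoc -> " + termsWithPrefix(dict, "adhoc"));
        System.out.println("prefix scalable -> " + termsWithPrefix(dict, "scalable"));
    }
}
```

With this sample term list, the ceiling of "adhoc" is "analysis" and the prefix lookup for "adhoc" returns an empty list, while the prefix "scalable" still finds "scalable,". An empty term list for one position is one way the resulting phrase query could fail to match.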

Please help.


-- 
Thanks and Regards
Vignesh Srinivasan
