lucene-general mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: LongField when searched using classic QueryParser doesnot yield results
Date Wed, 11 Jan 2017 12:47:36 GMT
Hi,

yes, this is indeed related to the numeric (trie) encoding described in the text quoted below.

The problem is that Lucene has no "schema". If you index values using several different field
types (like TextField vs. IntField/FloatField/DoubleField...), the information about how they
were indexed is completely unknown to the query parser. The default query parser uses legacy
code to create numeric ranges or numeric terms: it just treats them as text! If it searches
a numeric field using text terms, it won't find anything.

Solr and Elasticsearch maintain a schema of the index. They subclass the query parser,
override the protected methods getRangeQuery and getFieldQuery, and use their schema to create
the correct query type for each field. The default is to create TermQuery and TermRangeQuery,
which won't work on numeric fields.

To fix this in your code you have to do something similar. YOU are the person who knows what
the type of field XY is. If XY is a numeric field, your query parser subclass must check the
field name and then build the correct query (NumericRangeQuery).
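
As a rough sketch of what such a subclass could look like: the code below assumes the Lucene 5.x
API (two-argument QueryParser constructor, LongField indexed with the default precisionStep). The
class name SchemaAwareQueryParser and the longFields set are made up for illustration; the set
stands in for whatever "schema" knowledge your application has.

```java
import java.util.Set;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

// Hypothetical example class: a query parser that knows which fields were indexed as LongField.
class SchemaAwareQueryParser extends QueryParser {

    // Assumption: the application supplies the names of all fields indexed as LongField.
    private final Set<String> longFields;

    SchemaAwareQueryParser(String defaultField, Analyzer analyzer, Set<String> longFields) {
        super(defaultField, analyzer);
        this.longFields = longFields;
    }

    // Called for single terms such as price:42.
    @Override
    protected Query getFieldQuery(String field, String queryText, boolean quoted)
            throws ParseException {
        if (longFields.contains(field)) {
            long value = parseLong(field, queryText);
            // An exact match on a numeric field is a range with identical, inclusive bounds.
            return NumericRangeQuery.newLongRange(field, value, value, true, true);
        }
        // Everything else keeps the default text behaviour.
        return super.getFieldQuery(field, queryText, quoted);
    }

    // Called for ranges such as price:[10 TO 20]; an open end ("*") arrives as null.
    @Override
    protected Query getRangeQuery(String field, String part1, String part2,
                                  boolean startInclusive, boolean endInclusive)
            throws ParseException {
        if (longFields.contains(field)) {
            Long min = (part1 == null) ? null : parseLong(field, part1);
            Long max = (part2 == null) ? null : parseLong(field, part2);
            return NumericRangeQuery.newLongRange(field, min, max, startInclusive, endInclusive);
        }
        return super.getRangeQuery(field, part1, part2, startInclusive, endInclusive);
    }

    // Turns a malformed number into a ParseException instead of a runtime exception.
    private static long parseLong(String field, String text) throws ParseException {
        try {
            return Long.parseLong(text.trim());
        } catch (NumberFormatException e) {
            throw new ParseException("Field '" + field + "' expects a long value, got: " + text);
        }
    }
}
```

With something like new SchemaAwareQueryParser("body", analyzer, Collections.singleton("price")),
a query such as price:[10 TO 20] would then be built as a NumericRangeQuery, while queries on all
other fields go through the default text handling. If you indexed with a non-default
precisionStep, you would need the newLongRange variant that takes the precisionStep as well.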

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Amrit Sarkar [mailto:sarkaramrit2@gmail.com]
> Sent: Wednesday, January 11, 2017 9:52 AM
> To: general@lucene.apache.org
> Cc: java-user@lucene.apache.org
> Subject: Re: LongField when searched using classic QueryParser doesnot yield
> results
> 
> Hi Jaspreet,
> 
> Not sure whether this helps to answer your question as I didn't try to run
> the code:
> 
> From official guide:
> 
> > Within Lucene, each numeric value is indexed as a *trie* structure, where
> > each term is logically assigned to larger and larger pre-defined brackets
> > (which are simply lower-precision representations of the value). The step
> > size between each successive bracket is called the precisionStep,
> > measured in bits. Smaller precisionStep values result in larger number of
> > brackets, which consumes more disk space in the index but may result in
> > faster range search performance. The default value, 4, was selected for a
> > reasonable tradeoff of disk space consumption versus performance
> 
> 
> > If you only need to sort by numeric value, and never run range
> > querying/filtering, you can index using a precisionStep of
> > Integer.MAX_VALUE
> > <http://download.oracle.com/javase/6/docs/api/java/lang/Integer.html?is-external=true#MAX_VALUE>.
> > This will minimize disk space consumed.

