lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Combining keyword queries with database-style queries
Date Thu, 23 Oct 2008 12:47:20 GMT
Well, assuming that token_count is an indexed field
in your documents (i.e. not something you're
computing on the fly), just use a RangeQuery for the numeric
part. Actually, you probably want to use
ConstantScoreRangeQuery...

The only thing you have to watch is that Lucene does a
lexical compare, so you have to index your numbers
as comparable strings, probably left-padding to some
fixed width with zeros, see NumberTools.

Best
Erick

On Thu, Oct 23, 2008 at 8:27 AM, Niels Ott <nott@sfs.uni-tuebingen.de>wrote:

> Hi everybody,
>
> I need to query for documents not only for search terms but also for
> numeric values (or other general types). Let me try to explain with a
> hypothetical example.
>
> Assuming there is a value for the number words in each document (or the
> number of person names, or whatever), I would want to formulate a query
> like "Give me documents containing 'jack johnson' AND with token_count >
> 250".
>
> I've been working with Lucene before and the keyword part is easy, but
> what would be a good solution to query for numbers etc.?
>
> One first idea I had was storing the numbers (which are basically a
> HashMap<String,Double>) in the index in some way or the other. But it is
> not at all obvious for me how to query them then.
>
> Another thing I could think of would be using a separate database of any
> type, but then how to bring those two together in a way that makes sense?
>
> Any pointers to useful resources and any types of hints are welcome! :-)
>
> Best,
>
>  Niels
>
>
> --
> Niels Ott
> Computational Linguist (B.A.)
> http://www.drni.de/niels/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message